Skip to content
GitHub
Docs / Concepts / Security model

Security model

VERSIONv0.4.1 · SOURCE docs/security-model.md

Theodosia’s trust boundary sits between an agent’s tool calls and the action bodies that run them. The table below names what is trusted and what is not.

ComponentTrustWhy
The agent (the model)UntrustedIt is the thing being constrained. Its tool calls are inputs to validate, not instructions to obey.
The graph and action bodiesTrustedYou wrote them. They run with your process’s privileges.
Upstream MCP servers (upstream)Operator-configuredYou choose which servers an action may reach; they are not agent-selected.
A .app file, if usedTrusted localIt execs Python, so it is a local-authoring convenience, not a fetch-and-run format.
  • Reachability. The agent can only run an action the graph allows from the current state. An out-of-order or unknown action comes back as a structured refusal with the reachable actions; it never executes.
  • State integrity. State lives on the server, keyed per session. The agent cannot assert a state it is not in; every state change is the recorded result of a server-executed action.
  • Boundary validation. The step tool advertises each action’s declared input schema (types and Literal choices) to the caller, and where an action annotates a Pydantic model the dict is coerced into that model before the body runs. The server does not otherwise enforce scalar types or Literal values: a malformed value reaches the action body and surfaces as a structured action_error. Wire input_validators to reject bad inputs (as a validation_failed refusal) before the body runs.
  • An honest, verifiable record. Every attempt, including refusals, is recorded to the per-session history and Burr’s tracker, so the trail reflects what the agent tried, not its own account. Each attempt is also hash-chained into a ledger.jsonl next to the tracker log; theodosia verify <session> recomputes the chain and names the exact line where any entry was edited, reordered, or deleted after the fact. Entries carry the session’s app_id, project, and partition_key in the hashed payload, so copying the file to a different session directory breaks verification.

What the ledger proves and what it does not

Section titled “What the ledger proves and what it does not”

The hash chain catches in-place edits, reorderings, duplications, and middle-deletions of recorded entries. It does not catch:

  • Truncation. Dropping the tail-most entry leaves a chain that still self-verifies. Detect by external commitment of expected length, or by streaming each entry to append-only storage as it is written.
  • Whole-cloth forgery in the default (unkeyed) mode. The hash function is public, so anyone with write access to ledger.jsonl can mint a chain from scratch. Set THEODOSIA_LEDGER_KEY (hex-encoded bytes) in the server environment to switch the chain to HMAC-SHA256; forgery then requires the key.
  • Origin. A signature over the head would add non-repudiation. The chain alone does not.
  • Existence of any particular session. Deleting a whole session directory is invisible to verify; detect with an external session manifest.

For regulated audit trails (SOX, HIPAA, PCI, GxP), layer at least one of HMAC mode, periodic external checkpointing (RFC 3161 timestamp authority or transparency log), and an append-only object store with retention locks on top. ledger.jsonl alone detects accidental corruption and the naive editor; it does not survive a motivated insider with filesystem write access.

Per-session isolation requires factory mode: pass a () -> Application factory to mount(). Each MCP session then gets its own Application instance, keyed by FastMCP’s ctx.session_id. The store lives in a closure scope inside mount(), not in a module global, so two mount() calls in the same process do not bleed sessions into each other.

Mounting a built Application instance instead is shared-app mode: every session mutates one FSM, so there is no per-session isolation. On a multi-client transport (HTTP/SSE, where ctx.session_id can be null) connected clients then share one FSM and its state outright. mount() logs a warning in this mode, and fork_at / fork_from_past / reset_session refuse. Use a factory for any server more than one client can reach; reserve the instance form for a single stdio client or a deliberately shared FSM.

fork_from_past(app_id, partition_key=...) reaches into a different session’s persisted state through the persister. To prevent a tenant-A session from loading tenant-B’s state by guessing an app_id, the caller-supplied partition_key is compared against the session’s bound _partition_key (set by ApplicationBuilder.with_identifiers(...)). A mismatch returns a partition_mismatch refusal before the persister is reached, so a malicious tenant cannot even confirm whether the requested app_id exists in another partition. See tests/test_fork_from_past_partition_binding.py for the regression that pins this behaviour.

Releases publish a CycloneDX SBOM as a workflow artifact (.github/workflows/release.yml). Direct runtime deps are Apache 2.0 or MIT (no copyleft); pin uv.lock and mirror to your own index for an auditable build.

A typed action’s reads=[...] declaration is enforced at the action-body level via Burr’s per-action Pydantic input projection: the body cannot access state fields it did not declare. This makes patterns like Chain-of-Verification’s “independent verification” property structural between action bodies.

reads= is not a confidentiality boundary on the MCP wire:

  • The step response includes the full post-step state.
  • theodosia://state returns the full state unconditionally.

If your threat model is “the agent must not see field X”, filter the state in a separate layer (a state_loader callback, an MCP middleware, or by partitioning the FSM so X never lands in the same Application state). reads= is the contract between the FSM graph and the action body’s Python code; it is not the contract between the server and the LLM client.

  • It does not sandbox action bodies. Your action code runs with full process privileges; treat it as you would any code you ship. Theodosia controls when an action runs, not what your code is allowed to do.
  • It does not constrain reasoning inside a valid step. It removes structural failures (skipping, early termination, unverified success); it does not make a wrong-but-legal step right.
  • It does not authenticate the agent or the MCP transport. Run it behind your own transport security (stdio in-process, or an authenticated HTTP front).
  • It does not cap state size or depth. A 100MB state value or a 500-level-nested dict will be materialized in full in the step response and the tracker log. Add an action-body-level size check if the FSM may receive large external data.
  • It does not garbage-collect fork descendants. Every fork_at / fork_from_past creates a fresh app_id directory in the tracker storage root. max_sessions caps the in-memory session map; it does not cap on-disk fork directories. Reap them out of band.

Report vulnerabilities privately per SECURITY.md; do not open public issues for them.

VERSIONv0.4.1 · SOURCE docs/security-model.md