Understanding AgentVault Guarantees

The core developer question

Before integrating any coordination protocol, a developer needs to know precisely what it guarantees and what it does not. AgentVault is an agent-native bounded-disclosure coordination protocol. It constrains what can be emitted between participants by bounding the output channel with a contract, schema validation, and receipt-bound governance artefacts. It does not evaluate whether a signal is correct, truthful, or aligned with anyone's intent. In the software lane, deployments operate at SELF_ASSERTED, which proves signed provenance of declared rules, not execution integrity. The TEE lane raises this to TEE_ATTESTED. This page defines those boundaries, and distinguishes what is guaranteed by protocol structure from what still depends on the execution lane and trust model.

Layered responsibility model

AgentVault distributes responsibility across three layers. Each layer has a distinct role, and no layer substitutes for another.

The vault (relay + guardian) enforces the agreed output boundary and the declared session governance. It validates that the output conforms to the contract's JSON Schema, applies the guardian enforcement policy, signs a receipt binding all governance artefacts, and delivers only the bounded output to participants. The vault is a coordination primitive — it constrains the channel mechanically, without evaluating what the signal means.

Agents submit context on behalf of their participants and interpret the bounded signal they receive. Agents choose what context to submit (input minimisation is a participant-side decision) and how to act on the output. The vault does not govern what agents do with the signal after delivery.

Participants (humans) define the consent boundary by agreeing to the coordination contract before any context is exchanged. The contract specifies the purpose, output schema, referenced enforcement policy, and execution constraints such as model profile or model-selection requirements. Participants are responsible for schema design — which is a security decision, because the schema determines the channel capacity.
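To make the contract's contents concrete, the sketch below models a contract as a small record and derives a hash from it. This is an illustration only, not the AgentVault wire format: the field names, the example enum values other than STRONG_MATCH, the canonical-JSON encoding, and the use of SHA-256 are all assumptions.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CoordinationContract:
    """Hypothetical contract shape; field names are illustrative only."""
    purpose: str
    output_schema: dict          # JSON Schema for the bounded output
    enforcement_policy_ref: str  # reference to the guardian policy
    model_profile: str           # execution constraint


def contract_hash(contract: CoordinationContract) -> str:
    # Assumption: the hash covers canonical JSON (sorted keys, no whitespace).
    canonical = json.dumps(asdict(contract), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


contract = CoordinationContract(
    purpose="compatibility-check",
    output_schema={
        "type": "object",
        "properties": {
            "match": {"enum": ["STRONG_MATCH", "WEAK_MATCH", "NO_MATCH"]},
        },
        "required": ["match"],
        "additionalProperties": False,
    },
    enforcement_policy_ref="policy-v1",
    model_profile="profile-a",
)
```

Because the schema sits inside the hashed record, widening the schema (for example, swapping the enum for a free-text string) yields a different contract hash, so both participants can tell which output boundary they consented to.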

Architecture

The following diagram shows the trust boundaries and information flow in an AgentVault session. Note that the verifier sits adjacent to the execution path — it consumes the receipt, contract, policy, and public keys independently. Verification does not require the runtime, the operator, or the model.

                          Relay Operator Trust Boundary
                    ┌──────────────────────────────────────┐
                    │                                      │
  Participant A ──► │  ┌──────────────────────────────┐    │
    (agent)         │  │     AgentVault Relay         │    │──► LLM Provider
                    │  │                              │    │     (inference)
  Participant B ──► │  │  guardian ── schema ── sign  │ ◄──│
    (agent)         │  └──────────────────────────────┘    │
                    │              │                       │
                    └──────────────┼───────────────────────┘
                                   │
                        bounded output + receipt
                            │            │
                            ▼            ▼
                      Participant A   Participant B


   ┌───────────────────────────────────────────────────┐
   │              Independent Verifier                 │
   │                                                   │
   │  Inputs: receipt + contract + policy + public key │
   │  Does NOT require: runtime, operator, or model    │
   │  Output: signature valid/invalid + assurance level│
   └───────────────────────────────────────────────────┘

Verifier independence is the central result of the protocol's design. A third party can verify the declared governance artefacts and receipt commitments for a session without needing access to the relay, the operator, or the model that produced the output.

What AgentVault guarantees

Participant boundary enforcement. Neither participant receives the other's raw input from the relay. The relay routes inputs and outputs separately. The output that leaves the relay conforms to the JSON Schema specified in the coordination contract. Output that fails schema validation is rejected — it never reaches either participant.
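The reject-on-failure behaviour described above can be sketched as a gate that either returns a conforming output or raises. This is a minimal hand-rolled check that handles only flat objects whose properties are all enums; a real relay would presumably use a full JSON Schema validator. The names SchemaViolation and enforce_enum_schema are hypothetical.

```python
import json


class SchemaViolation(Exception):
    """Raised when model output falls outside the agreed schema."""


def enforce_enum_schema(raw_output: str, schema: dict) -> dict:
    """Return the parsed output if it conforms, else raise SchemaViolation.

    Supports only flat objects whose properties are all enums; anything
    richer needs a real JSON Schema validator.
    """
    try:
        candidate = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise SchemaViolation("output is not valid JSON") from exc
    if not isinstance(candidate, dict):
        raise SchemaViolation("output is not a JSON object")

    properties = schema["properties"]
    missing = set(schema.get("required", [])) - set(candidate)
    if missing:
        raise SchemaViolation(f"missing required fields: {sorted(missing)}")
    extra = set(candidate) - set(properties)
    if extra:
        raise SchemaViolation(f"unexpected fields: {sorted(extra)}")
    for name, value in candidate.items():
        if value not in properties[name]["enum"]:
            raise SchemaViolation(f"{name}: value not in enum")
    return candidate
```

The gate never inspects whether the chosen enum value is the right one for the inputs. It only decides whether the value is allowed to leave the relay.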

Signal compression. The output channel is structurally narrowed. An all-enum output schema with a computed channel capacity of 25 bits can carry at most 25 bits through the bounded output channel, regardless of what the model attempts to produce. This is a structural property of the schema, not a property of the model's compliance. The relay enforces it mechanically — the guardian never evaluates whether a signal is correct, only whether the signal is allowed.
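One plausible way to compute such a capacity figure is to sum the base-2 logarithm of each enum's size, which bounds the number of distinct valid outputs. The source does not state the formula AgentVault actually uses, so treat the function below as an assumption; the schema and its field names are hypothetical.

```python
import math


def enum_channel_capacity_bits(schema: dict) -> float:
    """Upper bound on bits carried by one output of an all-enum schema.

    Assumes every property is required and is a plain enum, so the number
    of distinct valid outputs is the product of the enum sizes.
    """
    bits = 0.0
    for name, prop in schema["properties"].items():
        values = prop.get("enum")
        if not values:
            raise ValueError(f"property {name!r} is not an enum; no finite bound here")
        bits += math.log2(len(set(values)))
    return bits


# Hypothetical schema: 4 * 8 * 2 = 64 distinct outputs, i.e. 6 bits.
example_schema = {
    "properties": {
        "match": {"enum": ["STRONG", "PARTIAL", "WEAK", "NONE"]},
        "confidence_band": {"enum": list(range(8))},
        "follow_up": {"enum": [True, False]},
    }
}
print(enum_channel_capacity_bits(example_schema))  # 6.0
```

Under this reading, a single free-text field removes the bound entirely, which is why the function refuses non-enum properties instead of guessing a number.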

Cryptographic binding. Every completed session produces a signed receipt (Ed25519) that binds the session to specific governance artefacts and output commitments — including the contract hash, output schema hash, prompt template hash, guardian policy hash, model profile hash, relay build hash, bounded output hash, and per-participant input commitment hashes. A recipient with the relay's public key can verify that the receipt was produced by the relay and that none of the bound fields have been modified since signing.
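The binding step can be sketched as assembling a record of hashes and serialising it into the bytes that get signed. The field names follow the list above, but their exact spelling, the canonical-JSON encoding, and SHA-256 are assumptions. The model profile and relay build hashes are left out for brevity, and the Ed25519 signature itself is omitted because it needs a third-party crypto library.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def canonical_bytes(obj) -> bytes:
    # Assumption: canonical JSON with sorted keys and no insignificant whitespace.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_receipt_body(contract: dict, policy: dict, prompt_template: str,
                       bounded_output: dict, input_commitments: dict) -> dict:
    """Assemble the fields a relay would sign (illustrative field names)."""
    return {
        "contract_hash": sha256_hex(canonical_bytes(contract)),
        "output_schema_hash": sha256_hex(canonical_bytes(contract["output_schema"])),
        "prompt_template_hash": sha256_hex(prompt_template.encode("utf-8")),
        "guardian_policy_hash": sha256_hex(canonical_bytes(policy)),
        "bounded_output_hash": sha256_hex(canonical_bytes(bounded_output)),
        "input_commitments": input_commitments,  # per-participant hashes
        "assurance_level": "SELF_ASSERTED",
    }


def signing_payload(receipt_body: dict) -> bytes:
    # The relay would sign these bytes with its Ed25519 key.
    return canonical_bytes(receipt_body)
```

Because every artefact enters the payload only as a hash, changing any one of them after signing changes the payload bytes and invalidates the signature.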

Independent verification. Verification requires the receipt, the contract, the policy, and the relay's public key. It does not require the runtime, the operator, or the model. A verifier who holds these artefacts can independently recompute all commitment hashes and confirm they match the receipt. At SELF_ASSERTED, this proves provenance and consistency of declarations, not honest execution.

What AgentVault does not guarantee

Correctness. The protocol does not evaluate whether the bounded signal is correct. If the model produces "STRONG_MATCH" when the inputs suggest otherwise, the relay delivers the signal — it passed schema validation and the guardian policy. Correctness is a model quality question, not a protocol guarantee.

Intent alignment. The protocol does not guarantee that the signal reflects either participant's intent. The bounded signal is what the model produced within the schema constraints. Whether that signal serves the participants' goals is outside the protocol's scope.

Output truthfulness. A model can produce a schema-valid signal that misrepresents the relationship between the inputs. The protocol constrains what can be said (channel capacity), not whether what is said is true.

Relay honesty. At SELF_ASSERTED assurance level (the software lane), the receipt proves that the relay declared its rules. It does not prove that the relay followed them. A malicious relay operator could fabricate a conforming output, compute correct hashes, and sign a receipt that would verify. Detecting this requires either an output implausible enough to raise suspicion or a higher assurance tier such as TEE_ATTESTED.

Execution lanes: what each adds

AgentVault defines two execution lanes. The protocol structure is stable across lanes; what changes is who must be trusted for truthful execution. Every guarantee statement must specify which lane it applies to.

API-mediated (software lane). The relay assembles the prompt from both inputs, calls an external LLM provider, and validates the output. The relay operator sees both inputs in plaintext; this is a property of the architecture, not a bug. The LLM provider sees the assembled prompt. Assurance level: SELF_ASSERTED. Guarantees in this lane: consent boundary enforcement between participants, signal compression via schema validation, and cryptographic binding in the receipt. Trust requirements: the relay operator and LLM provider are trusted parties.

Confidential (TEE lane). The relay runs inside a confidential virtual machine (CVM) with hardware attestation (e.g., AMD SEV-SNP). The TEE lane adds: the relay operator cannot observe inputs or fabricate outputs without breaking the hardware attestation. The trust boundary shrinks to the hardware and the attested code running inside it. Assurance level: TEE_ATTESTED. The receipt includes TEE attestation data (attestation hash, transcript hash, receipt signing pubkey) that can be verified against the hardware vendor's root of trust.

What the TEE lane does not change. Both lanes enforce the same consent boundary, the same schema validation, and the same receipt structure. The difference is the trust envelope — who must be trusted. The API-mediated lane requires trusting the relay operator and provider. The TEE lane replaces operator trust with hardware attestation.
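Since every guarantee statement has to name its lane, one way to keep that discipline in code is a small lookup table that attaches the lane and its trust assumptions to each claim. The structure and names below (Lane, LANES, qualify) are hypothetical; the values are taken from the lane descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class AssuranceLevel(Enum):
    SELF_ASSERTED = "SELF_ASSERTED"
    TEE_ATTESTED = "TEE_ATTESTED"


@dataclass(frozen=True)
class Lane:
    name: str
    assurance: AssuranceLevel
    operator_sees_plaintext: bool
    trusted_for_execution: tuple


LANES = {
    "software": Lane("API-mediated", AssuranceLevel.SELF_ASSERTED, True,
                     ("relay operator", "LLM provider")),
    "tee": Lane("Confidential", AssuranceLevel.TEE_ATTESTED, False,
                ("hardware", "attested code")),
}


def qualify(guarantee: str, lane_key: str) -> str:
    """Attach the lane and its trust assumptions to a guarantee statement."""
    lane = LANES[lane_key]
    trusted = ", ".join(lane.trusted_for_execution)
    return f"{guarantee} [{lane.name} lane, {lane.assurance.value}; trusts: {trusted}]"


print(qualify("Output conforms to the contract schema", "software"))
```

The guarantee text is the same for both keys; only the bracketed trust envelope changes, which mirrors how the two lanes differ.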

Trust boundaries

Relay operator. In the API-mediated lane, the relay operator has full visibility: both inputs, the assembled prompt, the raw LLM output, the bounded result, and all runtime state. Deploying AgentVault does not constrain operator access. The operator is a trusted party. In the TEE lane, the operator is excluded from the trust envelope — the CVM prevents the operator from observing plaintext inputs.

LLM provider. The provider sees the assembled prompt (both inputs merged into the prompt template) and produces the raw output. Provider data retention and logging policies are outside the protocol's scope. A provider attestation tier (PROVIDER_ATTESTED) is defined in the receipt schema but not yet implemented.

Participants. Each participant sees only their own input and the bounded output. The consent boundary prevents cross-participant input disclosure. The bounded output is the only information that flows between participants — structurally constrained by the schema.

Current limitations

Assurance level. In the software lane, AgentVault operates at SELF_ASSERTED assurance level. Receipt verification proves provenance (which relay signed it), not integrity of the underlying execution. The TEE lane is operational, validated on AMD SEV-SNP confidential VMs (GCP N2D, AMD Milan), and raises assurance to TEE_ATTESTED, where the operator is excluded from plaintext and the receipt is bound to the measured execution environment.

Provider attestation. PROVIDER_ATTESTED is defined in the receipt schema but not yet implemented.

Input encryption. Inputs travel over TLS but are not encrypted end-to-end: in the software lane, the relay has plaintext access.

Entropy budget. Cross-session entropy budget accumulation is not yet implemented; the budget resets per session.

Fair exchange. Simultaneous delivery to both participants is not addressed by the protocol.