Sectum AI threat model
Sectum AI is a security product. The threat model is the document a careful CISO or security-architect buyer reads before adopting it. It states what Sectum AI protects against, what it doesn't, where the trust boundaries sit, and how the most sensitive artifacts (the ground-truth manifest, the evidence packs) are handled.
What Sectum AI protects against
Sectum AI verifies and attests that a multi-tenant AI system enforces tenant isolation across 13 surfaces. Specifically, the attack catalog covers:
- Direct cross-tenant boundary fetches (BOLA-style API access from one tenant for another tenant's resources)
- Organic-entity-bleed RAG leakage (benign cross-tenant queries surfacing foreign data through shared embeddings or shared indexes)
- RAG corpus poisoning that biases retrieval to surface foreign content
- Semantic-cache contamination (a cache serving one tenant's cached answer to another)
- KV-cache timing side channels (statistical distinguishability from a shared prefix cache)
- Embedding inversion across tenants
- Cross-tenant agent tool-call hijacking, including MCP confused-deputy and token-passthrough patterns
- Persistent agent-memory contamination
- LoRA / adapter cross-tenant influence (weight bleed, adapter mis-routing, memorized content surfacing)
- IKEA-style implicit benign extraction (multi-turn benign queries that reconstruct foreign content)
- GDPR Article 17 erasure non-completion across AI surfaces
What Sectum AI does NOT do
Out of scope, by design:
- Remediation. Sectum AI reports findings and points at the surface and the residual marker count; it does not edit the customer's stack to fix the leak. A SOC 2 auditor or a DPO needs an honest report, not an auto-fixer. Remediation belongs with the platform team.
- Runtime protection. Sectum AI is a verifier, not a guardrail. It does not sit in the request path of a production LLM, does not block prompt injections at request time, does not enforce policy on agent tool calls in real time. Runtime enforcement is a different product category.
- General LLM red-teaming. Sectum AI's scope is multi-tenant isolation. Jailbreaks, generic prompt injection, harm taxonomies, content-policy evals are out of scope; tools that specialize in those (DeepTeam, garak, PyRIT, promptfoo) do that work better.
- Compliance certification. The evidence pack maps findings to controls (SOC 2 CC6.x, ISO 27001 A.8.x, GDPR Art. 17 / 32, EU AI Act Art. 15, NIST AI RMF MEASURE 2.7). The mappings are assertions of test coverage, not legal certification. The buyer's auditor or counsel makes the certification call.
- Detection at scale. Sectum AI is a periodic verifier, not an always-on monitor. Continuous tiers run probes on a schedule (monthly default); they do not stream telemetry or open incidents in real time.
Trust boundaries
Sectum AI runs in two deployment modes; the trust boundary differs between them.
BYOC (bring-your-own-cloud)
The sectum-ai CLI runs inside the customer's environment. Adapters connect to the customer's AI surfaces via references in the customer's environment variables; raw secrets never cross the Sectum AI boundary. The substrate is provisioned in the customer's tenancy, canary markers are planted in the customer's stack, and probes run there too. Only the signed evidence pack (containing hashes, marker IDs, and control mappings) leaves the customer's environment.
BYOC is the right mode for a customer who needs a hard data-egress boundary, for example one in a regulated industry or one whose stack contains regulated data that cannot be sent to a third party.
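The reference-not-secret pattern described above can be sketched as follows. This is a minimal illustration, not the sectum-ai implementation: `SecretRef`, `AdapterConfig`, `auth_headers`, and the variable name `SECTUM_DEMO_API_KEY` are hypothetical names chosen for the example. The configuration object stores only the name of an environment variable, and the value is read at call time inside the customer's environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SecretRef:
    """Hypothetical pointer to a secret: holds the variable name, never the value."""
    env_var: str

    def resolve(self) -> str:
        # Read the value only when it is needed, inside the customer's environment.
        value = os.environ.get(self.env_var)
        if value is None:
            raise KeyError(f"environment variable {self.env_var!r} is not set")
        return value


@dataclass(frozen=True)
class AdapterConfig:
    """Hypothetical adapter configuration; safe to serialize because it holds no secret."""
    surface: str
    endpoint: str
    api_key: SecretRef


def auth_headers(config: AdapterConfig) -> dict[str, str]:
    # The secret exists only in this short-lived dict, not in the config object.
    return {"Authorization": f"Bearer {config.api_key.resolve()}"}


if __name__ == "__main__":
    os.environ.setdefault("SECTUM_DEMO_API_KEY", "example-value")
    cfg = AdapterConfig("rag", "https://example.invalid/query", SecretRef("SECTUM_DEMO_API_KEY"))
    print(cfg)  # shows the reference name, not the secret value
```

Because the configuration carries only the reference, printing or persisting it does not expose the secret.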
Hosted
Sectum Cloud runs the synthetic-tenant substrate against the customer's reachable endpoints. Adapter configurations resolve customer secrets from a secret-manager reference the customer controls; Sectum AI receives the reference, not the secret. Probe runs and evidence pack assembly happen on Sectum AI-managed infrastructure.
Hosted is the right mode for faster onboarding and lower operational burden, with the trade-off that the Sectum AI-managed runner has read paths into the customer's reachable AI endpoints during probe runs.
The ground-truth manifest is sensitive
The ground-truth manifest is the authoritative record of which canary marker belongs to which synthetic tenant. It is the basis for the zero-false-positive property of confirmed findings — every leak is provably tied back to a planted marker.
It is treated as sensitive on three fronts:
- At rest. The manifest supports encryption at rest via an operator-supplied key reference. Production deployments are expected to enable this; the spec leans toward always-encrypt for BYOC.
- In transit. Manifest contents (raw marker plaintext, planted locations) never appear in the published evidence pack; only the manifest hash is embedded in the evidence chain, so tampering with the manifest after a run breaks verification.
- In logs. Adapter calls log marker IDs and tenancy assertions. Raw marker plaintext never appears at INFO level or above; it can appear only at DEBUG level, which is off by default and requires an explicit operator action to enable.
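The manifest-hash property can be illustrated with a short sketch. The source does not specify the hash algorithm or serialization, so SHA-256 over key-sorted compact JSON is an assumption here, and `manifest_hash` and `verify_manifest` are hypothetical helper names. The point is that the evidence chain needs only the digest, and any later edit to the manifest changes that digest.

```python
import hashlib
import hmac
import json


def manifest_hash(manifest: dict) -> str:
    """Digest of a canonical serialization (assumed: sorted keys, compact JSON, SHA-256)."""
    canonical = json.dumps(
        manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def verify_manifest(manifest: dict, recorded_hash: str) -> bool:
    """True only if the manifest still matches the hash recorded at run time."""
    return hmac.compare_digest(manifest_hash(manifest), recorded_hash)


if __name__ == "__main__":
    manifest = {
        "markers": [
            {"id": "mk-acme-1", "tenant": "acme", "plaintext": "CANARY-ACME-7f3a"},
        ]
    }
    recorded = manifest_hash(manifest)  # only this value would go into the evidence pack
    print(verify_manifest(manifest, recorded))

    # Reassigning a marker to another tenant after the run changes the digest.
    manifest["markers"][0]["tenant"] = "globex"
    print(verify_manifest(manifest, recorded))
```

Canonicalizing before hashing keeps the digest stable across key ordering, so a mismatch indicates a change in content and not in formatting.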
Customer-data handling
Sectum AI is synthetic by default. The default scenario uses the synthetic tenants Acme / Globex / Initech / Hooli with generated corpora; nothing in the default workflow requires real customer data.
When Sectum AI is pointed at a customer's stack:
- Probes interact with the customer's adapters via the configured endpoints. The probes read content but do not exfiltrate it; they record finding IDs and control mappings, not raw tenant data.
- In BYOC, only the signed evidence pack leaves. The pack contains: the canonical run JSON (run ID, scenario hash, manifest hash, finding records with marker IDs, control mappings), the TSA token, the Rekor inclusion proof, and the PDF rendered from those. No raw customer content.
- Hosted mode logs the same evidence-pack contents on the Sectum AI side; raw probe responses are processed in-memory and not retained.
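How a probe response might be reduced to a finding record without retaining raw content can be sketched as below. This is an illustrative assumption, not the product's detection logic: `Marker`, `Finding`, and `scan_response` are hypothetical names, and exact substring matching stands in for whatever matching the real probes use. A finding is emitted only when a marker planted for a different synthetic tenant appears in the response, and the record carries the marker ID, not the plaintext.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    """Hypothetical manifest entry: which canary belongs to which synthetic tenant."""
    marker_id: str
    tenant: str
    plaintext: str


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record: IDs only, no raw response or marker plaintext."""
    marker_id: str
    owner_tenant: str
    querying_tenant: str
    surface: str


def scan_response(
    response_text: str, querying_tenant: str, surface: str, manifest: list[Marker]
) -> list[Finding]:
    # A tenant seeing its own marker is expected; only foreign markers are leaks.
    return [
        Finding(m.marker_id, m.tenant, querying_tenant, surface)
        for m in manifest
        if m.tenant != querying_tenant and m.plaintext in response_text
    ]


if __name__ == "__main__":
    manifest = [
        Marker("mk-acme-1", "acme", "CANARY-ACME-7f3a"),
        Marker("mk-globex-1", "globex", "CANARY-GLOBEX-91bc"),
    ]
    response = "Summary mentions CANARY-ACME-7f3a and CANARY-GLOBEX-91bc."
    # The response text is used in memory and then discarded; only findings persist.
    print(scan_response(response, "globex", "rag", manifest))
```

Matching against planted markers is what ties each confirmed finding to the manifest, and keeping only IDs in the record is what lets the evidence pack leave the environment without raw content.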
Supply chain
- Dependencies pinned via uv lockfile; Dependabot tracks vulnerable updates.
- Pre-commit and CI run secret-scanning (gitleaks) on every push.
- Release artifacts are signed via Sigstore; SBOM generated as part of the release workflow.
- CodeQL scans land on the same CI pipeline.
Disclosure
Report security issues to the address in the OSS repo's SECURITY.md. Expect acknowledgement within 24 hours, triage within 72 hours, and coordinated disclosure within 90 days.