Trinity

Status: implemented
Version: latest
Review: source-backed
Last scanned: 2026-06-25T00:00:00Z
Review required: false

Security And Governance

Approvals, redaction, ToolRouter, quarantine, credential boundaries, and audit proof.

Governance principles

Trinity defaults to safe, human-controlled operation. Agents may research, propose, draft, classify, prepare artifacts, and create action intents. Sensitive execution additionally requires account scope, provider readiness, policy checks, idempotency, human approval, redaction, and audit proof.
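The rule above can be read as a fail-closed gate: an action runs only when every listed precondition has been explicitly satisfied. The following is a minimal Python sketch of that idea, not Trinity's implementation. The check names, `GateResult`, and `evaluate_gate` are hypothetical and chosen only to mirror the list in the paragraph.

```python
from dataclasses import dataclass

# Hypothetical check names mirroring the preconditions listed above.
REQUIRED_CHECKS = (
    "account_scope",
    "provider_readiness",
    "policy_check",
    "idempotency_key",
    "human_approval",
    "redaction",
    "audit_proof",
)


@dataclass(frozen=True)
class GateResult:
    allowed: bool
    missing: tuple


def evaluate_gate(passed_checks: dict) -> GateResult:
    """Fail closed: a check counts only if its value is exactly True."""
    missing = tuple(
        name for name in REQUIRED_CHECKS if passed_checks.get(name) is not True
    )
    return GateResult(allowed=not missing, missing=missing)
```

Treating anything other than `True` as a failure means an absent, malformed, or truthy-but-wrong value (such as the string "yes") blocks execution instead of letting it through.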

Security is layered because the threat model is layered. The system must protect money, email reputation, private documents, inbound poisoned content, provider credentials, model context, account/team boundaries, generated skills, and public proof surfaces at the same time.

Control matrix

Risk	Control	Evidence
Indirect prompt injection	Quarantine and Jido-backed content firewall	Release audit and firewall decision metadata
Poisoned skill creation	Skills only created from scoped, reviewed, source-backed work	Skill scope, source run, review status, promotion proof
Cross-tenant memory leakage	Project/account/global-core Hermes scope constraints	`hermes_skills` indexes and `hermes_agent_profiles` shape checks
Unsafe outbound email	Gmail readiness, alias, suppression, safety review, approval	Draft, ModelDecision, ApprovalEvent, ToolCall
Unauthorized spend	Account caps, spend policy, approval, idempotency	ApprovalRequest, ToolCall, ledger entry
Provider secrets leakage	Credential vault path, redaction, no secrets in assigns/logs/proof	Safe readiness JSON and audit summaries
Tool bypass	ToolRouter as side-effect boundary	ToolCall status and AuditEvent
Model ambiguity	Structured schemas and ModelDecision records	Validated decision status and route proof
Webhook replay	Signed event verification and event processing records	StripeWebhookEvent and idempotent order/revenue writes
Browser-side authority	LiveView server state and context-level authorization	Protected routes and scoped context calls
Fake/demo proof confusion	Sample/fallback/live labels in records and UI	Ledger/status labels and docs disclosure
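As an illustration of the "Webhook replay" row, the sketch below combines signature verification with an idempotent event record. It is a generic HMAC-SHA256 example under stated assumptions: `WebhookProcessor`, its in-memory `_seen` set, and the return labels are hypothetical, and Stripe's real signature scheme differs in details (it also covers a timestamp) and should be checked with Stripe's official library.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    """Recompute the HMAC and compare in constant time."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


class WebhookProcessor:
    """Hypothetical processor: verify first, then apply each event at most once."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._seen = set()  # stands in for a persistent event-processing table
        self.revenue_cents = 0

    def handle(self, event_id: str, payload: bytes, signature_hex: str,
               amount_cents: int) -> str:
        if not verify_signature(self._secret, payload, signature_hex):
            return "rejected"
        if event_id in self._seen:
            return "duplicate"  # replayed event: no second revenue write
        self._seen.add(event_id)
        self.revenue_cents += amount_cents
        return "processed"
```

A replayed but validly signed event is recognized by its ID and changes nothing, which is what makes the revenue write idempotent.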

Human-in-the-middle content release

Inbound emails, uploaded documents, and attachments remain metadata-only until a human reviews and releases them. Hermes can see that an object exists but cannot read its original text before release. This protects against malicious instructions hidden in email bodies, PDFs, images, encoded text, or copied documents.

Content state	Hermes access	Operator action
Quarantined inbound email	Metadata, sender, subject, source ID	Review and release selected content
Quarantined attachment	Filename, type, hash, storage ref	Preview/download and release if safe
Released document	Bounded content or source ref	Use in project context and artifacts
Generated artifact	Artifact summary and proof links	Approve/share/download
Skill candidate	Summary, diff, source run	Approve, promote, retire, or delete
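The quarantine states above can be sketched as a view function that exposes only metadata until an operator releases the content. This is an illustrative Python sketch, not Trinity's content firewall: `InboundEmail`, `agent_view`, `release`, and the `MAX_RELEASED_CHARS` bound are all hypothetical names and values.

```python
from dataclasses import dataclass

MAX_RELEASED_CHARS = 2000  # assumed bound for "bounded content"


@dataclass
class InboundEmail:
    source_id: str
    sender: str
    subject: str
    body: str
    released: bool = False


def agent_view(email: InboundEmail) -> dict:
    """Return what an agent may see: metadata always, body only after release."""
    view = {
        "source_id": email.source_id,
        "sender": email.sender,
        "subject": email.subject,
        "state": "released" if email.released else "quarantined",
    }
    if email.released:
        view["body"] = email.body[:MAX_RELEASED_CHARS]
    return view


def release(email: InboundEmail, reviewer: str) -> dict:
    """Operator action: mark content released and return an audit record."""
    if not reviewer:
        raise ValueError("release requires a named human reviewer")
    email.released = True
    return {"event": "content_released", "source_id": email.source_id,
            "reviewer": reviewer}
```

Because the body is never copied into the quarantined view, an instruction hidden in the email text has no path into the agent's context until a named reviewer has acted.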

Redaction posture

Public docs and proof surfaces expose:

Record IDs, statuses, timestamps, hashes, provider labels, and summaries.
Source paths and code modules for audit.
Approval outcomes and policy denial reasons.

They do not expose:

Credential values.
Private customer payloads.
Raw model context.
Hidden model reasoning.
Unreleased inbound email or document content.
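One way to hold this posture is an allowlist: copy only fields known to be safe and drop everything else, so a newly added private field is hidden by default. The sketch below assumes hypothetical field names (`PUBLIC_FIELDS`) and a hypothetical `public_proof` helper. It is not the contents of Trinity's redaction module.

```python
# Hypothetical field names derived from the "expose" list above.
PUBLIC_FIELDS = frozenset({
    "record_id", "status", "timestamp", "hash", "provider",
    "summary", "source_path", "approval_outcome", "denial_reason",
})


def public_proof(record: dict) -> dict:
    """Allowlist projection: unknown or private keys never reach the output."""
    return {key: value for key, value in record.items() if key in PUBLIC_FIELDS}
```

For example, a record carrying `api_key` or `raw_model_context` alongside `record_id` and `status` would come back with only the latter two keys.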

Approval and action boundaries

Action	Default	Required before execution
Gmail draft creation	Allowed only through configured Gmail path	OAuth mailbox, alias readiness, policy check
Gmail send/reply	Blocked by default	`LIVE_OUTREACH`, alias, suppression check, safety review, ApprovalEvent
Stripe checkout/revenue	Allowed when configured for public offer	Signed webhook and idempotent processing
Agent spend/provisioning	Blocked by default	`LIVE_SPEND`, cap check, approval, idempotency
Tool execution	Blocked if direct or malformed	ToolRouter policy and adapter readiness
Model review	Required for risky copy/deliverables	Valid ModelDecision schema and fail-closed parsing
Secret changes	Never autonomous	Account owner action outside agent authority
Refunds/disputes/legal incidents	Never autonomous	Human/manual escalation
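The table above describes a default-deny policy. The Python sketch below models a few rows under stated assumptions: the action keys, the precondition labels, and the `decide` function are hypothetical, and the `LIVE_OUTREACH` and `LIVE_SPEND` flags are treated as plain booleans because the source does not say how they are configured.

```python
# Hypothetical encoding of selected rows from the table above.
POLICY = {
    "gmail_send": {
        "flag": "LIVE_OUTREACH",
        "requires": ("alias", "suppression_check", "safety_review",
                     "approval_event"),
    },
    "agent_spend": {
        "flag": "LIVE_SPEND",
        "requires": ("cap_check", "approval", "idempotency_key"),
    },
}
NEVER_AUTONOMOUS = frozenset(
    {"secret_change", "refund", "dispute", "legal_incident"}
)


def decide(action: str, flags: dict, satisfied: set) -> tuple:
    """Return (decision, reason); anything not explicitly allowed is denied."""
    if action in NEVER_AUTONOMOUS:
        return ("escalate", "human-only action")
    rule = POLICY.get(action)
    if rule is None:
        return ("deny", "unknown action")
    if not flags.get(rule["flag"], False):
        return ("deny", rule["flag"] + " is off")
    missing = [name for name in rule["requires"] if name not in satisfied]
    if missing:
        return ("deny", "missing: " + ", ".join(missing))
    return ("allow", "all preconditions met")
```

Unknown actions and unset flags both resolve to a denial, so a misconfiguration blocks the action and reports a reason that can be recorded as a policy denial.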

Credential and readiness boundaries

Trinity separates provider readiness from secret values. UI surfaces can show whether NVIDIA, Stripe, Gmail, Hermes, Tigris, or other providers are configured or missing, but secret values never appear in LiveView assigns, logs, readiness JSON, docs, ToolCall rows, prompts, model context, or public proof.

Provider family	Safe public/readiness data	Private data
NVIDIA	Route name, provider status, model label, decision status	API key, raw prompt context, private payload
Stripe	Mode, webhook status, event IDs, checkout IDs	Secret key, webhook signing secret
Gmail	Mailbox status, alias, thread/message IDs	OAuth refresh token, raw unreleased inbound body
Hermes	Runtime health, profile scope, skill scope	Runtime credentials, private memory content
Storage	Object key/hash/status	Unreleased private document body
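The separation between readiness and secret values can be illustrated with a report that records only whether each provider has a credential, never the credential itself. This is a hedged Python sketch: the `readiness` function and its output shape are assumptions, not Trinity's actual readiness JSON.

```python
import json


def readiness(providers: dict) -> str:
    """Map provider name -> secret (or None) to a status-only JSON report."""
    report = {
        name: {"status": "configured" if secret else "missing"}
        for name, secret in providers.items()
    }
    # The secret is used only for the truthiness check and is never serialized.
    return json.dumps(report, sort_keys=True)
```

Because the report is built from status strings alone, the serialized output is safe to show in a UI or log even if the input mapping holds live credentials.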

Primary source links

Official references

System	Use in Trinity	Official docs
Gmail	OAuth scope and send/draft boundaries	Gmail API scopes
Gmail drafts	Draft creation and send flow	Gmail drafts guide
Stripe	Signed webhook verification	Stripe webhooks
Jido	Typed firewall action seam	Jido Action

Source paths

docs/security/threat-model.md
docs/security/ai-governance-posture.md
lib/autonomous_agency/tools/tool_router.ex
lib/autonomous_agency/approvals.ex
lib/autonomous_agency/audit.ex
lib/autonomous_agency/security/redaction.ex
lib/autonomous_agency/security/content_firewall.ex
