Agentic HITL Flywheel
Reusable human-in-the-loop security architecture behind Trinity's acquisition workflow.
Executive summary
Trinity's acquisition product is the demo surface. The reusable primitive underneath is an Agentic Human-in-the-Loop Flywheel: Phoenix/BEAM owns authority, Postgres owns truth, Oban makes work durable, LiveView/PubSub makes state visible, Hermes proposes work, Jido blocks unsafe or unreleased context, NVIDIA/Nemotron adds structured model review, ToolRouter executes only approved external effects, and proof packets make the resulting operation inspectable.
This page is written for AI security reviewers evaluating whether an agent can reason, learn, and operate without becoming the authority layer. The answer in Trinity is deliberately asymmetric: agents can propose plans, drafts, skills, and action intents; Phoenix contexts, database constraints, policy modules, approval records, and ToolRouter decide what can persist or execute.
Short version
| Short version |
|---|
| Hermes proposes. |
| Jido checks content provenance. |
| NVIDIA reviews risk. |
| A human edits or approves. |
| ToolRouter executes. |
| Phoenix/Postgres records proof. |
| Hermes learns from the outcome. |
Why normal HITL is not enough
Basic HITL usually means "show a human a draft before sending." That is insufficient for agentic systems because the danger is not only the final send. Risk enters through retrieved documents, inbound email, attachments, skill memory, tool payloads, model outputs, payment links, account scope, and cross-tenant reuse.
| Normal HITL gap | Agentic risk | Trinity control |
|---|---|---|
| Reviews only final output | Prompt-injected email or document content can shape hidden intermediate steps | Jido-backed ContentFirewall blocks raw or unreleased content before hosted Hermes receives it |
| Human approves a vague action | Agent intent can drift from the approved payload | ApprovalRequest stores provider, operation, resource, amount, risk, idempotency key, and decision events |
| Tool calls are direct | Model/session state can execute side effects without audit | ToolRouter is the single external side-effect boundary |
| Model result is opaque | Safety review cannot be inspected later | ModelDecision stores structured output, route metadata, hashes, risk, and rationale summary |
| Memory is global | Skills learned from one tenant can leak into another | Hermes skills and profiles are scoped by project, account, or immutable global-core visibility |
| UI state is authority | Browser state can hide retry/failure behavior | Postgres rows, Oban jobs, RunEvents, ToolCalls, AuditEvents, and proof packets are authority |
Full flywheel
Production path
User intent
-> Phoenix project workspace
-> Scoped context pack
-> Hermes plan or action intent
-> Jido/content-firewall and provenance checks
-> Deterministic policy review
-> NVIDIA/Nemotron risk, safety, scoring, or QA review where useful
-> Human edit or approval
-> ToolRouter execution
-> Postgres audit, ledger, RunEvent, ToolCall, ModelDecision, and proof-packet records
-> Hermes memory or skill update
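The ordering above can be read as a chain of gates in which any denial stops the run before ToolRouter is reached. The following is a minimal Python sketch of that idea, not Trinity's Elixir implementation. `Intent`, `run_pipeline`, the gate functions, and the `raw_body` and `approved` payload fields are all hypothetical names chosen for illustration, and only two of the stages are modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Intent:
    action: str
    payload: dict
    events: List[str] = field(default_factory=list)  # stand-in for RunEvent rows


# A gate returns (allowed, reason). All gate names here are illustrative.
Gate = Callable[[Intent], Tuple[bool, str]]


def content_firewall(intent: Intent) -> Tuple[bool, str]:
    # Assumed rule: raw inbound text must never ride along in the payload.
    if "raw_body" in intent.payload:
        return False, "raw content in payload"
    return True, "ok"


def human_approval(intent: Intent) -> Tuple[bool, str]:
    # Only an explicit True counts; a missing or merely truthy value does not.
    if intent.payload.get("approved") is True:
        return True, "ok"
    return False, "approval missing"


GATES: List[Tuple[str, Gate]] = [
    ("content_firewall", content_firewall),
    ("human_approval", human_approval),
]


def run_pipeline(intent: Intent, gates: List[Tuple[str, Gate]] = GATES) -> bool:
    for name, gate in gates:
        try:
            allowed, reason = gate(intent)
        except Exception as exc:  # a crashing gate denies; it never allows
            allowed, reason = False, f"gate error: {exc}"
        intent.events.append(f"{name}:{'allow' if allowed else 'deny'}:{reason}")
        if not allowed:
            return False
    intent.events.append("tool_router:execute")  # reached only if every gate allowed
    return True
```

Every gate outcome is appended to the event list whether it allows or denies, so a blocked run still leaves a record of where it stopped.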
Control plane and durable truth
| Layer | Role in flywheel | Implemented posture |
|---|---|---|
| Phoenix/BEAM | Control plane, route protection, LiveView operator sessions, PubSub visibility, supervision tree | Implemented for public docs, account/project workspaces, CRM, inbox, approvals, ledger, status, documents, and runtime surfaces |
| Context modules | Business authority for revenue, CRM, agency/project state, approvals, email, Hermes, audit, and integrations | Implemented; UI and controllers call contexts rather than writing raw database state |
| Postgres/Ecto | Durable source of truth with UUID schemas, changesets, foreign keys, unique constraints, check constraints, and account/project scope | Implemented across approvals, CRM, orders, revenue, Gmail, documents, ToolCalls, RunEvents, ModelDecisions, skills, and proof records |
| Oban | Durable background orchestration and retries for runtime/provider work | Implemented for Hermes runs, Gmail polling/sending, skill promotion, and operational worker slices |
| LiveView/PubSub | Real-time human visibility into approvals, runs, inbox, CRM, documents, ledger, and status | Implemented UI pattern; browser observes server state and does not own authority |
The BEAM matters because agent operations are concurrent, interruptible, and failure-prone. LiveViews can keep operators in the loop while workers retry, supervisors restart processes, and Postgres remains the durable authority.
Hermes role
Hermes proposes, plans, drafts, reasons over released context, and learns reusable skills. Trinity wraps that work in project/account scope. Hermes does not receive raw secrets, OAuth tokens, unreleased inbound content, or authority to approve high-risk actions.
| Hermes object | Source-backed implementation | Security boundary |
|---|---|---|
| Project chat and context pack | Hermes.Message, Hermes.Session, Hermes.ContextPack, Hermes.ProjectContext, Hermes.ProjectTools | Context is constructed from scoped database state |
| Hosted runtime task | AI.HermesRuntime, AI.HermesRuntimeClient, Workers.HermesRunWorker, Tools.Hermes | Runtime submission passes content firewall and ToolRouter policy |
| Action intent | Hermes.ActionIntent and approval conversion paths | High-risk intent becomes an approval request before execution |
| Memory and skills | Hermes.AgentMemory, Hermes.MemoryStore, Hermes.Skill, SkillRegistry, SkillPromotionWorker | Project/account/global-core visibility prevents accidental global reuse |
| Read-only project tools | Hermes.ProjectTools and TrinitySales.capability_pack/2 | Sales capabilities describe possible actions; they do not mutate state or execute side effects |
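The project/account/global-core visibility rule for skills can be sketched as a single predicate. This is an illustrative Python sketch under assumptions: the `Skill` fields, the visibility strings, and the `visible_to` helper are hypothetical and do not describe the actual Hermes.Skill schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a skill's scope cannot be mutated after creation
class Skill:
    name: str
    visibility: str                    # assumed values: "project", "account", "global_core"
    account_id: Optional[str] = None
    project_id: Optional[str] = None


def visible_to(skill: Skill, account_id: str, project_id: str) -> bool:
    """Return True only if the skill may be loaded into this project's context."""
    if skill.visibility == "global_core":
        return True
    if skill.visibility == "account":
        return skill.account_id == account_id
    if skill.visibility == "project":
        return skill.account_id == account_id and skill.project_id == project_id
    return False  # unknown visibility fails closed
```

A skill learned in one tenant's project is therefore invisible to another tenant unless it has been promoted to the global-core tier.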
See Hermes Integration and Vendor map.
Jido and content firewall role
Jido is intentionally narrow. It is not the business authority layer. Trinity uses it as a typed action seam for content and provenance checks before hosted Hermes receives payloads that may include references to documents, email, or attachments.
| Check | Implemented module | Fail-closed behavior |
|---|---|---|
| Raw inbound content scan | Security.ContentFirewall.StaticScan | Blocks fields such as raw email/document/body text from runtime payloads |
| Jido typed action | Security.ContentFirewall.PayloadCheckAction | Missing Jido dependency returns a denial, not a soft allow |
| Project document provenance | Security.ContentFirewall.validate_ref/2 for project_document:<uuid> refs | Blocks missing, cross-account, metadata-only, or malformed release refs |
| Email message provenance | EmailMessage release status and scope checks | Blocks cross-account or malformed released email refs |
| Attachment provenance | EmailAttachment release status and scope checks | Blocks unreleased or metadata-only attachment text from runtime handoff |
| ToolRouter policy | Policies.ContentFirewallPolicy | Blocks hosted-Hermes create-plan and skill-event calls carrying unsafe payloads |
This creates a human-governed content release boundary for indirect prompt injection defense: inbound content can exist in the system, but until a human releases it, Hermes should see only metadata or explicitly released content refs.
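The provenance check for project_document:<uuid> refs can be illustrated with a small fail-closed validator. This Python sketch is an assumption-laden stand-in for Security.ContentFirewall.validate_ref/2, not a port of it: the in-memory `docs` mapping, the `content_status` field, its `"released"` value, and the reason strings are all hypothetical.

```python
import uuid
from typing import Dict, Tuple


def validate_ref(ref: str, account_id: str, docs: Dict[str, dict]) -> Tuple[bool, str]:
    """Allow a ref only if it is well formed, in scope, and explicitly released."""
    prefix, sep, raw_id = ref.partition(":")
    if prefix != "project_document" or not sep:
        return False, "malformed_ref"
    try:
        doc_id = str(uuid.UUID(raw_id))  # normalizes to canonical lowercase form
    except ValueError:
        return False, "malformed_ref"

    doc = docs.get(doc_id)  # hypothetical lookup; a real system would query the database
    if doc is None:
        return False, "missing"
    if doc["account_id"] != account_id:
        return False, "cross_account"
    if doc["content_status"] != "released":  # metadata-only or quarantined content stays out
        return False, "not_released"
    return True, "ok"
```

The only path that returns an allow is the last line; every earlier exit is a denial with a reason, which is what fail-closed means in practice.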
See Jido, Security governance, and Documents and artifacts.
NVIDIA and Nemotron role
NVIDIA/Nemotron is used for structured review, not invisible authority. ModelRouter chooses official NVIDIA routes by default, keeps fallback aggregators disabled unless explicitly enabled, sanitizes payloads, stores hashes, and records ModelDecision evidence.
| Use case | Route/proof object | Current posture |
|---|---|---|
| Lead scoring and classification | Tools.Nemotron, ModelRouter.record_decision/1, ModelDecision | Implemented structured decision path |
| Draft safety review | Email.DraftSafetyReview -> ToolRouter.execute(:nemotron, :evaluate_outbound_copy, ...) | Implemented fail-closed behavior for unavailable, invalid, or unsafe result |
| High-risk action review | Approval metadata can link model_decision_id | Implemented evidence query helpers; specific policy coverage varies by action type |
| Deliverable QA | Revenue.DeliverableQA and deliverable metadata | Implemented for review-gated acquisition snapshot deliverables |
| Hidden reasoning posture | ModelDecision rationale summary, hashes, structured output | Implemented; hidden reasoning is not stored |
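The fail-closed behavior listed for draft safety review (unavailable, invalid, or unsafe results all block the draft) can be sketched as follows. This is a hypothetical Python illustration, not the Email.DraftSafetyReview code: `review_draft`, the `call_model` callable, the `risk` field, and the low/medium/high label set are assumptions. The decision stores a hash of the input, not the text itself, in the spirit of ModelDecision.

```python
import hashlib
from typing import Callable, Optional

RISK_LABELS = ("low", "medium", "high")  # assumed label set


def review_draft(draft_text: str, call_model: Callable[[str], Optional[dict]]) -> dict:
    # Start from a denial; only a well-formed low-risk result can flip it.
    decision = {
        "input_hash": hashlib.sha256(draft_text.encode("utf-8")).hexdigest(),
        "allowed": False,
        "reason": "unavailable",
        "risk": None,
    }
    try:
        result = call_model(draft_text)
    except Exception:
        return decision                      # provider error or timeout
    if result is None:
        return decision                      # no response at all
    if not isinstance(result, dict) or result.get("risk") not in RISK_LABELS:
        decision["reason"] = "invalid"       # response does not match the expected shape
        return decision
    decision["risk"] = result["risk"]
    if result["risk"] != "low":
        decision["reason"] = "unsafe"
        return decision
    decision.update(allowed=True, reason="ok")
    return decision
```

Because the default is a denial, a new failure mode that the code does not anticipate still blocks the draft.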
See NVIDIA, ModelRouter, and AI governance posture.
ToolRouter role
ToolRouter is the audited side-effect membrane. External effects should pass through AutonomousAgency.Tools.ToolRouter or an explicitly equivalent audited boundary. The router resolves account scope, enforces idempotency, runs policy modules, records denials, executes adapters, creates ToolCall rows, links model decisions where applicable, and appends audit/sandbox proof.
| ToolRouter gate | Purpose |
|---|---|
| Scoped idempotency key | Prevents replayed duplicate external actions |
| Account/run scope resolution | Blocks mismatched account IDs between run, opts, and payload |
| Replay/demo/sandbox/runtime policies | Keeps sample, replay, and live execution modes distinct |
| Content firewall policy | Prevents unsafe Hermes payload handoff |
| Sensitive action/spend/outbound policies | Escalates or blocks high-risk side effects |
| Adapter allowlist | Limits external operations to configured provider adapters |
| ToolCall/AuditEvent proof | Makes successful, failed, and denied actions inspectable |
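Several of the gates above (scoped idempotency, account scope resolution, adapter allowlist, policy modules, and recorded denials) can be combined in one small router. The Python sketch below is illustrative only and makes no claim about AutonomousAgency.Tools.ToolRouter internals: `ToolRouterSketch`, its in-memory stores, the policy signature, and the reason strings are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Policy = Callable[[str, str, dict], Optional[str]]  # returns a denial reason or None


class ToolRouterSketch:
    def __init__(self, adapters: Dict[Tuple[str, str], Callable[[dict], dict]],
                 policies: List[Policy]):
        self.adapters = adapters   # allowlist: only these (provider, operation) pairs can run
        self.policies = policies
        self.tool_calls: Dict[Tuple[str, str], dict] = {}  # stand-in for ToolCall rows
        self.audit: List[dict] = []                        # stand-in for AuditEvent rows

    def execute(self, provider: str, operation: str, payload: dict, *,
                account_id: str, idempotency_key: str) -> dict:
        key = (account_id, idempotency_key)  # idempotency is scoped to the account
        if key in self.tool_calls:
            return self.tool_calls[key]      # replay returns the record; no second side effect

        reason = self._deny_reason(provider, operation, payload, account_id)
        call = {"provider": provider, "operation": operation,
                "status": "denied", "reason": reason, "result": None}
        if reason is None:
            try:
                call["result"] = self.adapters[(provider, operation)](payload)
                call["status"] = "succeeded"
            except Exception as exc:
                call["status"], call["reason"] = "failed", str(exc)

        # Succeeded, failed, and denied calls are all recorded.
        self.tool_calls[key] = call
        self.audit.append({"key": key, "status": call["status"], "reason": call["reason"]})
        return call

    def _deny_reason(self, provider: str, operation: str, payload: dict,
                     account_id: str) -> Optional[str]:
        if payload.get("account_id", account_id) != account_id:
            return "account_scope_mismatch"
        if (provider, operation) not in self.adapters:
            return "adapter_not_allowlisted"
        for policy in self.policies:
            reason = policy(provider, operation, payload)
            if reason is not None:
                return reason
        return None
```

One design choice worth noting in this sketch: denials are stored under the same idempotency key as successes, so retrying a denied call returns the recorded denial and does not re-run the policy checks.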
See ToolRouter and Live action gates .
Human approval role
Humans are not a decorative checkbox. The approval system is durable, account-scoped, append-only at the event layer, and tied to role permissions. ApprovalRequest records the action type, provider, operation, resource, amount, currency, risk summary, idempotency key, linked run/tool/model records, and decision user. ApprovalEvent records requested, approved, rejected, or expired transitions with unique idempotency.
| Approval type examples | Why it escalates |
|---|---|
| outreach_launch, followup_send | Real outbound email can affect reputation, compliance, and user trust |
| spend, agent_purchase, saas_provisioning | Money movement and procurement need explicit caps and human authority |
| credential_change, security_exception, autonomy_mode_change | Security posture changes must not be delegated to the agent |
| artifact_delivery | Customer-facing outputs need review, safety evidence, and delivery proof |
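The append-only event layer with unique idempotency can be illustrated with database constraints doing the enforcement. The sketch below uses Python's built-in sqlite3 as a stand-in for Postgres/Ecto; the table layout, column names, and helper functions are hypothetical and are not Trinity's ApprovalEvent schema. Only the four transition names come from the text above.

```python
import sqlite3
from typing import Optional


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE approval_events (
            id INTEGER PRIMARY KEY,
            approval_request_id TEXT NOT NULL,
            event_type TEXT NOT NULL
                CHECK (event_type IN ('requested', 'approved', 'rejected', 'expired')),
            idempotency_key TEXT NOT NULL UNIQUE
        )
    """)
    return db


def append_event(db: sqlite3.Connection, request_id: str, event_type: str,
                 idempotency_key: str) -> bool:
    """Insert one event. The code never updates or deletes rows."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO approval_events (approval_request_id, event_type, idempotency_key) "
                "VALUES (?, ?, ?)",
                (request_id, event_type, idempotency_key),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate key or unknown transition; the database refused it


def current_status(db: sqlite3.Connection, request_id: str) -> Optional[str]:
    row = db.execute(
        "SELECT event_type FROM approval_events "
        "WHERE approval_request_id = ? ORDER BY id DESC LIMIT 1",
        (request_id,),
    ).fetchone()
    return row[0] if row else None
```

The point of the sketch is that the duplicate and the unknown transition are rejected by the database itself, so a buggy or compromised caller cannot bypass the rule by skipping an application-level check.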
See Approvals and Security governance .
Gmail, inbound quarantine, and draft gates
Gmail is account-connected and project-selected. The platform owns one Google OAuth app, each account connects its own mailbox, each project selects an approved account alias, and OAuth tokens stay behind vault refs.
| Flow | Implemented controls |
|---|---|
| Inbound email and attachment ingestion | Email.ThreadIngestor, EmailMessage, EmailAttachment; content status and release scope track quarantine/release posture |
| Human release before runtime | Content firewall validates released refs and blocks unreleased/raw content handoff |
| Draft creation | EmailDraft, DraftSafetyReview, ToolRouter/Nemotron safety, approval-pending status |
| Gmail draft/send | Tools.GmailAdapter, Email.GmailClient, ProjectGmailSetting, SendingAlias |
| Live send | Requires live outreach setting, account mailbox, project alias, suppression/policy checks, approval, idempotency, and ToolRouter proof |
See Gmail outreach and Security governance .
Proof and audit chain
Proof packets turn the flywheel into an inspectable graph. Governance.ProofPacket.for_run/1 gathers run events, state snapshots, sandbox checks, tool calls, model decisions, approvals, approval events, audit events, skills, chat messages, action intents, revenue events, orders, order intakes, deliverables, CRM records, email drafts, artifacts, and documents. Missing proof is reported as degraded; it is not fabricated.
| Proof record | What it proves |
|---|---|
| RunEvent | Timeline event with sequence, mode, idempotency key, and proof hash |
| ToolCall | Provider, operation, request/response summary, policy result, sandbox posture, status |
| ModelDecision | Structured model review with route, provider, schema, risk, hashes, rationale summary |
| ApprovalRequest / ApprovalEvent | Human authority and decision history |
| AuditEvent | System action history with proof hashes and sanitized metadata |
| RevenueEvent / Order | Money-to-work chain for paid runs |
| CRM records | Company/contact/deal/activity source of truth |
| EmailDraft and artifacts | Review-gated outputs and customer-facing deliverables |
| Hermes.Skill | Learned/reused capability with scope and promotion state |
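The rule that missing proof is reported as degraded, not fabricated, can be sketched as a packet builder that hashes what exists and names what does not. This Python sketch is hypothetical and far smaller than Governance.ProofPacket.for_run/1: the `REQUIRED` set, the packet shape, and the hashing scheme (SHA-256 over canonical JSON) are assumptions made for illustration.

```python
import hashlib
import json
from typing import Dict, List

# Assumed minimum set of record kinds a complete packet must contain.
REQUIRED = ("run_events", "tool_calls", "approvals")


def proof_hash(record: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal records hash equally.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_packet(run_id: str, records: Dict[str, List[dict]]) -> dict:
    sections: Dict[str, List[dict]] = {}
    missing: List[str] = []
    for kind in REQUIRED:
        rows = records.get(kind) or []
        if not rows:
            missing.append(kind)  # report the gap; never synthesize a placeholder row
        sections[kind] = [{"hash": proof_hash(row), "record": row} for row in rows]
    return {
        "run_id": run_id,
        "status": "degraded" if missing else "complete",
        "missing": missing,
        "sections": sections,
    }
```

A reviewer reading a degraded packet sees exactly which record kinds are absent and can treat the run accordingly.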
See Audit ledger and Data flow .
Example: snapshot request to proof packet
| Step | System action | Human/agent boundary | Proof |
|---|---|---|---|
| Public request | SnapshotRequestController calls WebIntake.create_snapshot_request/2 | Honeypot rejects bots; owner intake must be configured | WebIntakeEvent, CRM company/contact/deal/activity |
| Trinity Growth | WebIntake.bootstrap_trinity_growth_project/1 creates or repairs owner project | Owner team sees the project; outside users do not sell snapshots | Project settings and project memberships |
| Lead review | TrinitySales.lead_context/3 and capability_pack/2 expose read-only capabilities | Hermes may propose next steps but does not mutate or send | Safe lead context, source refs |
| Agent proposal | Hermes drafts plan/action intent from scoped context | High-risk actions become approval requests | Hermes message/action intent, RunEvent |
| Outreach or reply | Draft passes deterministic copy policy and NVIDIA/Nemotron review | Owner edits or approves before send | EmailDraft, ModelDecision, ApprovalRequest |
| Stripe payment | Approved checkout action uses Stripe rail or configured payment link; webhook verifies paid checkout | Access is granted only after verified paid snapshot proof | CheckoutSession, Order, RevenueEvent, Deal |
| Buyer access | Paid snapshot buyer email can receive a magic link | Buyer completes their own account setup; proof remains scoped | Access token delivery metadata, account context |
| Delivery | Snapshot deliverable/artifact is reviewed, QA-scored, and delivered through signed links where applicable | Human-reviewed delivery only | Deliverable, ModelDecision QA, ProjectArtifact, proof packet |
This is not a hardcoded sales funnel. TrinitySales is a read-only capability contract. The live work remains available to Hermes through scoped project context, approvals, ToolRouter, Stripe, Gmail, CRM, and proof primitives.
Implemented vs planned/future posture
| Status label | Meaning |
|---|---|
| Implemented | Shipped source-backed behavior with local test coverage |
| Implemented foundation | Core data model, context, UI, worker, or adapter path exists; live operational proof may depend on deployment/provider configuration |
| Policy-ready | Controls, caps, declarations, or approval seams exist, but autonomous live execution is intentionally gated |
| Configured where credentials exist | Works when the relevant live provider keys, OAuth connection, webhook, or alias configuration is present |
| Live-provider dependent | Requires deployed credentials, provider availability, and smoke evidence before claiming live production operation |
| Future / planned | Architecture-supported but not claimed as complete live enforcement |
| Area | Current state |
|---|---|
| Phoenix project/account control plane | Implemented |
| Public docs, llms.txt, sitemap, Markdown routes | Implemented |
| Project-scoped Hermes profiles, memory, skills, and action-intent records | Implemented foundation |
| Hosted Hermes runtime submission | Implemented foundation; live provider smoke is live-provider dependent |
| Jido content firewall | Implemented narrow fail-closed payload/provenance seam |
| Inbound email/document quarantine and release | Implemented data model and firewall checks; operational workflows continue to expand |
| NVIDIA ModelRouter and ModelDecision proof | Implemented with official-route preference and fail-closed safety review paths |
| Gmail OAuth, alias selection, draft/send adapter | Configured where credentials exist; live send remains gated by alias, policy, approval, and live outreach settings |
| Stripe Checkout/webhook/revenue proof | Configured where credentials exist; live payment proof depends on Stripe keys, webhook configuration, and deployed smoke |
| Buyer upgrade magic link after paid snapshot | Implemented foundation for verified paid snapshot orders |
| Agent autonomous spend and SaaS provisioning | Policy-ready and skill-declared; real spend remains approval/cap/live-config gated |
| Generalized enterprise use beyond acquisition | Architecture-supported; acquisition is the flagship demo surface |
What this is not
- Not a generic approval button.
- Not a chatbot with direct tool access.
- Not a replacement for deterministic policy.
- Not a claim that models are authority.
- Not a claim that every runtime is already fully sandbox-enforced.
- Not a system that exposes raw secrets, raw emails, hidden reasoning, or unreleased content to the agent.
Why this generalizes beyond acquisition
The acquisition workflow exercises the hard parts: untrusted inbound content, CRM state, payment, outbound email, model review, human approval, provider tools, generated deliverables, skill reuse, and proof. The same flywheel can govern customer support, security triage, procurement, finance ops, research workflows, compliance review, and enterprise back-office agents because the reusable primitive is not "sales automation." It is scoped intent -> reviewed plan -> typed policy -> human authority -> audited execution -> durable memory.
Open-source extraction path
The full Trinity product is a commercial acquisition system, but the reusable primitive can be extracted as an open-source reference architecture: action intents, approval records, policy decisions, ToolRouter interfaces, content-firewall checks, proof-event schemas, and sample adapters.
Public-safety disclosure
This page is public-safe. It names modules, routes, schemas, and architecture boundaries without exposing credential values, OAuth tokens, private customer data, unreleased inbound content, raw model context, or hidden reasoning. Model and tool records store structured summaries, hashes, routes, risk labels, idempotency keys, and proof links rather than private payload dumps.
Official references
| System | Use in Trinity | Official docs |
|---|---|---|
| Hermes | Hosted agent runtime and skills context | Hermes Agent docs |
| Jido | Narrow policy/firewall action seam | Jido Actions and Workflows |
| NVIDIA | Nemotron scoring, safety, and QA decisions | NIM LLM API reference |
| Stripe | Checkout, webhooks, revenue proof, guarded spend | Checkout Sessions API |
| Gmail | Drafts, sends, aliases, scopes, inbound replies | Gmail API scopes |
| Phoenix/Oban/Postgres | Control plane, durable jobs, source-of-truth data | Phoenix LiveView |
Source paths
- docs/ARCHITECTURE.md
- docs/security/ai-governance-posture.md
- docs/security/threat-model.md
- lib/autonomous_agency_web/router.ex
- lib/autonomous_agency/application.ex
- lib/autonomous_agency/approvals.ex
- lib/autonomous_agency/approvals/approval_request.ex
- lib/autonomous_agency/approvals/approval_event.ex
- lib/autonomous_agency/tools/tool_router.ex
- lib/autonomous_agency/security/content_firewall.ex
- lib/autonomous_agency/policies/content_firewall_policy.ex
- lib/autonomous_agency/jido_runtime.ex
- lib/autonomous_agency/ai/model_router.ex
- lib/autonomous_agency/ai/model_decision.ex
- lib/autonomous_agency/email/draft_safety_review.ex
- lib/autonomous_agency/email/thread_ingestor.ex
- lib/autonomous_agency/tools/gmail_adapter.ex
- lib/autonomous_agency/web_intake.ex
- lib/autonomous_agency/trinity_sales.ex
- lib/autonomous_agency/revenue.ex
- lib/autonomous_agency/revenue/stripe_autonomous_skills.ex
- lib/autonomous_agency/governance/proof_packet.ex