Agentic HITL Flywheel

Reusable human-in-the-loop security architecture behind Trinity's acquisition workflow.

Status: implemented | Version: latest | Review: source-backed | Last scanned: 2026-06-25T00:00:00Z | Review required: false

Executive summary

Trinity's acquisition product is the demo surface. The reusable primitive underneath is an Agentic Human-in-the-Loop Flywheel: Phoenix/BEAM owns authority, Postgres owns truth, Oban makes work durable, LiveView/PubSub makes state visible, Hermes proposes work, Jido blocks unsafe or unreleased context, NVIDIA/Nemotron adds structured model review, ToolRouter executes only approved external effects, and proof packets make the resulting operation inspectable.

This page is written for AI security reviewers evaluating whether an agent can reason, learn, and operate without becoming the authority layer. The answer in Trinity is deliberately asymmetric: agents can propose plans, drafts, skills, and action intents; Phoenix contexts, database constraints, policy modules, approval records, and ToolRouter decide what can persist or execute.

Short version

Hermes proposes.
Jido checks content provenance.
NVIDIA reviews risk.
A human edits or approves.
ToolRouter executes.
Phoenix/Postgres records proof.
Hermes learns from the outcome.

Why normal HITL is not enough

Basic HITL usually means "show a human a draft before sending." That is insufficient for agentic systems because the danger is not only the final send. Risk enters through retrieved documents, inbound email, attachments, skill memory, tool payloads, model outputs, payment links, account scope, and cross-tenant reuse.

Normal HITL gap | Agentic risk | Trinity control
Reviews only final output | Prompt-injected email or document content can shape hidden intermediate steps | Jido-backed ContentFirewall blocks raw or unreleased content before hosted Hermes receives it
Human approves a vague action | Agent intent can drift from the approved payload | ApprovalRequest stores provider, operation, resource, amount, risk, idempotency key, and decision events
Tool calls are direct | Model/session state can execute side effects without audit | ToolRouter is the single external side-effect boundary
Model result is opaque | Safety review cannot be inspected later | ModelDecision stores structured output, route metadata, hashes, risk, and rationale summary
Memory is global | Skills learned from one tenant can leak into another | Hermes skills and profiles are scoped by project, account, or immutable global-core visibility
UI state is authority | Browser state can hide retry/failure behavior | Postgres rows, Oban jobs, RunEvents, ToolCalls, AuditEvents, and proof packets are authority
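
The "intent drift" row can be illustrated with a minimal sketch. It is written in Python for brevity and is not Trinity's Elixir code; `ApprovedAction`, `drifted_fields`, and the `amount_cents` field are hypothetical names chosen for illustration. The idea is that an execution request is compared field by field against what the human actually approved, and any mismatch is surfaced before a side effect runs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ApprovedAction:
    """Hypothetical subset of the fields an approval record pins down."""
    provider: str
    operation: str
    resource: str
    amount_cents: int
    idempotency_key: str


def drifted_fields(approved: ApprovedAction, requested: dict) -> List[str]:
    """Return the approved fields that the execution request does not match."""
    names = ("provider", "operation", "resource", "amount_cents", "idempotency_key")
    return [n for n in names if requested.get(n) != getattr(approved, n)]


approved = ApprovedAction("stripe", "create_checkout", "order:123", 4900, "key-1")
request = {"provider": "stripe", "operation": "create_checkout",
           "resource": "order:123", "amount_cents": 9900, "idempotency_key": "key-1"}
print(drifted_fields(approved, request))  # ['amount_cents']
```

An empty list means the request matches the approval; a missing field counts as drift rather than as a pass.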

Full flywheel

Production path

User intent
-> Phoenix project workspace
-> Scoped context pack
-> Hermes plan or action intent
-> Jido/content-firewall and provenance checks
-> Deterministic policy review
-> NVIDIA/Nemotron risk, safety, scoring, or QA review where useful
-> Human edit or approval
-> ToolRouter execution
-> Postgres audit, ledger, RunEvent, ToolCall, ModelDecision, and proof-packet records
-> Hermes memory or skill update
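
The production path above can be read as an ordered chain of gates in which the first denial stops the run. The following is a rough Python sketch under that reading, not the Trinity implementation (which is Elixir); `Intent`, `run_gates`, and the three example gates are hypothetical stand-ins for the stages listed above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Intent:
    account_id: str
    operation: str
    payload: dict
    trail: List[str] = field(default_factory=list)  # ordered gate outcomes


Gate = Callable[[Intent], bool]


def run_gates(intent: Intent, gates: List[Tuple[str, Gate]]) -> bool:
    """Run gates in order; stop at the first denial. A crashing gate denies."""
    for name, gate in gates:
        try:
            allowed = gate(intent) is True
        except Exception:
            allowed = False  # fail closed: an error is never a soft allow
        intent.trail.append(f"{name}:{'allow' if allowed else 'deny'}")
        if not allowed:
            return False
    return True


# Hypothetical gates standing in for firewall, policy, and approval stages.
GATES: List[Tuple[str, Gate]] = [
    ("content_firewall", lambda i: "raw_body" not in i.payload),
    ("policy", lambda i: i.operation in {"create_draft"}),
    ("human_approval", lambda i: i.payload.get("approved_by") is not None),
]
```

The `trail` list is the point of the design: every gate outcome is recorded in order, so a denied run still leaves evidence of where it stopped.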

Control plane and durable truth

Layer | Role in flywheel | Implemented posture
Phoenix/BEAM | Control plane, route protection, LiveView operator sessions, PubSub visibility, supervision tree | Implemented for public docs, account/project workspaces, CRM, inbox, approvals, ledger, status, documents, and runtime surfaces
Context modules | Business authority for revenue, CRM, agency/project state, approvals, email, Hermes, audit, and integrations | Implemented; UI and controllers call contexts rather than writing raw database state
Postgres/Ecto | Durable source of truth with UUID schemas, changesets, foreign keys, unique constraints, check constraints, and account/project scope | Implemented across approvals, CRM, orders, revenue, Gmail, documents, ToolCalls, RunEvents, ModelDecisions, skills, and proof records
Oban | Durable background orchestration and retries for runtime/provider work | Implemented for Hermes runs, Gmail polling/sending, skill promotion, and operational worker slices
LiveView/PubSub | Real-time human visibility into approvals, runs, inbox, CRM, documents, ledger, and status | Implemented UI pattern; browser observes server state and does not own authority

The BEAM matters because agent operations are concurrent, interruptible, and failure-prone. LiveViews can keep operators in the loop while workers retry, supervisors restart processes, and Postgres remains the durable authority.

Hermes role

Hermes proposes, plans, drafts, reasons over released context, and learns reusable skills. Trinity wraps that work in project/account scope. Hermes does not receive raw secrets, OAuth tokens, unreleased inbound content, or authority to approve high-risk actions.

Hermes object | Source-backed implementation | Security boundary
Project chat and context pack | Hermes.Message, Hermes.Session, Hermes.ContextPack, Hermes.ProjectContext, Hermes.ProjectTools | Context is constructed from scoped database state
Hosted runtime task | AI.HermesRuntime, AI.HermesRuntimeClient, Workers.HermesRunWorker, Tools.Hermes | Runtime submission passes content firewall and ToolRouter policy
Action intent | Hermes.ActionIntent and approval conversion paths | High-risk intent becomes an approval request before execution
Memory and skills | Hermes.AgentMemory, Hermes.MemoryStore, Hermes.Skill, SkillRegistry, SkillPromotionWorker | Project/account/global-core visibility prevents accidental global reuse
Read-only project tools | Hermes.ProjectTools and TrinitySales.capability_pack/2 | Sales capabilities describe possible actions; they do not mutate state or execute side effects
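
The project/account/global-core visibility rule for skills can be sketched as a simple filter. This is an illustrative Python sketch, not the SkillRegistry code; the `Skill` fields and the visibility strings `"project"`, `"account"`, and `"global_core"` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Skill:
    name: str
    visibility: str                  # assumed values: "project", "account", "global_core"
    account_id: Optional[str] = None
    project_id: Optional[str] = None


def visible_skills(skills: List[Skill], account_id: str, project_id: str) -> List[Skill]:
    """Return only the skills a run in (account_id, project_id) may reuse."""
    allowed = []
    for skill in skills:
        if skill.visibility == "global_core":
            allowed.append(skill)
        elif skill.visibility == "account" and skill.account_id == account_id:
            allowed.append(skill)
        elif (skill.visibility == "project"
              and skill.account_id == account_id
              and skill.project_id == project_id):
            allowed.append(skill)
        # Unknown visibility values fall through and stay hidden.
    return allowed


SKILLS = [
    Skill("summarize_thread", "global_core"),
    Skill("tone_guide", "account", account_id="acct-1"),
    Skill("lead_rubric", "project", account_id="acct-1", project_id="proj-1"),
    Skill("other_tenant_rubric", "project", account_id="acct-2", project_id="proj-9"),
]
print([s.name for s in visible_skills(SKILLS, "acct-1", "proj-1")])
# ['summarize_thread', 'tone_guide', 'lead_rubric']
```

Skills with an unrecognized visibility value are hidden by default, which keeps a data error from turning into cross-tenant reuse.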

See Hermes Integration and Vendor map.

Jido and content firewall role

Jido is intentionally narrow. It is not the business authority layer. Trinity uses it as a typed action seam for content and provenance checks before hosted Hermes receives payloads that may include references to documents, email, or attachments.

Check | Implemented module | Fail-closed behavior
Raw inbound content scan | Security.ContentFirewall.StaticScan | Blocks fields such as raw email/document/body text from runtime payloads
Jido typed action | Security.ContentFirewall.PayloadCheckAction | Missing Jido dependency returns a denial, not a soft allow
Project document provenance | Security.ContentFirewall.validate_ref/2 for project_document:<uuid> refs | Blocks missing, cross-account, metadata-only, or malformed release refs
Email message provenance | EmailMessage release status and scope checks | Blocks cross-account or malformed released email refs
Attachment provenance | EmailAttachment release status and scope checks | Blocks unreleased or metadata-only attachment text from runtime handoff
ToolRouter policy | Policies.ContentFirewallPolicy | Blocks hosted-Hermes create-plan and skill-event calls carrying unsafe payloads

This creates a human-governed content release boundary for indirect prompt injection defense: inbound content can exist in the system, but Hermes should see only metadata or explicitly released content refs until a human releases it.
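
A minimal sketch of this release boundary, in Python rather than Trinity's Elixir, might look like the following. The raw field names, the `content_refs` key, the `release_status` values, and the in-memory `documents` lookup are all assumptions made for illustration; only the `project_document:<uuid>` ref shape comes from the table above.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical names for payload fields that would carry raw inbound text.
RAW_FIELDS = {"raw_email", "raw_document", "body_text"}
REF_PATTERN = re.compile(
    r"^project_document:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


@dataclass(frozen=True)
class DocumentRecord:
    account_id: str
    release_status: str  # assumed values: "released", "quarantined", "metadata_only"


def check_payload(payload: dict, account_id: str,
                  documents: Dict[str, DocumentRecord]) -> Optional[str]:
    """Return None when the payload may be handed off, else a denial reason."""
    raw = RAW_FIELDS.intersection(payload)
    if raw:
        return f"raw content field present: {sorted(raw)[0]}"
    for ref in payload.get("content_refs", []):
        if not isinstance(ref, str) or not REF_PATTERN.match(ref):
            return "malformed ref"
        record = documents.get(ref.split(":", 1)[1])
        if record is None:
            return "missing document"
        if record.account_id != account_id:
            return "cross-account ref"
        if record.release_status != "released":
            return "document not released"
    return None


DOC_ID = "0b9f6c1e-3a52-4d7b-9c1e-5f2a8d4e7b10"
DOCS = {DOC_ID: DocumentRecord("acct-1", "released")}
print(check_payload({"content_refs": [f"project_document:{DOC_ID}"]}, "acct-1", DOCS))  # None
```

Every branch that cannot positively confirm a released, same-account ref returns a denial reason, so an unknown document or an unexpected status is blocked rather than waved through.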

See Jido, Security governance, and Documents and artifacts.

NVIDIA and Nemotron role

NVIDIA/Nemotron is used for structured review, not invisible authority. ModelRouter chooses official NVIDIA routes by default, keeps fallback aggregators disabled unless explicitly enabled, sanitizes payloads, stores hashes, and records ModelDecision evidence.

Use case | Route/proof object | Current posture
Lead scoring and classification | Tools.Nemotron, ModelRouter.record_decision/1, ModelDecision | Implemented structured decision path
Draft safety review | Email.DraftSafetyReview -> ToolRouter.execute(:nemotron, :evaluate_outbound_copy, ...) | Implemented fail-closed behavior for unavailable, invalid, or unsafe result
High-risk action review | Approval metadata can link model_decision_id | Implemented evidence query helpers; specific policy coverage varies by action type
Deliverable QA | Revenue.DeliverableQA and deliverable metadata | Implemented for review-gated acquisition snapshot deliverables
Hidden reasoning posture | ModelDecision rationale summary, hashes, structured output | Implemented; hidden reasoning is not stored
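
The fail-closed draft review and the hash-based evidence record can be sketched together. This Python sketch is an illustration under stated assumptions, not the Email.DraftSafetyReview or ModelDecision code: the `DecisionRecord` fields, the `safe`/`risk`/`summary` output shape, and the `call_model` callable are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class DecisionRecord:
    route: str
    input_hash: str
    output_hash: Optional[str]
    risk: str
    verdict: str             # "allow" or "block"
    rationale_summary: str   # short summary only; no hidden reasoning is kept


def _digest(value) -> str:
    """Hash a JSON-serializable value so the record need not store the payload."""
    encoded = json.dumps(value, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def review_draft(draft: dict, call_model: Callable[[dict], object], route: str) -> DecisionRecord:
    """Block unless the model is reachable, its output is valid, and it says safe."""
    input_hash = _digest(draft)
    try:
        result = call_model(draft)
    except Exception:
        return DecisionRecord(route, input_hash, None, "unknown", "block", "model unavailable")
    valid = (isinstance(result, dict)
             and isinstance(result.get("safe"), bool)
             and result.get("risk") in ("low", "medium", "high"))
    if not valid:
        return DecisionRecord(route, input_hash, None, "unknown", "block", "invalid model output")
    verdict = "allow" if result["safe"] else "block"
    summary = str(result.get("summary", ""))[:200]
    return DecisionRecord(route, input_hash, _digest(result), result["risk"], verdict, summary)
```

All three failure modes named in the table (unavailable, invalid, unsafe) end in `"block"`, and each still produces a record, so the denial itself is inspectable later.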

See NVIDIA, ModelRouter, and AI governance posture.

ToolRouter role

ToolRouter is the audited side-effect membrane. External effects should pass through AutonomousAgency.Tools.ToolRouter or an explicitly equivalent audited boundary. The router resolves account scope, enforces idempotency, runs policy modules, records denials, executes adapters, creates ToolCall rows, links model decisions where applicable, and appends audit/sandbox proof.

ToolRouter gate | Purpose
Scoped idempotency key | Prevents replayed duplicate external actions
Account/run scope resolution | Blocks mismatched account IDs between run, opts, and payload
Replay/demo/sandbox/runtime policies | Keeps sample, replay, and live execution modes distinct
Content firewall policy | Prevents unsafe Hermes payload handoff
Sensitive action/spend/outbound policies | Escalates or blocks high-risk side effects
Adapter allowlist | Limits external operations to configured provider adapters
ToolCall/AuditEvent proof | Makes successful, failed, and denied actions inspectable
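
Several of these gates can be shown in one small sketch. The Python below is illustrative only and is not AutonomousAgency.Tools.ToolRouter; `RouterSketch`, its in-memory `completed` map, and the choice to cache only successful calls under the idempotency key are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Policy = Callable[[dict], Optional[str]]  # returns a denial reason, or None to pass
Adapter = Callable[[dict], dict]


@dataclass
class RouterSketch:
    adapters: Dict[Tuple[str, str], Adapter]          # allowlist of (provider, operation)
    policies: List[Policy]
    tool_calls: List[dict] = field(default_factory=list)
    completed: Dict[Tuple[str, str], dict] = field(default_factory=dict)

    def execute(self, provider: str, operation: str, payload: dict,
                *, account_id: str, idempotency_key: str) -> dict:
        scoped_key = (account_id, idempotency_key)
        if scoped_key in self.completed:
            return self.completed[scoped_key]         # replay: no second side effect
        denial = self._deny_reason(provider, operation, payload, account_id)
        if denial is not None:
            call = {"status": "denied", "reason": denial}
        else:
            try:
                response = self.adapters[(provider, operation)](payload)
                call = {"status": "ok", "response": response}
                self.completed[scoped_key] = call
            except Exception as exc:
                call = {"status": "failed", "reason": str(exc)}
        call.update(provider=provider, operation=operation, account_id=account_id)
        self.tool_calls.append(call)                  # ok, failed, and denied are all recorded
        return call

    def _deny_reason(self, provider, operation, payload, account_id) -> Optional[str]:
        if payload.get("account_id", account_id) != account_id:
            return "account scope mismatch"
        if (provider, operation) not in self.adapters:
            return "adapter not allowlisted"
        for policy in self.policies:
            reason = policy(payload)
            if reason:
                return reason
        return None
```

Scoping the idempotency key by account means two tenants can reuse the same key string without colliding, while a retry inside one account returns the earlier record instead of repeating the external effect.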

See ToolRouter and Live action gates .

Human approval role

Humans are not a decorative checkbox. The approval system is durable, account-scoped, append-only at the event layer, and tied to role permissions. ApprovalRequest records the action type, provider, operation, resource, amount, currency, risk summary, idempotency key, linked run/tool/model records, and decision user. ApprovalEvent records requested, approved, rejected, or expired transitions with unique idempotency.

Approval type examples | Why it escalates
outreach_launch, followup_send | Real outbound email can affect reputation, compliance, and user trust
spend, agent_purchase, saas_provisioning | Money movement and procurement need explicit caps and human authority
credential_change, security_exception, autonomy_mode_change | Security posture changes must not be delegated to the agent
artifact_delivery | Customer-facing outputs need review, safety evidence, and delivery proof
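
The append-only event layer with unique idempotency can be sketched as follows. This is a Python illustration, not the ApprovalEvent schema; the class names and the state machine (a request must exist before it is decided, and a decided request is terminal) are assumptions, since the page lists the transitions but not the exact rules between them.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed state machine: only a pending ("requested") approval can be decided.
ALLOWED_NEXT = {None: {"requested"}, "requested": {"approved", "rejected", "expired"}}


@dataclass(frozen=True)
class ApprovalEventSketch:
    request_id: str
    event: str
    user_id: str
    idempotency_key: str


@dataclass
class ApprovalLog:
    events: List[ApprovalEventSketch] = field(default_factory=list)

    def status(self, request_id: str) -> Optional[str]:
        """Current status is derived from the event history, never stored separately."""
        history = [e.event for e in self.events if e.request_id == request_id]
        return history[-1] if history else None

    def append(self, event: ApprovalEventSketch) -> bool:
        """Append an event; return False for a duplicate idempotency key."""
        if any(e.idempotency_key == event.idempotency_key for e in self.events):
            return False
        current = self.status(event.request_id)
        if event.event not in ALLOWED_NEXT.get(current, set()):
            raise ValueError(f"illegal transition {current!r} -> {event.event!r}")
        self.events.append(event)
        return True
```

Because events are only ever appended, a double-clicked approve button or a redelivered job becomes a no-op, and an attempt to overturn a decided request fails loudly instead of rewriting history.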

See Approvals and Security governance .

Gmail, inbound quarantine, and draft gates

Gmail is account-connected and project-selected. The platform owns one Google OAuth app, each account connects its own mailbox, each project selects an approved account alias, and OAuth tokens stay behind vault refs.

Flow | Implemented controls
Inbound email and attachment ingestion | Email.ThreadIngestor, EmailMessage, EmailAttachment; content status and release scope track quarantine/release posture
Human release before runtime | Content firewall validates released refs and blocks unreleased/raw content handoff
Draft creation | EmailDraft, DraftSafetyReview, ToolRouter/Nemotron safety, approval-pending status
Gmail draft/send | Tools.GmailAdapter, Email.GmailClient, ProjectGmailSetting, SendingAlias
Live send | Requires live outreach setting, account mailbox, project alias, suppression/policy checks, approval, idempotency, and ToolRouter proof
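
The live-send row is a conjunction of preconditions, which can be sketched as a function that lists every unmet one. This Python sketch is illustrative; `SendContext` and its field names are hypothetical and do not mirror ProjectGmailSetting or SendingAlias.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class SendContext:
    live_outreach_enabled: bool
    mailbox_connected: bool
    alias_selected: bool
    recipient_suppressed: bool
    policy_passed: bool
    approval_status: str
    idempotency_key: str


def live_send_blockers(ctx: SendContext) -> List[str]:
    """Return every unmet precondition; an empty list means the send may proceed."""
    checks = [
        (ctx.live_outreach_enabled, "live outreach disabled"),
        (ctx.mailbox_connected, "account mailbox not connected"),
        (ctx.alias_selected, "no project alias selected"),
        (not ctx.recipient_suppressed, "recipient is suppressed"),
        (ctx.policy_passed, "policy check failed"),
        (ctx.approval_status == "approved", "approval missing"),
        (bool(ctx.idempotency_key), "idempotency key missing"),
    ]
    return [reason for passed, reason in checks if not passed]


ready = SendContext(True, True, True, False, True, "approved", "send-1")
print(live_send_blockers(ready))                                    # []
print(live_send_blockers(replace(ready, approval_status="requested")))  # ['approval missing']
```

Returning all blockers rather than the first one gives an operator a complete picture of why a send is held.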

See Gmail outreach and Security governance .

Proof and audit chain

Proof packets turn the flywheel into an inspectable graph. Governance.ProofPacket.for_run/1 gathers run events, state snapshots, sandbox checks, tool calls, model decisions, approvals, approval events, audit events, skills, chat messages, action intents, revenue events, orders, order intakes, deliverables, CRM records, email drafts, artifacts, and documents. Missing proof is reported as degraded; it is not fabricated.

Proof record | What it proves
RunEvent | Timeline event with sequence, mode, idempotency key, and proof hash
ToolCall | Provider, operation, request/response summary, policy result, sandbox posture, status
ModelDecision | Structured model review with route, provider, schema, risk, hashes, rationale summary
ApprovalRequest / ApprovalEvent | Human authority and decision history
AuditEvent | System action history with proof hashes and sanitized metadata
RevenueEvent / Order | Money-to-work chain for paid runs
CRM records | Company/contact/deal/activity source of truth
EmailDraft and artifacts | Review-gated outputs and customer-facing deliverables
Hermes.Skill | Learned/reused capability with scope and promotion state
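
The "missing proof is reported as degraded" behavior can be sketched in a few lines. This Python example is an assumption-laden illustration, not Governance.ProofPacket.for_run/1: the section names, the `run_id` key on each row, and the `"complete"`/`"degraded"` status strings are hypothetical, and the real packet gathers many more record types.

```python
from typing import Dict, List

# Hypothetical subset of the sections a packet is expected to contain.
EXPECTED_SECTIONS = ["run_events", "tool_calls", "model_decisions", "approvals", "audit_events"]


def build_proof_packet(run_id: str, records: Dict[str, List[dict]]) -> dict:
    """Collect rows for one run; name missing sections instead of inventing rows."""
    sections = {}
    missing = []
    for name in EXPECTED_SECTIONS:
        rows = [row for row in records.get(name, []) if row.get("run_id") == run_id]
        sections[name] = rows
        if not rows:
            missing.append(name)
    return {
        "run_id": run_id,
        "status": "degraded" if missing else "complete",
        "missing": missing,
        "sections": sections,
    }


RECORDS = {
    "run_events": [{"run_id": "run-1", "sequence": 1}],
    "tool_calls": [{"run_id": "run-1", "operation": "send"}],
    "model_decisions": [],
    "approvals": [{"run_id": "run-2", "status": "approved"}],
    "audit_events": [{"run_id": "run-1", "action": "tool_call"}],
}
packet = build_proof_packet("run-1", RECORDS)
print(packet["status"], packet["missing"])  # degraded ['model_decisions', 'approvals']
```

Rows that belong to another run are filtered out before the completeness check, so an approval for `run-2` cannot make `run-1` look fully evidenced.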

See Audit ledger and Data flow .

Example: snapshot request to proof packet

Step | System action | Human/agent boundary | Proof
Public request | SnapshotRequestController calls WebIntake.create_snapshot_request/2 | Honeypot rejects bots; owner intake must be configured | WebIntakeEvent, CRM company/contact/deal/activity
Trinity Growth | WebIntake.bootstrap_trinity_growth_project/1 creates or repairs owner project | Owner team sees the project; outside users do not sell snapshots | Project settings and project memberships
Lead review | TrinitySales.lead_context/3 and capability_pack/2 expose read-only capabilities | Hermes may propose next steps but does not mutate or send | Safe lead context, source refs
Agent proposal | Hermes drafts plan/action intent from scoped context | High-risk actions become approval requests | Hermes message/action intent, RunEvent
Outreach or reply | Draft passes deterministic copy policy and NVIDIA/Nemotron review | Owner edits or approves before send | EmailDraft, ModelDecision, ApprovalRequest
Stripe payment | Approved checkout action uses Stripe rail or configured payment link; webhook verifies paid checkout | Access is granted only after verified paid snapshot proof | CheckoutSession, Order, RevenueEvent, Deal
Buyer access | Paid snapshot buyer email can receive a magic link | Buyer finishes their own account; proof remains scoped | Access token delivery metadata, account context
Delivery | Snapshot deliverable/artifact is reviewed, QA-scored, and delivered through signed links where applicable | Human-reviewed delivery only | Deliverable, ModelDecision QA, ProjectArtifact, proof packet

This is not a hardcoded sales funnel. TrinitySales is a read-only capability contract. The live work remains available to Hermes through scoped project context, approvals, ToolRouter, Stripe, Gmail, CRM, and proof primitives.

Implemented vs planned/future posture

Status label | Meaning
Implemented | Shipped source-backed behavior with local test coverage
Implemented foundation | Core data model, context, UI, worker, or adapter path exists; live operational proof may depend on deployment/provider configuration
Policy-ready | Controls, caps, declarations, or approval seams exist, but autonomous live execution is intentionally gated
Configured where credentials exist | Works when the relevant live provider keys, OAuth connection, webhook, or alias configuration is present
Live-provider dependent | Requires deployed credentials, provider availability, and smoke evidence before claiming live production operation
Future / planned | Architecture-supported but not claimed as complete live enforcement

Area | Current state
Phoenix project/account control plane | Implemented
Public docs, llms.txt, sitemap, Markdown routes | Implemented
Project-scoped Hermes profiles, memory, skills, and action-intent records | Implemented foundation
Hosted Hermes runtime submission | Implemented foundation; live provider smoke is live-provider dependent
Jido content firewall | Implemented narrow fail-closed payload/provenance seam
Inbound email/document quarantine and release | Implemented data model and firewall checks; operational workflows continue to expand
NVIDIA ModelRouter and ModelDecision proof | Implemented with official-route preference and fail-closed safety review paths
Gmail OAuth, alias selection, draft/send adapter | Configured where credentials exist; live send remains gated by alias, policy, approval, and live outreach settings
Stripe Checkout/webhook/revenue proof | Configured where credentials exist; live payment proof depends on Stripe keys, webhook configuration, and deployed smoke
Buyer upgrade magic link after paid snapshot | Implemented foundation for verified paid snapshot orders
Agent autonomous spend and SaaS provisioning | Policy-ready and skill-declared; real spend remains approval/cap/live-config gated
Generalized enterprise use beyond acquisition | Architecture-supported; acquisition is the flagship demo surface

What this is not

  • Not a generic approval button.
  • Not a chatbot with direct tool access.
  • Not a replacement for deterministic policy.
  • Not a claim that models are authority.
  • Not a claim that every runtime is already fully sandbox-enforced.
  • Not a system that exposes raw secrets, raw emails, hidden reasoning, or unreleased content to the agent.

Why this generalizes beyond acquisition

The acquisition workflow exercises the hard parts: untrusted inbound content, CRM state, payment, outbound email, model review, human approval, provider tools, generated deliverables, skill reuse, and proof. The same flywheel can govern customer support, security triage, procurement, finance ops, research workflows, compliance review, and enterprise back-office agents because the reusable primitive is not "sales automation." It is scoped intent -> reviewed plan -> typed policy -> human authority -> audited execution -> durable memory.

Open-source extraction path

The full Trinity product is a commercial acquisition system, but the reusable primitive can be extracted as an open-source reference architecture: action intents, approval records, policy decisions, ToolRouter interfaces, content-firewall checks, proof-event schemas, and sample adapters.

Public-safety disclosure

This page is public-safe. It names modules, routes, schemas, and architecture boundaries without exposing credential values, OAuth tokens, private customer data, unreleased inbound content, raw model context, or hidden reasoning. Model and tool records store structured summaries, hashes, routes, risk labels, idempotency keys, and proof links rather than private payload dumps.

Official references

System | Use in Trinity | Official docs
Hermes | Hosted agent runtime and skills context | Hermes Agent docs
Jido | Narrow policy/firewall action seam | Jido Actions and Workflows
NVIDIA | Nemotron scoring, safety, and QA decisions | NIM LLM API reference
Stripe | Checkout, webhooks, revenue proof, guarded spend | Checkout Sessions API
Gmail | Drafts, sends, aliases, scopes, inbound replies | Gmail API scopes
Phoenix/Oban/Postgres | Control plane, durable jobs, source-of-truth data | Phoenix LiveView

Source paths

  • docs/ARCHITECTURE.md
  • docs/security/ai-governance-posture.md
  • docs/security/threat-model.md
  • lib/autonomous_agency_web/router.ex
  • lib/autonomous_agency/application.ex
  • lib/autonomous_agency/approvals.ex
  • lib/autonomous_agency/approvals/approval_request.ex
  • lib/autonomous_agency/approvals/approval_event.ex
  • lib/autonomous_agency/tools/tool_router.ex
  • lib/autonomous_agency/security/content_firewall.ex
  • lib/autonomous_agency/policies/content_firewall_policy.ex
  • lib/autonomous_agency/jido_runtime.ex
  • lib/autonomous_agency/ai/model_router.ex
  • lib/autonomous_agency/ai/model_decision.ex
  • lib/autonomous_agency/email/draft_safety_review.ex
  • lib/autonomous_agency/email/thread_ingestor.ex
  • lib/autonomous_agency/tools/gmail_adapter.ex
  • lib/autonomous_agency/web_intake.ex
  • lib/autonomous_agency/trinity_sales.ex
  • lib/autonomous_agency/revenue.ex
  • lib/autonomous_agency/revenue/stripe_autonomous_skills.ex
  • lib/autonomous_agency/governance/proof_packet.ex