Governance, Safety and Security

Complete governance and security specification for River Agents -- the autonomy model, per-tool ACL enforcement, approval gate mechanics, policy engine, defense-in-depth security layers, auditability architecture, tenant isolation, emergency controls, and compliance capabilities.

Quick Navigation

Governance Architecture Overview
4-Level Autonomy Model
Per-Tool ACL Enforcement
Approval Gate Mechanics
Policy Enforcement Engine
5-Layer Defense-in-Depth Security
Auditability and Traceability
Tenant Isolation
Emergency Controls
Compliance Capabilities

Governance Architecture Overview

Governance is enforced at every stage of the execution pipeline -- at configuration time, at trigger ingestion, at every reasoning turn, and at every tool dispatch. Four independent enforcement mechanisms run concurrently on each execution. A failure in one does not disable the others.

Mechanism	When It Runs	Enforced By	What It Prevents
Autonomy enforcement	Before every governed tool dispatch	Governance Service (Backend :8005)	Actions exceeding the configured action level
Per-tool ACL	At every tool dispatch, re-validated on each call	TLO Gateway (:8001)	Unauthorized tool invocations even from an active, properly configured agent
Policy evaluation	At every governance checkpoint	Policy Engine (Backend :8005)	Actions that pass the autonomy level check but violate an org or workspace policy
Audit writing	After every governance decision, approval event, and security enforcement	Audit Service (Backend :8005)	Governance events going unrecorded; the write happens before the decision signal is sent

Enforcement ordering: Autonomy level is checked before policy evaluation. A read_respond agent attempting a write tool never reaches the policy engine -- the autonomy block fires first. Policy evaluation only runs for tools that pass the autonomy level check.

4-Level Autonomy Model

The autonomy level is set on agent_versions.action_level and cannot be overridden at runtime. It is the primary boundary on what an agent can do without human involvement.

Level	DB Value	What the Agent Can Do	What the Agent Cannot Do	Human Role
1 -- Read and Respond	`read_respond`	Query data, analyze, summarize, generate reports, recommend visualizations	Execute any write operation, modify any external system, send any communication	Not required -- all output is informational
2 -- Recommend	`recommend`	Everything in Level 1 plus propose specific tool calls with full arguments and reasoning	Execute any proposed action -- all proposals are presented as suggestions only	Required -- user must manually execute each recommendation
3 -- Act with Approval	`act_with_approval`	Everything in Level 2 plus prepare, stage, and execute actions after human approval	Execute any gated action without explicit human sign-off	Required -- Approve, Reject, or Edit for each gated action
4 -- Fully Automated	`fully_automated`	Execute all allowed tools autonomously without waiting for approval	Execute tools outside its allowed set -- still bounded by ACL and active policies	Optional -- audit-only monitoring with anomaly alerts

Risk Assessment per Level

Level	Risk Profile	Recommended For	Example Agents
Read and Respond	Zero risk -- no system modifications possible	New agents, sensitive data domains, exploratory analytics	Data Analyst, Executive Briefing
Recommend	Low risk -- human always in the loop for execution	Early-stage agents building trust, complex decision workflows	Sales Lead Qualifier (initial phase)
Act with Approval	Medium risk -- actions execute only after human review	Most enterprise deployments; balances automation with oversight	Customer Support, Finance Reconciliation, Risk Compliance
Fully Automated	High risk -- actions execute immediately without human check	Well-tested, time-critical workflows; requires comprehensive monitoring	Operations Monitoring, SLA Remediation

Action Level Enforcement Matrix

Scenario	Read and Respond	Recommend	Act with Approval	Fully Automated
Read tool called	EXECUTE	EXECUTE	EXECUTE	EXECUTE
Write tool called	BLOCKED	SUGGESTED	GATED (approval) or EXECUTE (auto-approve)	EXECUTE
Write tool outside approval list	BLOCKED	SUGGESTED	EXECUTE	EXECUTE
Policy blocks the action	BLOCKED	BLOCKED	BLOCKED	BLOCKED
ACL check fails at TLO	BLOCKED	N/A (not dispatched)	BLOCKED	BLOCKED
Fully Automated policy absent	N/A	N/A	N/A	BLOCKED
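
The matrix above can be read as a small decision function. The sketch below is illustrative only: `decide`, its parameters, and the string outcomes are hypothetical names rather than part of the River Agents API. It leaves out the TLO ACL check and the deploy-time Fully Automated policy requirement.

```python
from enum import Enum


class ActionLevel(str, Enum):
    READ_RESPOND = "read_respond"
    RECOMMEND = "recommend"
    ACT_WITH_APPROVAL = "act_with_approval"
    FULLY_AUTOMATED = "fully_automated"


def decide(level, is_write, in_approval_list=True,
           auto_approved=False, policy_blocks=False):
    """Return the outcome for one tool call (hypothetical helper)."""
    # The autonomy check runs first: a read_respond agent calling a write
    # tool is blocked before any policy is evaluated.
    if is_write and level is ActionLevel.READ_RESPOND:
        return "BLOCKED"
    # Policy evaluation runs only for calls that passed the autonomy check.
    if policy_blocks:
        return "BLOCKED"
    if not is_write:
        return "EXECUTE"
    if level is ActionLevel.RECOMMEND:
        return "SUGGESTED"
    if level is ActionLevel.ACT_WITH_APPROVAL:
        gated = in_approval_list and not auto_approved
        return "GATED" if gated else "EXECUTE"
    return "EXECUTE"  # fully_automated
```

For example, `decide(ActionLevel.ACT_WITH_APPROVAL, is_write=True)` yields `"GATED"`, while the same call with `in_approval_list=False` yields `"EXECUTE"`.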

Fully Automated policy requirement: An agent cannot be deployed at fully_automated action level unless an active governance_policies record with enforcement_action = 'allow_full_automation' is bound to the agent version. This prevents accidentally deploying an ungated agent by configuration error.
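
A minimal sketch of this deploy-time check follows. It assumes policies are available as dictionaries; the function name and the `is_active` flag are hypothetical stand-ins for however an "active" record is represented in `governance_policies`.

```python
def passes_full_automation_check(action_level, bound_policies):
    """Return True if this specific deploy-time rule is satisfied.

    Other deploy prerequisites are out of scope for this sketch.
    """
    if action_level != "fully_automated":
        return True  # the rule applies only to fully_automated agents
    return any(
        policy.get("is_active") is True  # hypothetical field name
        and policy.get("enforcement_action") == "allow_full_automation"
        for policy in bound_policies
    )
```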

Action Level Progression

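Agents are designed to earn trust incrementally. A typical enterprise progression starts an agent at Read and Respond and moves it up one level at a time as it demonstrates reliability.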

Demotion happens automatically when health metrics degrade past configured thresholds. Promotion requires a deliberate reconfiguration and new version deploy.

Per-Tool ACL Enforcement

TLO Gateway re-validates the calling agent's permission to invoke each specific tool at the time of every dispatch. Permissions are not cached from session start. If an admin revokes a tool permission from an agent version while a run is in progress, the next tool call of that type within the same run is blocked at that point.
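
The effect of per-dispatch re-validation can be illustrated with an in-memory stand-in. `AclStore` and `dispatch` are hypothetical names for this sketch; the real gateway's storage and interfaces are not described here.

```python
class AclStore:
    """In-memory stand-in for the ACL source of truth (hypothetical)."""

    def __init__(self):
        self._allowed = {}

    def grant(self, agent_version_id, tool):
        self._allowed.setdefault(agent_version_id, set()).add(tool)

    def revoke(self, agent_version_id, tool):
        self._allowed.get(agent_version_id, set()).discard(tool)

    def is_allowed(self, agent_version_id, tool):
        return tool in self._allowed.get(agent_version_id, set())


def dispatch(store, agent_version_id, tool):
    # The permission is looked up on every call; nothing is cached
    # from the start of the run.
    if not store.is_allowed(agent_version_id, tool):
        return "BLOCKED"
    return "DISPATCHED"


store = AclStore()
store.grant(42, "write_back")
first = dispatch(store, 42, "write_back")   # permitted at this point
store.revoke(42, "write_back")              # admin revokes mid-run
second = dispatch(store, 42, "write_back")  # next call of that type is blocked
```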

ACL Permission Namespace

All River Agent API operations use the agent: permission namespace. These are distinct from platform-level IAM permissions.

Permission	Grants
`agent:read`	View agent configuration, version history, and metrics
`agent:create`	Create new agents and agent versions
`agent:update`	Edit agent configuration (creates a new version)
`agent:delete`	Soft-delete (archive) an agent
`agent:deploy`	Transition an agent to `active` status
`agent:execute`	Trigger an agent execution manually
`agent:approve`	Resolve approval requests
`agent:audit`	View audit logs and execution logs
`agent:monitor`	View monitoring metrics and system health

Tool Execution Permission Matrix

Every tool that the agent can call at runtime maps to a required platform permission. TLO validates this permission against the user whose identity was injected at trigger ingestion.

Tool	Target Service	Required Permission
`execute_query`	Data Orchestration :8002	`data_source:query`
`discover_schema`	Backend :8005	`data_source:view`
`create_data_source`	Backend :8005	`data_source:create`
`update_data_source`	Backend :8005	`data_source:update`
`delete_data_source`	Backend :8005	`data_source:delete`
`test_connection`	Backend :8005	`data_source:view`
`write_back`	Data Orchestration :8002	`data_source:update` + confirmation
`apply_governance_policy`	Backend :8005	`policy:create`
`get_workspace_info`	Backend :8005	`workspace:view`
`get_storage_info`	Storage Service :8003	`storage:view`
Custom tools	Customer-configured service	Configured per tool in `agent_version_tools`

Double-check principle: TLO performs independent validation -- it does not trust the agent's reasoning or the Runner's assertions. Even if an agent's instruction_set states that it has CRM access, TLO rejects the call if the injected user identity does not carry the required permission. The agent's configuration and the runtime ACL check are independent layers.
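
As a rough illustration, the built-in rows of the permission matrix can be expressed as a lookup that is consulted independently of anything the agent claims about itself. The function name is hypothetical, and failing closed on unknown tools is an assumption of this sketch; custom tools would be resolved from `agent_version_tools` instead.

```python
REQUIRED_PERMISSION = {
    "execute_query": "data_source:query",
    "discover_schema": "data_source:view",
    "create_data_source": "data_source:create",
    "update_data_source": "data_source:update",
    "delete_data_source": "data_source:delete",
    "test_connection": "data_source:view",
    "write_back": "data_source:update",  # also requires confirmation
    "apply_governance_policy": "policy:create",
    "get_workspace_info": "workspace:view",
    "get_storage_info": "storage:view",
}


def tool_call_allowed(tool, user_permissions):
    """Check the injected user's permissions for one tool call."""
    required = REQUIRED_PERMISSION.get(tool)
    # Assumption: a tool with no known mapping is rejected.
    return required is not None and required in user_permissions
```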

Two-Tier Role System

River Agents RBAC operates on two scoping tiers. Organization-level roles govern org-wide operations; workspace-level roles govern agent operations within a workspace.

Organization-level roles:

Role	Scope	Capabilities
Org Admin	Organization-wide	Manage members, workspaces, billing, emergency controls
Org Editor	Organization-wide	Create and configure workspaces; cannot manage billing or delete org
Org Viewer	Organization-wide	View organization structure and workspace list

Workspace-level roles:

Role	Scope	Capabilities
Workspace Admin	Workspace	Full agent lifecycle including deploy, delete, settings, kill switch
Workspace Editor	Workspace	Create, configure, and approve agent actions; cannot deploy
Workspace Analyst	Workspace	Run agents manually; view outputs and metrics; cannot configure
Workspace Viewer	Workspace	Read-only access to agent definitions, executions, and audit logs
Workspace Auditor	Workspace	Full access to audit logs and execution traces; no write capabilities

Role-to-Permission Mapping

Role	`read`	`create`	`update`	`delete`	`deploy`	`execute`	`approve`	`audit`	`monitor`
Workspace Admin	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
Workspace Editor	Yes	Yes	Yes	No	No	No	Yes	No	Yes
Workspace Analyst	Yes	No	No	No	No	Yes	No	No	Yes
Workspace Viewer	Yes	No	No	No	No	No	No	No	Yes
Workspace Auditor	Yes	No	No	No	No	No	No	Yes	Yes
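
The mapping above translates directly into a lookup table. The sketch below mirrors the table row for row; `has_permission` is a hypothetical helper name.

```python
ROLE_PERMISSIONS = {
    "Workspace Admin": {"read", "create", "update", "delete", "deploy",
                        "execute", "approve", "audit", "monitor"},
    "Workspace Editor": {"read", "create", "update", "approve", "monitor"},
    "Workspace Analyst": {"read", "execute", "monitor"},
    "Workspace Viewer": {"read", "monitor"},
    "Workspace Auditor": {"read", "audit", "monitor"},
}


def has_permission(role, permission):
    """Return True if `role` carries an `agent:` permission such as 'agent:deploy'."""
    namespace, _, action = permission.partition(":")
    if namespace != "agent":
        return False  # only the agent: namespace is covered by this table
    return action in ROLE_PERMISSIONS.get(role, set())
```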

TLO ACL Enforcement Sequence

Re-validation on every call: The permission check runs on every tool dispatch, not once per execution. The JWT passed by the Agent Runner is the agent's service-to-service token bound to the specific agent_version_id. If the version's ACL is updated mid-run, the check reflects the updated state on the next dispatch within the same run.

Approval Gate Mechanics

When an action is gated, execution pauses without consuming compute. The mechanics rely on Temporal's workflow.await() for zero-cost hibernation. This section documents the governance-specific aspects: request creation, the three resolution paths, approval content, expiration, and notification routing.

ApprovalRequest State Transitions

Approval Gate Lifecycle

Approval Request Content

The approval review screen presents the following context to the approver:

Field	Description	Source
Agent Name	Which agent is requesting approval	Agent configuration
Run ID	Execution identifier for the audit trail	`agent_executions.id`
Proposed Action	Tool name and full argument payload	Pending tool call
Reasoning Summary	Why the agent wants to take this action	LLM-generated explanation from the preceding reasoning turn
Risk Context	Policy evaluations, warnings, and flags	Governance Service output at time of gate
Historical Precedent	Similar past actions and their outcomes	Query of `approval_requests` filtered by same agent and tool
Confidence Score	Agent self-assessed confidence (where applicable)	LLM metadata included in tool call arguments

Approval Expiration

Approval requests have a configurable expiration timeout (default: 24 hours). Expiration is configured per workspace in approval_config.expiration_hours. When a request expires:

approval_requests.status transitions to expired
The Temporal workflow receives an expiry signal
The run terminates with execution_status = approval_expired
A notification is sent to the workspace: "Approval request expired for [agent name]"
Audit event tool.approval_expired is written with forced: false

The expiration timeout can also be configured with an escalation path: if the primary approver does not respond within escalation_timeout_minutes, a second notification is dispatched to escalation_channel before the full expiration fires.
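
The timing described above can be sketched as follows. The function name is hypothetical, and the rule that an escalation is ignored unless it falls before the expiration time is an assumption of this example.

```python
from datetime import datetime, timedelta, timezone


def approval_deadlines(created_at, expiration_hours=24,
                       escalation_timeout_minutes=None):
    """Return (escalate_at, expires_at) for an approval request."""
    expires_at = created_at + timedelta(hours=expiration_hours)
    escalate_at = None
    if escalation_timeout_minutes is not None:
        candidate = created_at + timedelta(minutes=escalation_timeout_minutes)
        # Assumption: escalation only applies if it fires before expiration.
        if candidate < expires_at:
            escalate_at = candidate
    return escalate_at, expires_at


created = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
escalate_at, expires_at = approval_deadlines(
    created, escalation_timeout_minutes=120)
```

With the default 24-hour expiration and a 120-minute escalation timeout, the escalation notification falls two hours after creation and the request expires one day after creation.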

Auto-Approve Evaluation

Auto-approve conditions are evaluated before the approval gate fires. If all conditions for a given tool are satisfied, the ApprovalRequest is created with status = AUTO_APPROVED (not pending) and immediately resolved. The execution does not pause.

{
  "auto_approve_conditions": [
    {
      "tool": "draft_response",
      "condition": "confidence_score > 0.95 AND response_length < 500"
    },
    {
      "tool": "create_ticket",
      "condition": "severity != 'critical'"
    }
  ]
}

The confidence_score field must be explicitly included in the tool call arguments by the reasoning engine. If absent, auto-approve evaluation fails safe and the gate fires as a standard pending request. AUTO_APPROVED is treated identically to approved for audit purposes -- a record is created and an audit event is emitted.
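
The fail-safe behavior can be illustrated with a deliberately small evaluator. It is not the platform's condition parser: it assumes conditions are `AND`-joined clauses of the form `field operator literal`, as in the configuration shown earlier, and treats anything it cannot evaluate as "do not auto-approve".

```python
import operator
import re

_OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
        "<=": operator.le, "!=": operator.ne, "=": operator.eq}
_CLAUSE = re.compile(r"^\s*(\w+)\s*(>=|<=|!=|>|<|=)\s*(.+?)\s*$")


def _literal(text):
    """Parse a single-quoted string or a number."""
    if len(text) >= 2 and text.startswith("'") and text.endswith("'"):
        return text[1:-1]
    return float(text)


def auto_approves(condition, arguments):
    """Return True only if every clause is satisfied by the tool arguments."""
    for clause in condition.split(" AND "):
        match = _CLAUSE.match(clause)
        if match is None:
            return False  # unparseable clause: fail safe
        field, op, raw = match.groups()
        if field not in arguments:
            return False  # e.g. confidence_score absent: gate fires
        try:
            if not _OPS[op](arguments[field], _literal(raw)):
                return False
        except (TypeError, ValueError):
            return False  # incomparable types or bad literal: fail safe
    return True
```

A `draft_response` call with `{"confidence_score": 0.97, "response_length": 120}` satisfies the first condition; the same call without `confidence_score` does not.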

Approval Notification Routing

Notifications are dispatched by the Approval Notifier after the ApprovalRequest record is persisted. At least one notification channel must be configured for the workspace before an act_with_approval agent can be deployed.

Notification Channel	Trigger	Content
In-app (platform UI)	Whenever the approver has an active session	Approval card with tool name, arguments, reasoning summary, and risk context
Slack	If workspace has Slack connected and channel is configured	Message to configured channel with approve and reject deep links
Email	If workspace has email notifications enabled	Email to `assigned_approver_id` or all users with `agent:approve` in the workspace
PagerDuty	If `approval_config` specifies `pagerduty: true`	Alert with severity derived from the tool's risk classification

Policy Enforcement Engine

The Policy Engine evaluates governance_policies records bound to an agent version on every governance checkpoint. Policy evaluation runs after the autonomy level check passes and can override the action level by promoting a tool call to a gate or blocking it entirely.

Policy Types

Policy Type	Scope	Example	Enforcement
Data Export Limits	Organization	"No exports > 10,000 rows from PII tables"	Block
Time Restrictions	Workspace	"No production writes after 6 PM local time"	Block
Tool Restrictions	Agent	"This agent may not use `delete_data_source`"	Block
Rate Limits	Agent	"Max 100 API calls per hour"	Block + Alert
Content Policies	Organization	"No customer PII in outbound Slack messages"	Gate (require approval)
Budget Policies	Workspace	"Max $50 in LLM token cost per run"	Alert + Log

Policy Condition Language

Policies use a structured expression language evaluated against the execution context at the time of each governance checkpoint.

Condition syntax:

WHEN <condition_expression>
THEN <enforcement_action>
[WITH <options>]

Available context variables:

Variable	Type	Description
`event.type`	string	Trigger type that started the execution (`manual`, `scheduled`, etc.)
`tool.name`	string	Name of the tool being evaluated
`tool.arguments`	object	Arguments as proposed by the LLM
`data.classification`	string	Classification of the data source (`public`, `internal`, `confidential`, `pii`, `phi`, `pci`)
`execution.turn_count`	int	Current turn number within the execution
`execution.tokens_consumed`	int	Cumulative token count for this execution
`time.hour`	int	Current UTC hour (0-23)
`time.day_of_week`	int	Current day (0 = Sunday, 6 = Saturday)
`user.role`	string	Role of the user who triggered the execution (manual triggers only)
`agent.consecutive_failures`	int	Number of consecutive failed executions for this agent
`cost.tokens`	int	Token count for the current turn

Example policies:

-- Prevent PII exports larger than 10K rows
WHEN tool.name = "execute_query" AND tool.arguments.row_limit > 10000 AND data.classification = "pii"
THEN block
WITH message = "PII exports exceeding 10,000 rows require a compliance review."

-- Gate the listed write tools outside 09:00-16:59 UTC (time.hour is evaluated in UTC)
WHEN tool.name IN ["update_ledger_status", "revoke_user_access"] AND time.hour NOT IN [9, 10, 11, 12, 13, 14, 15, 16]
THEN gate
WITH approver_role = "admin"

-- Alert on high token consumption without blocking
WHEN execution.tokens_consumed > 100000
THEN alert
WITH channel = "slack:#ops-oncall"

-- Auto-pause agent after repeated consecutive failures
WHEN agent.consecutive_failures >= 3
THEN block
WITH message = "Agent paused due to repeated consecutive failures. Review execution logs before re-enabling."

Enforcement Actions

Action	Behavior	Execution Continues?
`block`	Tool call is rejected; observation injected: "Policy blocked action: `{policy_name}`"	Yes -- LLM can adapt
`gate`	Approval gate fires regardless of `action_level` setting	Yes -- after approval
`log`	Tool call proceeds; additional audit event is emitted with policy name	Yes
`alert`	Tool call proceeds; notification sent to configured channel	Yes

Policy conflict resolution: When multiple policies apply to the same tool call, the most restrictive enforcement action wins. Priority order: block > gate > alert > log. If two block policies apply, both are logged in the audit trail but the execution receives a single blocked observation.
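
The priority order can be sketched as a ranking over the matched policies. `resolve` is a hypothetical helper; it returns the winning action together with every policy that requested it, so that multiple `block` policies can all be logged. Emergency policies, which are evaluated first, are outside the scope of this sketch.

```python
PRIORITY = {"block": 3, "gate": 2, "alert": 1, "log": 0}


def resolve(matched):
    """matched: list of (policy_name, enforcement_action) pairs."""
    if not matched:
        return None, []
    # The most restrictive action wins: block > gate > alert > log.
    winning_action = max((action for _, action in matched),
                         key=PRIORITY.__getitem__)
    names = [name for name, action in matched if action == winning_action]
    return winning_action, names
```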

Policy Binding

Policies are bound to agent versions via agent_versions.policy_ids (a JSONB array of governance_policies.id values). Org-wide policies (those with workspace_id = NULL) are automatically evaluated for all agent versions in the org regardless of policy_ids. Workspace-specific policies must be explicitly added to policy_ids to apply.
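
A sketch of how the effective policy set might be assembled from these rules follows. Policies are modeled as dictionaries, and the presence of an `org_id` field on each policy record is an assumption of this example.

```python
def effective_policies(all_policies, org_id, policy_ids):
    """Return the policies that apply to one agent version."""
    bound = set(policy_ids)
    return [
        policy for policy in all_policies
        if policy["org_id"] == org_id  # assumed column; never cross orgs
        and (
            policy["workspace_id"] is None  # org-wide: always evaluated
            or policy["id"] in bound        # workspace policy: must be bound
        )
    ]
```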

5-Layer Defense-in-Depth Security

Five security layers are independently enforced. A bypass of one layer does not compromise the others. Failure at any layer terminates the request path with the appropriate error response and an audit event.

Layer 1: JWT Authentication and Claims Validation

Enforced by: TLO Gateway on every inbound request.

Every request to the River Agents API must carry a valid JWT in the Authorization: Bearer header. TLO Gateway validates the token signature, expiry, and required claims (org_id, workspace_id, user_id or agent_version_id for service-to-service calls). Requests with absent, expired, or malformed tokens receive 401 Unauthorized before reaching any backend service.

The agent runtime receives a sanitized user_context object -- never raw JWT tokens, database connection strings, or service credentials. TLO is the only component that holds the raw token; it strips it before forwarding.

Audit event: security.auth_failed -- emitted on every invalid token attempt with the token issuer and requested endpoint.
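
The claims portion of this check can be sketched as below. Signature verification is assumed to have already succeeded and is not shown; `check_claims` is a hypothetical name, and `exp` is the standard JWT expiry claim expressed in seconds since the epoch.

```python
import time

REQUIRED_CLAIMS = ("org_id", "workspace_id")


def check_claims(claims, now=None):
    """Return 200 if the decoded claims are acceptable, else 401."""
    if not isinstance(claims, dict):
        return 401  # malformed token payload
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return 401  # absent or expired
    if any(name not in claims for name in REQUIRED_CLAIMS):
        return 401
    # A user token carries user_id; a service-to-service token
    # carries agent_version_id.
    if "user_id" not in claims and "agent_version_id" not in claims:
        return 401
    return 200
```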

Layer 2: RBAC Role and Permission Check

Enforced by: TLO Gateway after token validation.

The user's role in the target workspace is resolved from the validated claims. The required agent: permission for the requested API endpoint is checked against the role-to-permission mapping. A valid token for a Workspace Analyst calling PATCH /api/v1/agents/{id} (requires agent:update) receives 403 Forbidden.

Audit event: security.permission_denied -- emitted with {user_id, role, required_permission, endpoint}.

Layer 3: Per-Tool ACL Validation

Enforced by: TLO Gateway at every tool dispatch from the Agent Runner.

Described in full in Per-Tool ACL Enforcement. This layer is specific to the service-to-service path (Agent Runner to TLO Gateway). It re-validates on every dispatch, not once per run. The token used in this path is the agent's service JWT bound to the specific agent_version_id, not the triggering user's session token.

Audit event: security.permission_denied with actor_type = 'agent' emitted on failure.

Layer 4: Pydantic Schema Validation and Prompt Injection Detection

Enforced by: river-agent microservice (:8007) on all inputs received from Backend.

The AgentContext bundle received by river-agent per turn is validated against a Pydantic schema before any LLM call is made. Invalid or unexpected fields are rejected with a structured error response -- the LLM is never invoked with malformed context.

trigger_payload content (which originates from external systems -- webhooks, API calls, cron payloads) is screened for prompt injection patterns before injection into the system prompt. Detected injection attempts are sanitized and flagged in agent_logs with log_type = 'error' and a status = 'injection_detected' annotation.

Prompt injection detection: The screening checks for instruction override phrases ("ignore previous instructions"), role-confusion sequences, and system prompt extraction attempts. Detection does not terminate the run by default -- it sanitizes the payload and logs the attempt. Whether to terminate is a policy-level decision configured per workspace.

Audit event: security.prompt_injection_detected -- emitted with {execution_id, trigger_type, sanitized_payload_hash}.
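
The screening step can be sketched with the standard library alone. The patterns below are illustrative examples of the three categories named above, not the platform's actual rule set, and `screen_payload` is a hypothetical name. The schema validation step is not shown.

```python
import hashlib
import re

# Hypothetical example patterns, one per category described above.
PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
    re.compile(r"(reveal|print|show) (your|the) system prompt", re.IGNORECASE),
]


def screen_payload(text):
    """Return (sanitized_text, detected, sha256_of_sanitized_text)."""
    detected = False
    sanitized = text
    for pattern in PATTERNS:
        sanitized, count = pattern.subn("[removed]", sanitized)
        detected = detected or count > 0
    digest = hashlib.sha256(sanitized.encode("utf-8")).hexdigest()
    return sanitized, detected, digest
```

The run continues with the sanitized text; the `detected` flag and the hash are what a caller would log and include in the audit event.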

Layer 5: HITL Approval Gate

Enforced by: Temporal workflow (AgentRunWorkflow) for act_with_approval agents.

The final safety layer is the human decision point. Even if Layers 1-4 all pass, a tool call that requires approval cannot execute until a human explicitly approves it. The approval gate serializes execution state and hibernates the Temporal workflow at zero compute cost until the resolution signal arrives.

This layer is bypassed at fully_automated action level -- which is why the fully_automated governance policy requirement acts as a compensating control. An ungated fully_automated agent cannot be deployed without an explicit policy attestation.

Audit event: tool.approval_requested when the gate fires, followed by exactly one of tool.approved, tool.rejected, or tool.approval_expired when it resolves.

Security Layer Summary

Auditability and Traceability

Trust is built through transparency. The River Agents platform provides three levels of audit coverage: turn-level execution traces, data lineage records, and governance decision logs. Turn-level traces and data lineage are persisted in agent_logs. Governance decisions are written to the audit_logs table, which is write-once and immutable at the application layer.

Turn-Level Trace

Every reasoning turn and tool call is persisted in agent_logs as the execution runs:

Captured Field	Description
Reasoning text	The LLM's chain-of-thought for each decision
Tool call + arguments	Exactly what action was requested
Tool result	Exactly what was returned
Model used	Which LLM model and tier handled this turn
Token count	Input and output tokens for cost attribution
Latency	Duration of each turn in milliseconds

Data Lineage

All data sources accessed during a run are recorded in the execution log. This enables the metric traceability feature: hovering over any number in an agent's output shows the exact query and raw data that produced it.

Data lineage captures:

Which databases and APIs were queried
The exact SQL or API call executed (stored in agent_logs.content)
A summary of results returned and the row count
How the result influenced the agent's subsequent reasoning (referenced by turn number)

Implementation note: Data lineage records are written immediately after each execute_query tool response, before the result is injected into the LLM context for the next turn. This ensures that even if a run fails mid-execution, the lineage record for the completed tool call is not lost.

Governance Decision Log

Every policy evaluation is recorded in audit_logs, regardless of outcome:

Which policies were evaluated and in what order
Whether each check passed or was triggered
If triggered: the violation details and the enforcement action taken
Approval decisions: who approved or rejected, at what time, and with what comment or modified arguments
Auto-approve decisions: which condition matched and the parameter values that satisfied it

Audit Event Taxonomy

Every governance decision, security enforcement, and approval gate transition is written to audit_logs before the decision is acted upon.

Agent Lifecycle Events:

Event Type	Actor Type	Required Payload Fields
`agent.created`	human	`agent_id`, `name`, `business_function`, `workspace_id`
`agent.deployed`	human	`agent_id`, `version_number`, `action_level`
`agent.paused`	human or system	`agent_id`, `previous_status`, `reason`
`agent.resumed`	human	`agent_id`
`agent.archived`	human	`agent_id`
`agent.health_changed`	system	`agent_id`, `previous_health`, `new_health`, `consecutive_failures`

Execution Events:

Event Type	Actor Type	Required Payload Fields
`execution.started`	agent	`execution_id`, `agent_version_id`, `trigger_type`
`execution.completed`	agent	`execution_id`, `status`, `turn_count`, `tokens_consumed`, `duration_ms`
`execution.failed`	agent	`execution_id`, `error_code`, `error_message`
`execution.cancelled`	human or system	`execution_id`, `cancelled_by`, `reason`

Tool and Governance Events:

Event Type	Actor Type	Required Payload Fields
`tool.called`	agent	`execution_id`, `turn_number`, `tool_name`, `governance_decision`
`tool.blocked`	system	`execution_id`, `turn_number`, `tool_name`, `block_reason` (`autonomy_level` or `policy_name`)
`tool.suggested`	system	`execution_id`, `turn_number`, `tool_name` (Recommend level only)
`tool.approval_requested`	system	`execution_id`, `approval_request_id`, `tool_name`, `tool_arguments`
`tool.auto_approved`	system	`execution_id`, `approval_request_id`, `tool_name`, `auto_approve_condition`
`tool.approved`	human	`execution_id`, `approval_request_id`, `resolved_by`, `resolution_note`
`tool.rejected`	human	`execution_id`, `approval_request_id`, `resolved_by`, `reason`
`tool.approval_expired`	system	`execution_id`, `approval_request_id`, `expires_at`

Policy Events:

Event Type	Actor Type	Required Payload Fields
`policy.violation`	system	`execution_id`, `policy_id`, `policy_name`, `tool_name`, `enforcement_action`
`policy.created`	human	`policy_id`, `name`, `enforcement_action`, `scope`
`policy.updated`	human	`policy_id`, `previous_version`, `new_version`
`policy.deactivated`	human	`policy_id`, `deactivated_by`

Security Events:

Event Type	Actor Type	Required Payload Fields
`security.auth_failed`	system	`endpoint`, `token_issuer`, `failure_reason`
`security.permission_denied`	system	`user_id` or `agent_version_id`, `required_permission`, `endpoint` or `tool_name`
`security.prompt_injection_detected`	system	`execution_id`, `trigger_type`, `sanitized_payload_hash`
`security.cross_tenant_access_attempt`	system	`requesting_org_id`, `target_org_id`, `endpoint`

Audit write ordering: Every audit event is written to audit_logs before the corresponding action or signal is dispatched. For approval gate events, tool.approval_requested is written before the Temporal workflow pauses. For governance blocks, tool.blocked is written before the blocked observation is returned to the reasoning engine.
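
The ordering guarantee amounts to a write-then-act wrapper. The sketch below uses an in-memory timeline in place of `audit_logs`; all function names are hypothetical.

```python
timeline = []


def write_audit(event_type, payload):
    # Stand-in for the insert into audit_logs.
    timeline.append(("audit", event_type))


def return_blocked_observation():
    # Stand-in for returning the observation to the reasoning engine.
    timeline.append(("action", "blocked_observation"))
    return "Policy blocked action"


def audited(event_type, payload, action):
    """Write the audit event first; run the action only if the write succeeded."""
    write_audit(event_type, payload)  # if this raises, action never runs
    return action()


observation = audited("tool.blocked", {"tool_name": "write_back"},
                      return_blocked_observation)
```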

Tenant Isolation

Tenant isolation is enforced at four independent points in the request path. All four must be operational simultaneously for full isolation.

RLS policy definition (representative):

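-- Applied to agents table.
-- current_setting(..., true) returns NULL instead of raising when app.org_id
-- is unset, and NULLIF maps an empty value to NULL, so the comparison
-- matches no rows.
CREATE POLICY agents_tenant_isolation ON river_agents.agents
  USING (org_id = NULLIF(current_setting('app.org_id', true), '')::INT);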

If the Backend service fails to execute SET LOCAL app.org_id before issuing a query (for example due to a connection pool error), the RLS policy causes the query to return zero rows rather than all rows. This fail-closed behavior is intentional.

Cross-tenant detection: Any query returning rows with a mismatched org_id triggers security.cross_tenant_access_attempt in audit_logs and an alert to the security channel. This condition is architecturally prevented by the three upstream layers but is explicitly monitored as a defense-in-depth signal.
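
A minimal sketch of this monitoring check, assuming result rows are dictionaries carrying an `org_id` key. The function name is hypothetical; the event payload fields follow the `security.cross_tenant_access_attempt` entry in the audit event taxonomy.

```python
def cross_tenant_events(rows, requesting_org_id, endpoint):
    """Build one audit event per row whose org_id does not match the requester."""
    return [
        {
            "event_type": "security.cross_tenant_access_attempt",
            "requesting_org_id": requesting_org_id,
            "target_org_id": row.get("org_id"),
            "endpoint": endpoint,
        }
        for row in rows
        if row.get("org_id") != requesting_org_id
    ]
```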

Cross-Tenant Protection by Layer

Layer	Protection Mechanism
API	TLO extracts `org_id` from JWT; all requests are scoped to that org before reaching Backend
Database	All queries are filtered by `org_id`; RLS prevents cross-org JOINs from returning rows
Storage	MinIO bucket paths include `org_id` prefix; access tokens are scoped per org
Vector DB	Qdrant collections namespaced by `org_id`; no cross-namespace query possible
Temporal	Workflow IDs prefixed with `org_id`; task queues isolated per organization
WebSocket	Subscription channels filtered by `org_id`; events are not broadcast cross-tenant

Agent Cross-Boundary Prevention

An agent instance cannot cross its organization boundary at any point during execution:

Can only access data sources belonging to its organization
Can only be triggered by users in its organization
Can only be approved by users in its workspace
Cannot reference or call tools in other organizations
Cannot access audit logs from other organizations
Cannot send notifications to channels outside the workspace's configured notification targets

Attempts to cross these boundaries are blocked at Layer 3 (ACL) or Layer 1 (JWT claims mismatch), with security.cross_tenant_access_attempt logged in both cases.

Emergency Controls

The following emergency actions are available to Org Admins and Workspace Admins; the required role for each is listed in the Emergency Control Reference. All actions emit audit events before executing.

Immediate Execution Stop

Any running agent execution can be immediately stopped:

UI: "Stop" button on the Run Detail page
API: POST /api/v1/agents/runs/{id}/stop
Effect: Temporal workflow is cancelled, the current tool call is aborted, and execution_status transitions to stopped
Audit event: execution.cancelled with reason: emergency_stop

Auto-Pause Triggers

The system automatically pauses an agent (transitions agent_status to paused) when:

Trigger Condition	Threshold	Action
Consecutive failures	3+ consecutive failed executions	Agent paused; `agent.health_changed` event with `new_health: critical`
Token budget exceeded	Per-run or per-day token limit hit	Execution terminated; agent flagged for review
Rate limit breach	Agent triggering more frequently than the configured `max_executions_per_hour`	Trigger queue drained; agent paused
Critical policy violation	Agent attempts a policy-blocked action 3+ times in a single execution	Execution cancelled; agent paused; alert sent

Auto-paused agents require manual re-activation by a Workspace Admin. The reason for auto-pause is surfaced on the agent detail page and included in the agent.paused audit event.
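
The pause conditions in the table can be sketched as a single check that returns the reasons that apply. The function and parameter names are hypothetical. The token budget trigger is omitted because the table describes it as terminating the execution and flagging the agent for review rather than pausing it.

```python
def auto_pause_reasons(consecutive_failures, executions_last_hour,
                       max_executions_per_hour, blocked_attempts_in_run):
    """Return the list of auto-pause triggers that currently apply."""
    reasons = []
    if consecutive_failures >= 3:
        reasons.append("consecutive_failures")
    if executions_last_hour > max_executions_per_hour:
        reasons.append("rate_limit_breach")
    if blocked_attempts_in_run >= 3:
        reasons.append("critical_policy_violation")
    return reasons
```

An empty list means the agent stays active; any non-empty result would accompany the `agent.paused` audit event as its reason.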

Workspace Kill Switch

Workspace Admins can globally pause all agents in a workspace:

API: POST /api/v1/workspaces/{id}/agents/pause-all
Effect: All active agents in the workspace transition to paused; queued executions are cancelled; in-flight executions run to end of current turn then pause
Use cases: security incidents, platform maintenance windows, compliance holds
Audit event: governance.emergency_pause with the acting user_id and timestamp

Emergency Control Reference

Control	API Endpoint	Requires Role	Effect	Audit Event
Pause all agents	`POST /api/v1/governance/emergency/pause-all`	Org Admin	Pauses all `active` agents across the org	`governance.emergency_pause`
Kill running execution	`POST /api/v1/agents/runs/{id}/stop`	Workspace Admin	Cancels Temporal workflow; run terminates with `stopped`	`execution.cancelled`
Revoke agent deployment	`PATCH /api/v1/agents/{id}` `{status: "paused"}`	Workspace Admin	Agent pauses after current turn completes	`agent.paused`
Emergency policy override	`POST /api/v1/governance/emergency/policy`	Org Admin	Applies a temporary org-wide policy superseding all agent-level policies	`policy.created` with `scope: emergency`
Force-expire approval	`POST /api/v1/approvals/{id}/expire`	Workspace Admin	Transitions `pending` approval to `expired`; run terminates	`tool.approval_expired` with `forced: true`
Workspace kill switch	`POST /api/v1/workspaces/{id}/agents/pause-all`	Workspace Admin	Pauses all active agents in the workspace	`governance.emergency_pause`

Emergency policy override constraints: Emergency policies have a mandatory expires_at field (maximum 72 hours). They are evaluated first in policy conflict resolution, overriding all other policies including org-wide ones. They must use enforcement_action = 'block' or 'gate' -- 'log' and 'alert' are not valid enforcement actions for emergency policies.
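
These constraints can be sketched as a validation step. The function name is hypothetical, and two details are assumptions of this example: the 72-hour limit is measured from the time of creation, and an `expires_at` in the past is rejected.

```python
from datetime import datetime, timedelta, timezone

MAX_LIFETIME = timedelta(hours=72)
ALLOWED_ACTIONS = {"block", "gate"}


def validate_emergency_policy(policy, now):
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    if policy.get("enforcement_action") not in ALLOWED_ACTIONS:
        errors.append("enforcement_action must be 'block' or 'gate'")
    expires_at = policy.get("expires_at")
    if expires_at is None:
        errors.append("expires_at is mandatory")
    elif expires_at <= now:
        errors.append("expires_at must be in the future")  # assumption
    elif expires_at - now > MAX_LIFETIME:
        errors.append("expires_at must be within 72 hours")
    return errors
```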

Compliance Capabilities

Standard	River Agents Capability
SOC 2 Type II	Write-once `audit_logs` table with no application-layer mutation or delete; RBAC with role-to-permission mapping; JWT authentication on all API calls; TLS 1.2+ in transit; PostgreSQL tablespace encryption at rest; immutable `agent_versions` history
GDPR	Column-level data minimization via `schema_filter` in `agent_version_data_sources` (restricts visible columns per agent); `audit_logs` retention configurable per org; workspace region enforcement for data residency; right-to-erasure supported via `audit_logs.actor_user_id` set to NULL on request with `event_payload` hash replacing identifying content
HIPAA	PHI access controls via `data.classification = 'phi'` policy conditions; Business Associate Agreement support via org-level compliance flag; `audit_logs` retention configurable to 6-year minimum; workspace-level data residency for US-only storage; all PHI data access logged with exact query and accessor identity
PCI DSS	Cardholder data masking -- `execute_query` tool output is screened for PANs and masked before injection into LLM context; explicit access logging for all queries against `data.classification = 'pci'` sources; workspace scoping prevents cross-environment tool calls; whitelist-only tool configuration
SOX	Change control via immutable `agent_versions` -- every configuration change creates a new version record with `created_by` and timestamp; approval gates for financial write actions (`update_ledger_status`, `issue_refund`); complete governance decision log; version rollback requires a new deploy with no in-place version mutation
ISO 27001	Information security policy binding via `governance_policies`; five-layer defense-in-depth architecture; access management via RBAC with five workspace roles; incident response integration via PagerDuty alerts on `security.*` audit events; prompt injection detection (Layer 4) as a compensating control for AI-specific threat vectors
