Core Services Specification

River Agents are implemented across 9 backend services. Services 1, 2, 3, 5, 7, 8, and 9 run as internal modules within Backend :8005. Service 4 (Reasoning) is a separate stateless process at river-agent :8007. Service 6 (Tool Invocation) describes the tool dispatch path that routes through TLO Gateway :8001. This document specifies the API contracts, internal components, owned tables, and inter-service dependencies for each.

Service Dependency Map

[Figure: Service Dependency Map]

Ownership boundary: Backend :8005 owns the river_agents PostgreSQL schema and is the only process that writes to it. river-agent :8007 is stateless -- it receives all necessary state in the request body and has no direct database connection.


Service 1: Agent Management

Owns all persistent agent state. Every other service reads agent configuration from records this service writes.

Owns: river_agents.agents, river_agents.agent_versions, river_agents.agent_version_tools, river_agents.agent_version_data_sources, river_agents.agent_templates

Internal Components

| Component | Responsibility |
| --- | --- |
| AgentService | CRUD business logic for the agents table. Enforces org/workspace scoping on all reads and writes. |
| LifecycleStateMachine | The single code path for all agents.status writes. Validates pre-conditions before any transition. No other component may update agents.status directly. |
| AgentValidator | Runs the validation job triggered by POST /api/v1/agents/{id}/validate. Tests data source connectivity, checks that all bound tools exist in the registry, resolves governance policy references, and validates trigger config syntax. |
| VersionManager | Creates an immutable agent_versions snapshot on every post-deployment configuration change. Updates agents.current_version_id on deploy and rollback. Enforces the partial unique index that allows only one active version per agent. |
| TemplateService | Manages agent_templates records and the conversion of a template into a pre-filled agent configuration for the creation wizard. |

API Endpoints

| Method | Endpoint | Permission | Description |
| --- | --- | --- | --- |
| GET | /api/v1/agents | agent:read | List agents in workspace with search, status filter, and pagination |
| POST | /api/v1/agents | agent:create | Create agent in draft status |
| GET | /api/v1/agents/{id} | agent:read | Full agent detail including current version, health status, and recent run summary |
| PUT | /api/v1/agents/{id} | agent:update | Update configuration -- creates a new version if agent is deployed or active |
| DELETE | /api/v1/agents/{id} | agent:delete | Soft delete -- transitions to archived |
| POST | /api/v1/agents/{id}/validate | agent:update | Queue async validation checks and return 202; the validation result includes field-level errors |
| POST | /api/v1/agents/{id}/deploy | agent:deploy | Deploy validated agent; registers triggers and transitions to active |
| POST | /api/v1/agents/{id}/pause | agent:deploy | Suspend all triggers; transition to paused |
| POST | /api/v1/agents/{id}/resume | agent:deploy | Re-activate triggers; transition back to active |
| POST | /api/v1/agents/{id}/archive | agent:delete | Retire agent; preserve for audit; block all future runs |
| GET | /api/v1/agents/{id}/versions | agent:read | List all version records for an agent |
| POST | /api/v1/agents/{id}/versions/{vid}/rollback | agent:deploy | Update current_version_id to a previous version; emits audit event |
| GET | /api/v1/agent-templates | agent:read | List available agent templates |
| GET | /api/v1/agent-templates/{id} | agent:read | Full template detail including configuration defaults |
| POST | /api/v1/agents/from-template/{template_id} | agent:create | Create pre-filled agent from template |
| POST | /api/v1/agents/generate | agent:create | AI-powered agent generation from natural language via river-agent :8007 |

Implementation Notes

Validation is asynchronous. The POST /validate endpoint queues a background job and returns a 202. The frontend polls the agent detail endpoint until validation_status is passed or failed. Validation can take up to 30 seconds for agents with multiple data sources.

PUT on a deployed agent does not modify the active version. It creates a new version with status = 'draft'. The agent continues running on the existing active version until the operator explicitly deploys the new one. This prevents configuration changes from affecting in-flight runs.

Rollback does not create a new version. It updates current_version_id and emits an agent_version_rolled_back audit event. The version history is preserved intact.
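As an illustration of the single-code-path rule for agents.status, here is a minimal sketch of a transition guard. The transition table is inferred from the endpoint descriptions above; the exact source states per action, the "pending" initial validation value, and all class and function names are assumptions rather than the actual implementation.

```python
from dataclasses import dataclass

# Transition table inferred from the endpoint descriptions; the exact set of
# source states allowed for each action is an assumption.
ALLOWED = {
    "deploy": ({"draft"}, "active"),
    "pause": ({"active"}, "paused"),
    "resume": ({"paused"}, "active"),
    "archive": ({"draft", "active", "paused"}, "archived"),
}


class TransitionError(Exception):
    """Raised when a status change violates a pre-condition."""


@dataclass
class Agent:
    id: str
    status: str = "draft"
    validation_status: str = "pending"  # hypothetical initial value


def transition(agent: Agent, action: str) -> Agent:
    """Apply one lifecycle action; the only place agent.status is written."""
    if action not in ALLOWED:
        raise TransitionError(f"unknown action: {action}")
    sources, target = ALLOWED[action]
    if agent.status not in sources:
        raise TransitionError(f"cannot {action} from {agent.status}")
    # Assumed pre-condition: only validated agents may be deployed.
    if action == "deploy" and agent.validation_status != "passed":
        raise TransitionError("agent has not passed validation")
    agent.status = target
    return agent
```

Keeping the table and the pre-condition checks in one function makes it straightforward to audit which transitions exist.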


Service 2: Trigger Ingestion

Normalizes all trigger types into a uniform ExecutionRequest and routes to the Agent Execution Runner. This service has no knowledge of what happens during a run -- its only job is to ingest, validate, and enqueue.

Owns: river_agents.agent_triggers, river_agents.trigger_rejection_log

Internal Components

| Component | Handles | Ingestion Mechanism |
| --- | --- | --- |
| ManualTriggerHandler | manual | HTTP POST from frontend or external API call |
| ScheduledTriggerHandler | scheduled | Internal scheduler polls agent_triggers for due expressions every 10 seconds; handles timezone normalization |
| EventTriggerHandler | event | Kafka consumer group river-agents-events; evaluates event_type and optional payload_conditions filter per agent |
| APITriggerHandler | api | Authenticated endpoint; validates API key, enforces per-agent rate limits, optionally validates payload against trigger_config.payload_schema |
| ThresholdTriggerHandler | threshold | Subscribes to metric stream from Service 8; evaluates threshold_expression; enforces cooldown_seconds to prevent re-fire storms |
| WorkflowTriggerHandler | workflow | Listens for Temporal workflow signals matching trigger_config.workflow_id and signal_name |
| TriggerDispatcher | All | Normalizes all handler outputs into ExecutionRequest; validates agent is active; enforces concurrency policy; enqueues to Execution Queue |

API Endpoints

| Method | Endpoint | Permission | Description |
| --- | --- | --- | --- |
| POST | /api/v1/agents/{id}/run | agent:execute | Manual trigger -- immediately enqueues execution |
| POST | /api/v1/agents/{id}/triggers | agent:update | Register a new trigger configuration for an agent |
| GET | /api/v1/agents/{id}/triggers | agent:read | List all trigger configurations for an agent |
| PUT | /api/v1/agents/{id}/triggers/{tid} | agent:update | Update trigger configuration |
| DELETE | /api/v1/agents/{id}/triggers/{tid} | agent:update | Remove trigger |
| POST | /api/v1/agent-webhooks/{agent_id} | API key auth | Inbound webhook endpoint for event-based triggers |
| POST | /api/v1/agent-api/{agent_id}/execute | API key auth | Authenticated execution endpoint for external system triggers |

ExecutionRequest Schema

Every trigger normalizes to this structure before the Dispatcher enqueues it.

| Field | Type | Description |
| --- | --- | --- |
| request_id | UUID | Idempotency key; duplicate request_id values are rejected by the Runner |
| agent_id | UUID | Target agent |
| agent_version_id | integer | Pinned at ingestion time from agents.current_version_id; immutable for the run |
| trigger_type | enum | manual, scheduled, event, api, threshold, workflow |
| trigger_source | string | Audit label (e.g., "cron:daily-8am", "webhook:zendesk-ticket") |
| trigger_payload | JSONB | Contextual data from the trigger source; passed to river-agent as part of the context bundle |
| requested_by | UUID | User ID for manual/API triggers; service identifier for automated triggers |
| requested_at | timestamptz | Ingestion timestamp; used for SLA and latency tracking |
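The schema above can be sketched as an immutable Python type. This is only an illustrative mapping of the documented fields onto standard-library types; the defaults for request_id and requested_at and the choice of a frozen dataclass are assumptions, and the real service may use a different model class.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any


class TriggerType(str, Enum):
    MANUAL = "manual"
    SCHEDULED = "scheduled"
    EVENT = "event"
    API = "api"
    THRESHOLD = "threshold"
    WORKFLOW = "workflow"


@dataclass(frozen=True)  # frozen: the request is not mutated after ingestion
class ExecutionRequest:
    agent_id: uuid.UUID
    agent_version_id: int          # pinned from agents.current_version_id
    trigger_type: TriggerType
    trigger_source: str            # audit label, e.g. "cron:daily-8am"
    requested_by: uuid.UUID
    trigger_payload: dict[str, Any] = field(default_factory=dict)
    # Assumed defaults: generate the idempotency key and timestamp at ingestion.
    request_id: uuid.UUID = field(default_factory=uuid.uuid4)
    requested_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```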

Implementation Notes

Concurrency is enforced at dispatch time. If agent_versions.runtime_config.allow_concurrent_runs is false and a run is already in running status, the TriggerDispatcher applies the on_concurrent_trigger policy: queue holds the request until the current run finishes, drop discards it with a rejection log entry, and replace cancels the running execution and starts fresh.
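The three on_concurrent_trigger policies can be sketched as a small decision function. The in-memory AgentRunState below is a hypothetical stand-in for the real queue and execution records, and the returned decision labels are illustrative only.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentRunState:
    """In-memory stand-in for one agent's run bookkeeping (hypothetical)."""
    allow_concurrent_runs: bool
    on_concurrent_trigger: str  # "queue", "drop", or "replace"
    running: list = field(default_factory=list)
    held: deque = field(default_factory=deque)
    rejection_log: list = field(default_factory=list)
    cancelled: list = field(default_factory=list)


def dispatch(state: AgentRunState, request_id: str) -> str:
    """Return the decision taken for an incoming request."""
    if state.allow_concurrent_runs or not state.running:
        state.running.append(request_id)
        return "started"
    policy = state.on_concurrent_trigger
    if policy == "queue":
        # Held until the current run finishes.
        state.held.append(request_id)
        return "queued"
    if policy == "drop":
        # Discarded, with a rejection log entry.
        state.rejection_log.append(request_id)
        return "dropped"
    if policy == "replace":
        # Cancel whatever is running and start the new request.
        state.cancelled.extend(state.running)
        state.running = [request_id]
        return "replaced"
    raise ValueError(f"unknown policy: {policy}")
```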

Event triggers use payload condition matching. The EventTriggerHandler evaluates trigger_config.payload_conditions as a JSON path expression against the incoming webhook payload. Agents with overlapping event type subscriptions are each evaluated independently.


Service 3: Agent Execution Runner

The central coordinator for every agent run. Bridges the normalized trigger from Service 2 to the reasoning engine in Service 4, manages approval gate hibernation, enforces execution limits, and finalizes the run record.

Owns: river_agents.agent_executions, river_agents.agent_memory

Internal Components

| Component | Responsibility |
| --- | --- |
| AgentRunWorkflow | The root Temporal workflow. Owns the full execution lifecycle from running to a terminal state (completed, failed, cancelled, budget_exhausted). All execution state lives in this workflow and the linked DB records. |
| ContextBuilder | Assembles the AgentContext bundle before the first reasoning call and after each approval gate resume. Loads agent version config, long_term_context, trigger payload, data source schema metadata, and the active tool registry. |
| StateSerializer | On gate: writes agent_executions.serialized_state (JSONB containing conversation history, pending tool call, observation buffer, memory snapshot) and transitions execution to paused. On resume: deserializes and reconstructs AgentContext from the stored snapshot. |
| RetryManager | Applies Temporal retry policy per activity: 3 retries with exponential backoff for transient failures. Marks tool calls as permanently failed after retry exhaustion. |
| TimeoutManager | Enforces per-tool timeout (default 30 seconds) and per-run turn limit (default 15 turns). A run that exhausts turns without reaching finalization terminates with budget_exhausted status. |

API Endpoints

| Method | Endpoint | Permission | Description |
| --- | --- | --- | --- |
| GET | /api/v1/agent-runs | agent:read | List runs with filters: agent, status, trigger type, date range, pagination |
| GET | /api/v1/agent-runs/{id} | agent:read | Full run detail: status, timing, token usage, final output |
| POST | /api/v1/agent-runs/{id}/stop | agent:execute | Emergency cancel a running execution; sends cancellation signal to Temporal workflow |
| POST | /api/v1/agent-runs/{id}/retry | agent:execute | Re-enqueue a failed run using the same trigger payload and version |
| GET | /api/v1/agent-runs/{id}/logs | agent:read | Paginated turn-level log entries for a run |
| GET | /api/v1/agent-runs/{id}/stream | agent:read | WebSocket upgrade -- real-time run progress events |

Implementation Notes

The Temporal workflow ID is derived from execution_id. The pattern is agent-run-{execution_id}. This means the workflow can be looked up by execution ID directly from Temporal's API without a separate index table. It also means the execution record must be created before the Temporal workflow is started -- the agent_executions.id is the source of truth.

Approval signals go directly to Temporal, not through the REST layer. When an operator calls PATCH /api/v1/approvals/{id}, Service 7 updates the approval_requests record and sends a signal via the Temporal client SDK: temporal.signal_workflow(workflow_id="agent-run-{execution_id}", signal="approval_resolved", payload={...}). The REST endpoint does not block waiting for the workflow to resume.

Memory write-back is non-blocking. After finalization, the ContextBuilder writes the memory_updates delta from the reasoning result back to river_agents.agent_memory. This write happens asynchronously after the run record is marked completed -- a write failure does not change the run's terminal status.


Service 4: Reasoning Service

The LLM reasoning engine. This service is implemented as a separate FastAPI process at river-agent :8007. It is entirely stateless -- all state arrives in the request body and all outputs are returned in the response. It is called once per reasoning turn by Service 3.

Owns: No database tables. Stateless.

Internal Components

| Component | Responsibility |
| --- | --- |
| AgentLoop | Entry point. Receives the AgentContext bundle, runs one Reason -> Act -> Observe iteration, and returns a structured ReasoningResult containing the next tool call or a finalization signal. |
| SystemPromptBuilder | Dynamically composes the system prompt per turn by assembling sections from the AgentContext. The prompt is rebuilt on every turn to reflect the current observation state. |
| RiverCore | Multi-provider LLM router. Classifies the turn complexity, selects the appropriate model tier and provider, executes the inference call, and handles failover to the next provider in the chain on error. |
| ToolCallParser | Parses the raw LLM output into a structured ToolCall object (tool name + validated arguments). Rejects malformed tool selections before returning to Service 3. |

System Prompt Composition

The SystemPromptBuilder assembles the system prompt from these sections in order:

| Section | Content | Source |
| --- | --- | --- |
| Role Definition | "You are {agent.name}, a {agent.business_function} agent..." | agent_versions.name, business_function |
| Goal and Instructions | The agent's natural language instruction set | agent_versions.instruction_set |
| Available Tools | JSON schema definitions for all tools in the active tool registry | Tool registry filtered by agent_versions.selected_tools |
| Data Context | Schema metadata for all connected data sources | Assembled by ContextBuilder from Service 5 |
| Governance Constraints | Action level and applicable policy constraints in natural language | agent_versions.action_level, resolved policies |
| Long-Term Memory | Structured summary of learnings from past runs | agent_memory.context_snapshot |
| Trigger Context | The trigger type and payload for this run | ExecutionRequest.trigger_payload |
| Conversation History | All prior turns in this execution (reasoning + observations) | Accumulated in AgentContext across turns |
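A minimal sketch of ordered prompt assembly, assuming the AgentContext is available as a plain dictionary. The dictionary keys, the "## " section headings, and the shortened role sentence are all illustrative assumptions; only the section names and their order come from the table above.

```python
import json

# Section order taken from the composition table.
SECTION_ORDER = [
    "Role Definition",
    "Goal and Instructions",
    "Available Tools",
    "Data Context",
    "Governance Constraints",
    "Long-Term Memory",
    "Trigger Context",
    "Conversation History",
]


def build_system_prompt(ctx: dict) -> str:
    """Render every section, in table order, from a context dictionary."""
    bodies = {
        # Shortened stand-in for the real role template.
        "Role Definition": (
            f"You are {ctx['name']}, a {ctx['business_function']} agent."
        ),
        "Goal and Instructions": ctx["instruction_set"],
        "Available Tools": json.dumps(ctx["tool_schemas"], indent=2),
        "Data Context": json.dumps(ctx["schema_metadata"], indent=2),
        "Governance Constraints": ctx["governance_text"],
        "Long-Term Memory": ctx["memory_summary"],
        "Trigger Context": json.dumps(
            {"type": ctx["trigger_type"], "payload": ctx["trigger_payload"]}
        ),
        "Conversation History": "\n".join(ctx["history"]) or "(no prior turns)",
    }
    return "\n\n".join(f"## {name}\n{bodies[name]}" for name in SECTION_ORDER)
```

Because the function is pure, calling it again with an updated history is all that is needed to rebuild the prompt on each turn.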

Implementation Notes

Service 3 calls Service 4 once per turn, not once per run. The request body contains the full AgentContext at the current turn's state. This is intentional -- it keeps river-agent stateless and allows Service 3 (via Temporal) to be the durable state holder. The tradeoff is larger request payloads on longer runs.

river-agent does not enforce its action level. The action level reaches it only as natural-language guidance in the Governance Constraints section of the system prompt, and the tool registry it receives in the context bundle is already filtered by Service 3 based on action level. If a tool requires approval, that check happens in Service 7 after river-agent returns its tool selection. From river-agent's perspective, it selects from the tools it was given -- governance enforcement is downstream.

Finalization is signaled by a special tool call. When river-agent determines the goal is reached, it returns a ToolCall with tool_name = "finalize" and a structured final_output in the arguments. Service 3 detects this and begins the run finalization sequence without invoking Service 6.
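The per-turn call pattern and the finalize signal can be sketched as a simple loop. The reason and dispatch_tool callables stand in for the HTTP call to river-agent and the Service 6 dispatch path; their signatures, the dictionary shapes, and the returned summary are assumptions for illustration. Governance checks and approval gates are omitted.

```python
from typing import Callable

MAX_TURNS = 15  # default per-run turn limit


def run_agent(
    reason: Callable[[dict], dict],
    dispatch_tool: Callable[[dict], dict],
    context: dict,
    max_turns: int = MAX_TURNS,
) -> dict:
    """Call the reasoner once per turn until it finalizes or turns run out."""
    history = list(context.get("history", []))
    for turn in range(1, max_turns + 1):
        # The full context, including all prior turns, is sent on every call,
        # so the reasoner itself holds no state between turns.
        tool_call = reason({**context, "history": history})
        if tool_call["tool_name"] == "finalize":
            # Finalization bypasses tool dispatch entirely.
            return {
                "status": "completed",
                "turns": turn,
                "final_output": tool_call["arguments"]["final_output"],
            }
        observation = dispatch_tool(tool_call)
        history.append({"tool_call": tool_call, "observation": observation})
    return {"status": "budget_exhausted", "turns": max_turns, "final_output": None}
```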


Service 5: Data Access and Schema

Provides the reasoning engine with schema-aware, ACL-governed access to all connected data sources. All reads pass through TLO Gateway -- this service never connects directly to external data sources.

Owns: No dedicated tables. Reads from platform.data_sources (cross-schema). Schema metadata is cached in Redis with per-source TTL.

Internal Components

| Component | Responsibility |
| --- | --- |
| SchemaDiscoveryService | Retrieves and caches table, column, and relationship metadata for connected data sources. Cache TTL is configurable per data source (default: 3600 seconds). Force-refresh is triggered by POST /data-sources/{id}/discover-schema. |
| QueryExecutionEngine | Translates a validated query spec into a Data Orchestration Service request. Handles result pagination, type normalization, and error classification (user error vs. connectivity error vs. timeout). |
| SemanticCatalog | Vector search over data source metadata in Qdrant. Used by the reasoning engine to resolve table/column references by semantic meaning when exact names are unknown. |
| DataConnectorProxy | Routes all data source interactions through TLO Gateway to the Data Orchestration Service. Injects the X-Agent-ID and X-Execution-ID headers for audit trail correlation in the downstream service. |

API Endpoints

| Method | Endpoint | Permission | Description |
| --- | --- | --- | --- |
| GET | /api/v1/data-sources/{id}/schema | data_source:view | Retrieve cached schema metadata for a connected data source |
| POST | /api/v1/query/execute | data_source:query | Execute a query against a connected data source via Data Orchestration |
| GET | /api/v1/catalog/search | data_source:view | Semantic search over data source metadata using Qdrant vectors |
| POST | /api/v1/data-sources/{id}/test | data_source:view | Test connectivity to a data source |

Supported Data Source Types

| Category | Sources |
| --- | --- |
| SQL Databases | PostgreSQL, MySQL, SQL Server, Snowflake, BigQuery, Redshift |
| NoSQL Databases | MongoDB, DynamoDB, Elasticsearch |
| SaaS APIs | Salesforce, HubSpot, Zendesk, Stripe, Shopify |
| File Storage | CSV and Excel files in MinIO, Google Sheets |
| Custom APIs | Any REST API with a registered OpenAPI specification |

Implementation Notes

Schema metadata is cached, not live. The reasoning engine sees a schema snapshot, not the live database state. Stale schema causes query generation errors at execution time. The SchemaDiscoveryService detects schema-related query errors and triggers a background refresh for the affected data source.
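The caching behavior can be sketched with an in-memory dictionary standing in for Redis. The SchemaCache class, its method names, and the injectable clock are assumptions for illustration; the refresh here is synchronous, whereas the service described above refreshes in the background.

```python
import time
from typing import Callable

DEFAULT_TTL_SECONDS = 3600  # default cache TTL per data source


class SchemaCache:
    """TTL cache for schema metadata; a dict stands in for Redis here."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch      # hypothetical loader for live schema metadata
        self._clock = clock
        self._entries: dict[str, tuple[float, dict]] = {}

    def get(self, source_id: str, ttl: float = DEFAULT_TTL_SECONDS) -> dict:
        """Return the cached snapshot, refetching only after expiry."""
        entry = self._entries.get(source_id)
        if entry is not None and self._clock() < entry[0]:
            return entry[1]
        return self.refresh(source_id, ttl)

    def refresh(self, source_id: str, ttl: float = DEFAULT_TTL_SECONDS) -> dict:
        """Force a reload, e.g. after a schema-related query error."""
        schema = self._fetch(source_id)
        self._entries[source_id] = (self._clock() + ttl, schema)
        return schema
```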

ACL is checked per call at TLO Gateway. A data_source:view permission is required for schema discovery; data_source:query is required for query execution. These checks happen at TLO on every call. A data source permission revoked mid-run takes effect on the next tool invocation within that run.


Service 6: Tool and Workflow Invocation

Describes the path a tool call takes from the reasoning result to execution on a target service. This is not a standalone process -- it is the pattern implemented by Service 3 when dispatching an approved tool call through TLO Gateway.

Owns: No tables. Operates as a dispatch path within Service 3.

Tool Execution Steps

| Step | Action | Owner |
| --- | --- | --- |
| 1 | river-agent selects tool and returns structured ToolCall | Service 4 |
| 2 | Service 3 validates arguments against tool's Pydantic input schema | Service 3 / Tool Registry |
| 3 | Service 3 calls Service 7 for action level and policy check | Service 7 |
| 4 | If gated: approval flow; if blocked: turn error; if allowed: proceed | Service 7 |
| 5 | Service 3 sends tool dispatch request to TLO Gateway with governance token | Service 3 -> TLO |
| 6 | TLO validates JWT and per-tool ACL permission | TLO Gateway |
| 7 | TLO routes to target service (Backend, Data Orchestration, external API) | TLO Gateway |
| 8 | Target service executes and returns result | Target service |
| 9 | Service 3 receives result; passes to ResultValidator | Service 3 |
| 10 | Validated result is formatted as an observation and injected into the next AgentContext | Service 3 |

Tool Registry

All tools available to River Agents are registered in the Tool Registry. Engineers adding new tools must define:

  1. A unique tool_name identifier
  2. A Pydantic input schema (used for argument validation at step 2 and for prompt injection at system prompt build time)
  3. A Pydantic output schema (used for result validation at step 9)
  4. A write_classified boolean (determines whether the tool triggers action level checks in Service 7)
  5. A required TLO ACL permission string (e.g., "agent:execute", "data_source:query")
  6. A target service route (Backend, Data Orchestration, or external service URL)

Custom tools registered via OpenAPI spec upload are validated against this schema at registration time. A spec that cannot be mapped to a valid tool definition is rejected.
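The six required fields can be sketched as a registry entry with a basic argument check. To keep the example self-contained, a plain name-to-type mapping is swapped in for the Pydantic schemas the registry actually uses; the class names, attribute names, and error strings are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDefinition:
    tool_name: str
    input_schema: dict    # field name -> Python type; stand-in for Pydantic
    output_schema: dict   # same stand-in, used for result validation
    write_classified: bool
    acl_permission: str   # e.g. "data_source:query"
    target_route: str


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, ToolDefinition] = {}

    def register(self, tool: ToolDefinition) -> None:
        """Add a tool; tool_name must be unique."""
        if tool.tool_name in self._tools:
            raise ValueError(f"duplicate tool_name: {tool.tool_name}")
        self._tools[tool.tool_name] = tool

    def validate_arguments(self, tool_name: str, arguments: dict) -> list[str]:
        """Return a list of problems; an empty list means the call is valid."""
        schema = self._tools[tool_name].input_schema
        errors = []
        for name, expected in schema.items():
            if name not in arguments:
                errors.append(f"missing: {name}")
            elif not isinstance(arguments[name], expected):
                errors.append(f"wrong type: {name}")
        errors.extend(f"unexpected: {n}" for n in arguments if n not in schema)
        return errors
```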

Workflow Invocation

For operations that span multiple steps or services, Service 3 can invoke a Temporal sub-workflow rather than a single tool call. The WorkflowInvoker starts a child workflow using the workflow_id specified in agent_versions.selected_workflows and passes the reasoning engine's arguments as the workflow input. The execution observes the workflow execution ID and optionally awaits the result if the parent workflow is configured to wait.


Service 7: Governance and Approval

The enforcement boundary between what an agent reasons it should do and what it is permitted to do. Every write-capable tool call passes through this service. No tool dispatch occurs without this service's sign-off.

Owns: river_agents.approval_requests, river_agents.governance_policies, river_agents.agent_policy_bindings

Internal Components

| Component | Responsibility |
| --- | --- |
| ActionLevelChecker | Evaluates a proposed tool call against the agent's action_level. Returns execute, block, stage_only, or gate as the enforcement outcome. |
| PolicyEngine | Evaluates the current execution context (agent identity, workspace, tool name, tool arguments, data classification) against all bound governance policies. Returns allow, block, gate, or alert per matching policy. |
| ApprovalGateService | Creates approval_requests records, assigns approvers from approval_rules, and sends the approval signal to the Temporal workflow on resolution. |
| ApprovalNotifier | Dispatches approval request notifications via Novu to configured channels: Slack, email, in-app, or PagerDuty. Handles escalation if no response within approval_rules.timeout_hours. |

API Endpoints

| Method | Endpoint | Permission | Description |
| --- | --- | --- | --- |
| GET | /api/v1/approvals | agent:approve | List approval requests with status, agent, and date filters |
| GET | /api/v1/approvals/{id} | agent:approve | Full approval detail: proposed action, reasoning context, risk assessment |
| PATCH | /api/v1/approvals/{id} | agent:approve | Resolve: approve, reject, or edit_and_approve with modified arguments |
| GET | /api/v1/approvals/pending | agent:approve | Count of pending approvals (used for badge display) |
| GET | /api/v1/agents/{id}/policies | agent:read | List governance policies bound to an agent |
| POST | /api/v1/agents/{id}/policies | agent:update | Bind a governance policy to an agent |
| DELETE | /api/v1/agents/{id}/policies/{pid} | agent:update | Remove a policy binding |

Action Level Enforcement Matrix

| Agent action_level | Tool write_classified | Outcome |
| --- | --- | --- |
| read_only | false (read tool) | Execute |
| read_only | true (write tool) | Block -- write operations not permitted at this level |
| recommend | false or true | Stage as proposal -- no execution; returned to user as recommendation |
| act_with_approval | false (read tool) | Execute |
| act_with_approval | true -- tool in approval_rules | Gate -- create ApprovalRequest, pause execution, notify approver |
| act_with_approval | true -- tool not in approval_rules | Execute |
| automated | false or true | Execute -- all configured tools run without gates |
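The matrix above can be expressed as a small pure function. This is a sketch only: the function name, the parameter names, and the lowercase outcome strings are illustrative, and policy evaluation is out of scope here.

```python
def enforce(
    action_level: str,
    write_classified: bool,
    tool_in_approval_rules: bool = False,
) -> str:
    """Map one row of the enforcement matrix to an outcome string."""
    if action_level == "read_only":
        # Write tools are not permitted at this level.
        return "block" if write_classified else "execute"
    if action_level == "recommend":
        # Nothing executes; the call is returned as a proposal.
        return "stage_only"
    if action_level == "act_with_approval":
        if write_classified and tool_in_approval_rules:
            return "gate"
        return "execute"
    if action_level == "automated":
        return "execute"
    raise ValueError(f"unknown action_level: {action_level}")
```

For example, enforce("act_with_approval", True, tool_in_approval_rules=True) yields "gate", while the same call with the flag left at False yields "execute".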

Approval Request Lifecycle

[Figure: Approval Request State Machine]

Implementation Notes

Policy evaluation happens after action level check. A tool call blocked at the action level never reaches the PolicyEngine. The policy engine evaluates only calls that pass the action level check. This ordering means action level is always the outer constraint.

edit_and_approve substitutes arguments entirely. When an approver uses edit-and-approve, the original ToolCall arguments are discarded and the approver's revised arguments are used. The approval_requests record stores both the original and modified arguments for audit. Before dispatch, Service 3 re-validates the modified arguments against the tool's Pydantic input schema, the same check as step 2 of the Service 6 dispatch path.

Every governance decision -- including execute outcomes -- emits an audit event. Service 7 calls Service 9 on every enforcement outcome, not only on blocks and gates. This ensures the audit trail is complete for compliance purposes.


Service 8: Monitoring and Telemetry

Collects and aggregates runtime telemetry across all agent executions. Powers the system-wide monitoring dashboard, per-agent metrics, real-time execution views, and the alert engine.

Owns: river_agents.agent_metrics_hourly, river_agents.agent_alerts

Internal Components

| Component | Responsibility |
| --- | --- |
| TelemetryEmitter | Writes structured turn-level events to the WebSocket channel during active runs. Called by Service 3 at each turn boundary, tool dispatch, and approval gate event. |
| MetricAggregator | Aggregates execution outcomes into rolling windows (1h, 24h, 7d, 30d) and writes to agent_metrics_hourly. Runs as a background job after each run completion. |
| HealthEvaluator | Evaluates agent_metrics_hourly on a 5-minute cycle to derive agents.health_status (healthy, degraded, critical, unknown). Degraded threshold: success rate below 80% over 24 hours. Critical threshold: below 50%. |
| AlertEngine | Monitors per-agent and system-wide metrics against configured alert rules. Dispatches to Novu on threshold breach. Implements a cooldown period to prevent alert floods for sustained degradation. |
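The HealthEvaluator thresholds can be sketched as a small function over 24-hour run counts. Treating an agent with no runs in the window as unknown is an assumption, as are the function and parameter names.

```python
def health_status(completed: int, total: int) -> str:
    """Derive a health label from 24-hour run counts."""
    if total == 0:
        # Assumption: no runs in the window means there is nothing to judge.
        return "unknown"
    success_rate = completed / total * 100
    if success_rate < 50:
        return "critical"
    if success_rate < 80:
        return "degraded"
    return "healthy"
```

Under this reading, a success rate of exactly 80% is healthy and exactly 50% is degraded, since both thresholds are stated as "below".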

API Endpoints

| Method | Endpoint | Permission | Description |
| --- | --- | --- | --- |
| GET | /api/v1/agents/{id}/metrics | agent:read | Per-agent metrics for a configurable time window |
| GET | /api/v1/monitoring/overview | agent:monitor | System-wide: total agents, active runs, throughput, health distribution |
| GET | /api/v1/monitoring/throughput | agent:monitor | Time-series throughput data for the system chart |
| GET | /api/v1/monitoring/alerts | agent:monitor | Recent alert stream with severity and agent reference |
| GET | /api/v1/monitoring/cluster | agent:monitor | Runtime instance health table |
| WS | /ws/agent-runs/{id} | agent:read | Real-time run progress events for a specific execution |
| WS | /ws/monitoring | agent:monitor | System-wide real-time monitoring event stream |

Metric Definitions

| Metric | Aggregation | Scope | Storage |
| --- | --- | --- | --- |
| Total Runs | Count | Per-agent, System-wide | agent_metrics_hourly.run_count |
| Success Rate | completed / total as percentage | Per-agent | agent_metrics_hourly.success_rate |
| Average Latency | P50, P90, P99 in milliseconds | Per-agent, Per-tool | agent_metrics_hourly.latency_p50/p90/p99 |
| Failure Count | Count with categorized reasons | Per-agent, System-wide | agent_metrics_hourly.failure_count + failure_reasons JSONB |
| Throughput | Runs per hour | System-wide | Derived from agent_metrics_hourly at query time |
| Pending Approvals | Count of approval_requests where status = 'pending' | Per-agent, System-wide | Queried live from approval_requests |
| Token Cost | Total tokens multiplied by model unit pricing | Per-agent, Per-run | agent_executions.token_usage JSONB |
| Actions Taken | Count grouped by tool_name | Per-agent | Aggregated from agent_logs at query time |

WebSocket Event Types Emitted

| Event | Payload Fields | Emitted When |
| --- | --- | --- |
| run_started | execution_id, agent_id, started_at | Execution begins |
| turn_reasoning | turn, content, model_used, tokens | Reasoning turn complete |
| tool_called | turn, tool_name, inputs | Tool dispatch initiated |
| tool_result | turn, tool_name, output, duration_ms | Tool result received |
| approval_requested | approval_id, tool_name, pending_since | Approval gate triggered |
| approval_resolved | approval_id, resolution, resolved_by | Approval gate resolved |
| run_completed | execution_id, status, final_output, duration_ms | Execution finalized |
| run_failed | execution_id, error_code, error_message | Execution terminated with error |

Service 9: Execution Logging and Audit

The write-once source of truth for all historical agent behavior. Provides queryable, filterable, and exportable audit trails for debugging, compliance, and analysis.

Owns: river_agents.audit_logs, river_agents.agent_logs

Internal Components

| Component | Responsibility |
| --- | --- |
| AuditWriter | Single entry point for all writes to audit_logs. Enforces write-once semantics at the application layer -- no UPDATE or DELETE is permitted on this table through this component. |
| ExecutionLogWriter | Writes turn-level records to agent_logs during active runs. Called by Service 3 after every reasoning turn, tool call, and approval gate event. |
| AuditQueryService | Provides the read interface for both audit_logs and agent_logs. Handles filter compilation, pagination, and export formatting. |
| RetentionManager | Applies retention policies: archives or deletes audit records older than the configured retention period per organization. Retention periods are stored in iam.organizations.audit_retention_days. |

API Endpoints

| Method | Endpoint | Permission | Description |
| --- | --- | --- | --- |
| GET | /api/v1/audit/logs | agent:audit | Search audit logs with filters: agent, user, event type, date range, outcome, pagination |
| GET | /api/v1/audit/logs/{id} | agent:audit | Full audit entry with complete payload detail |
| GET | /api/v1/audit/stats | agent:audit | Aggregate counts: total events, by type, by agent, by user |
| GET | /api/v1/audit/export | agent:audit | Export filtered audit logs as CSV or JSON for compliance reporting |
| GET | /api/v1/agent-runs/{id}/trace | agent:read | Complete turn-by-turn execution trace for a single run |

Audit Log Entry Schema

| Field | Type | Description |
| --- | --- | --- |
| log_id | UUID | Immutable identifier; set at write time and never changed |
| timestamp | timestamptz | Event time; indexed for range queries |
| event_type | enum | agent.created, agent.deployed, agent.archived, run.started, run.completed, run.failed, tool.executed, tool.blocked, approval.requested, approval.resolved, policy.violated, permission.denied |
| agent_id | UUID | Related agent (nullable for non-agent events) |
| execution_id | UUID | Related run (null for non-execution events) |
| user_id | UUID | Actor who initiated the event; system service ID for automated events |
| action | string | Specific action description (e.g., "deploy_agent", "execute_tool:update_crm") |
| payload | JSONB | Full request context and response (scrubbed of secrets) |
| outcome | enum | success, failure, blocked, pending |
| ip_address | inet | Source IP from the originating request; populated from TLO Gateway propagation header |
| organization_id | UUID | Tenant isolation key; all queries must filter by this |

Implementation Notes

audit_logs has no UPDATE or DELETE paths in the application code. The write-once constraint is enforced by AuditWriter -- there is no update_audit_log or delete_audit_log method. At the database layer, this is reinforced by a row-level security policy that grants INSERT only (no UPDATE, no DELETE) to the Backend service's database role.

Turn-level logs in agent_logs are separate from audit logs. agent_logs stores the detailed reasoning trace (turn content, tool arguments, observations) and is queried for debugging and run detail views. audit_logs stores governance events, lifecycle transitions, and compliance-relevant actions. The two tables serve different consumers: agent_logs is for operators debugging a run; audit_logs is for compliance and security review.

Export uses streaming for large result sets. The GET /api/v1/audit/export endpoint streams the response rather than buffering the full result set in memory. Callers should handle chunked transfer encoding. Result sets over 100,000 rows are automatically split into multiple files in a ZIP archive.
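The streaming behavior can be sketched as a generator that yields CSV text in batches instead of building the whole export in memory. The function name and batch_size parameter are assumptions, and the ZIP splitting for result sets over 100,000 rows is not shown.

```python
import csv
import io
from typing import Iterable, Iterator


def stream_csv(
    rows: Iterable[dict],
    fieldnames: list[str],
    batch_size: int = 1000,
) -> Iterator[str]:
    """Yield CSV text in chunks of at most batch_size rows."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    pending = 0
    for row in rows:
        writer.writerow(row)
        pending += 1
        if pending >= batch_size:
            yield buffer.getvalue()
            # Reset the buffer so memory use stays bounded by one batch.
            buffer.seek(0)
            buffer.truncate(0)
            pending = 0
    if buffer.tell():
        # Flush the final partial batch (or the header alone if no rows).
        yield buffer.getvalue()
```

A web framework's streaming response type can consume a generator like this directly, which is what produces chunked transfer encoding on the wire.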