river-agent is the stateless FastAPI reasoning engine at :8007. It receives a fully assembled execution context, runs the multi-turn agentic loop (Reason, Act, Observe), and returns structured results to the calling Temporal workflow activity.
Service Overview
| Attribute | Value |
|---|---|
| Service name | river-agent |
| Port | 8007 |
| Framework | FastAPI (Python 3.12, async) |
| Container | docker/river-agent/Dockerfile |
| Statefulness | Stateless -- no database, no Redis, no local storage between invocations |
| Caller | Temporal workflow activity (via TLO Gateway :8001) |
| Dependencies | TLO Gateway :8001, LLM provider APIs (external) |
| Horizontal scaling | Fully supported; any replica handles any request |
| Health check | GET /health |
The river-agent service is the reasoning brain of every agent execution. It receives a fully assembled ExecutionRequest from the calling Temporal activity, runs the iterative Reason, Act, Observe loop internally, and returns a structured ExecutionResponse when the loop terminates. It is a pure function: given the same input context, it produces deterministic structured output through LLM reasoning and tool calls.
Service Boundaries
The river-agent service does NOT:
- Store any state -- no database connections, no Redis, no persistent storage of any kind
- Access the database directly -- all data retrieval flows through TLO Gateway tool calls
- Manage agent lifecycle -- agent CRUD, versioning, deployment, and state machine transitions are owned by the backend
- Process triggers -- trigger ingestion, scheduling, and dispatch are handled by the Trigger Ingestion Service in the backend
- Manage approvals -- approval gate creation, notification dispatch, and resolution handling belong to the Governance and Approval Service in the backend
- Maintain WebSocket connections -- real-time telemetry is emitted as structured events to the calling Temporal activity, which forwards them to TLO Gateway for WebSocket delivery
- Store execution logs -- log persistence is owned by the Execution Logging and Audit Service in the backend
- Own any credentials -- database connection strings, API keys, and user tokens are never exposed to the river-agent service; it receives only sanitized user context
Execution Model
river-agent runs the entire agentic loop for one execution in a single HTTP call. The caller -- a Temporal workflow activity in Backend :8005 -- sends a fully assembled ExecutionRequest and blocks until river-agent returns a complete ExecutionResponse. There are no callbacks, no streaming responses, and no mid-execution control flow between river-agent and its caller.
The context bundle passed in the request is the sole source of truth for the execution. river-agent reads no database, calls no external services directly, and holds no in-memory state between invocations. Every piece of information it needs -- the agent's instruction set, the active tool registry with Pydantic schemas, connected data source metadata, governance constraints, accumulated long-term memory, trigger payload, and conversation history -- is present in the ExecutionRequest.
Key behaviors per execution:
| Attribute | Value |
|---|---|
| Trigger | Single HTTP POST from Temporal workflow activity via TLO Gateway |
| Context source | ExecutionRequest.agent_config, user_context, data_source_metadata, tool_inventory, long_term_context, trigger_payload, conversation_history |
| System prompt | Assembled per-execution: agent instruction set, allowed tool schemas, data source metadata, governance constraints, accumulated memory |
| Tool set | 6 reasoning tools (local) + 10 execution tools (via TLO Gateway) + 1 interaction tool |
| Turn budget | Configurable per agent via max_turns (default: 15); enforced by river-agent loop guard |
| Governance enforcement | Governance Checker validates every tool call before dispatch; blocked calls are returned as observations, not silently dropped |
| Output destination | ExecutionResponse returned to Temporal activity, which persists results, emits WebSocket events, and triggers notifications |
| Cost tracking | Per-turn token usage accumulated in ExecutionResponse.ai_metadata; budget enforcement via budget_policy in request |
Internal Components
Agent Runtime Engine
The central orchestrator that manages the lifecycle of a single agent execution. It receives the ExecutionRequest from the API endpoint, delegates to the Context Builder to assemble the LLM prompt, runs the iterative Reason, Act, Observe loop, coordinates with the Governance Checker before each tool execution, enforces turn limits and token budgets, delegates to the Response Formatter to produce the final ExecutionResponse, and emits per-step telemetry events for the calling Temporal activity.
Context Builder
Transforms raw execution context into a structured LLM prompt. It receives agent_config, user_context, data_source_metadata, tool_inventory, governance_policies, long_term_context, trigger_context, and conversation_history. It builds the system prompt by assembling role definition, goal and instructions, available tool schemas in function-calling format, data source metadata, governance rules, accumulated memory, and trigger context. It optimizes the prompt for the selected LLM provider's context window and compresses conversation history when it exceeds token thresholds using deterministic summarization (no LLM call).
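The deterministic compression step can be illustrated with a minimal sketch. It assumes a rough estimate of four characters per token and a hypothetical `compress_history` helper; the real thresholds and summarization rules are not specified in this document.

```python
from typing import Dict, List


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token; a real tokenizer would differ.
    return max(1, len(text) // 4)


def compress_history(entries: List[Dict], max_tokens: int) -> List[Dict]:
    """Keep the newest entries in full and reduce older ones to their summary.

    Deterministic: the same input always yields the same output, with no LLM call.
    """
    kept: List[Dict] = []
    used = 0
    # Walk newest-first so recent executions keep full detail.
    for entry in sorted(entries, key=lambda e: e["completed_at"], reverse=True):
        summary_cost = estimate_tokens(entry["summary"])
        full_cost = summary_cost + sum(
            estimate_tokens(item) for item in entry.get("learnings", [])
        )
        if used + full_cost <= max_tokens:
            kept.append(entry)
            used += full_cost
        elif used + summary_cost <= max_tokens:
            # Over the threshold: drop learnings and flagged items, keep the summary.
            kept.append({
                "execution_id": entry["execution_id"],
                "completed_at": entry["completed_at"],
                "summary": entry["summary"],
            })
            used += summary_cost
    # Restore chronological order for the prompt.
    return sorted(kept, key=lambda e: e["completed_at"])
```

Because the function depends only on its arguments, calling it twice with the same history returns the same result, which is the property the Context Builder relies on.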
Multi-LLM Router (RiverCore)
Selects the optimal LLM model and provider for each reasoning turn. It classifies each turn's task complexity, selects the best available model from configured providers, manages provider health monitoring and auto-failover, tracks per-turn token usage, and supports dynamic provider configuration without code changes.
Tool Dispatch
Dispatches tool calls to the appropriate target. For reasoning tools, it executes locally within the service process. For execution tools, it constructs an HTTP request to TLO Gateway with proper headers (X-User-ID, X-Org-ID, X-Workspace-ID, X-Agent-ID, X-Internal-Call). For interaction tools, it returns an interaction payload for the Temporal workflow to forward via WebSocket. It validates tool arguments against Pydantic input schemas before dispatch and handles timeouts and retries for transient failures.
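A minimal sketch of the routing decision and header construction described above. The `route_tool_call` and `build_gateway_headers` names are hypothetical, the category map covers only tools named in this document, and the `X-Internal-Call` value is an assumption because its format is not specified.

```python
from typing import Dict

# Category membership taken from examples in this document; the full
# 17-tool registry is not reproduced here.
TOOL_CATEGORIES: Dict[str, str] = {
    "classify_intent": "reasoning",
    "check_governance": "reasoning",
    "generate_query": "reasoning",
    "search_catalog": "reasoning",
    "execute_query": "execution",
    "write_back": "execution",
    "create_data_source": "execution",
    "ask_user": "interaction",
}


def build_gateway_headers(user_context: Dict, agent_id: str) -> Dict[str, str]:
    return {
        "X-User-ID": str(user_context["user_id"]),
        "X-Org-ID": str(user_context["org_id"]),
        "X-Workspace-ID": str(user_context["workspace_id"]),
        "X-Agent-ID": agent_id,
        "X-Internal-Call": "true",  # Assumption: value format is not documented.
    }


def route_tool_call(tool_name: str, user_context: Dict, agent_id: str) -> Dict:
    category = TOOL_CATEGORIES.get(tool_name)
    if category is None:
        raise ValueError(f"unknown tool: {tool_name}")
    if category == "reasoning":
        return {"target": "local"}
    if category == "execution":
        return {
            "target": "tlo_gateway",
            "headers": build_gateway_headers(user_context, agent_id),
        }
    # Interaction tools hand a payload back to the Temporal workflow.
    return {"target": "temporal_workflow"}
```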
Response Formatter
Structures the final execution output into a well-defined Pydantic model. It assembles the ExecutionResponse from accumulated turn results, extracts query results, action results, visualization specs, and execution plans from tool outputs, builds the AI metadata block (total turns, models used, token counts, cost estimate), generates follow-up suggestions based on detected intent, and handles partial results when execution is interrupted.
Governance Checker
Evaluates governance policies inline during execution. Before each execution tool call, it checks whether the action is permitted, requires approval, or is blocked based on the agent's action_level, approval_rules, and bound governance policies. It returns structured governance decisions (PROCEED, APPROVAL_REQUIRED, SUGGEST_ONLY, BLOCKED) with reason.
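A minimal sketch of how the four decisions might be derived from `action_level` and `approval_rules`. The mapping from action level to decision and the `MUTATING_TOOLS` set are assumptions made for illustration; bound governance policies are omitted.

```python
from enum import Enum
from typing import List, Tuple


class Decision(str, Enum):
    PROCEED = "PROCEED"
    APPROVAL_REQUIRED = "APPROVAL_REQUIRED"
    SUGGEST_ONLY = "SUGGEST_ONLY"
    BLOCKED = "BLOCKED"


# Assumption: which tools count as mutating is illustrative only.
MUTATING_TOOLS = {"write_back", "create_data_source"}


def check_tool_call(
    tool_name: str, action_level: str, require_approval_for: List[str]
) -> Tuple[Decision, str]:
    """Return a (decision, reason) pair for one execution tool call."""
    mutating = tool_name in MUTATING_TOOLS
    if action_level == "read_only" and mutating:
        return Decision.BLOCKED, "read_only agents may not modify data"
    if action_level == "recommend" and mutating:
        return Decision.SUGGEST_ONLY, "recommend agents propose changes only"
    if action_level == "act_with_approval" and tool_name in require_approval_for:
        return Decision.APPROVAL_REQUIRED, f"{tool_name} requires human approval"
    return Decision.PROCEED, "no rule restricts this call"
```

Returning the reason alongside the decision lets a blocked call be fed back to the LLM as an observation rather than being dropped.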
Agentic Loop
Overview
The river-agent service implements the ReAct (Reason + Act) pattern. Unlike a plan-then-execute model, the agent reasons and acts iteratively -- each step's result informs the next step. The loop runs entirely inside the service process for a single execution, from initial context assembly to final structured output.
Step 1: Context Assembly
The Context Builder assembles the full LLM context from all inputs received in the ExecutionRequest:
| Input | Source | Purpose |
|---|---|---|
| agent_config | Backend (agent + version tables) | Goal, instructions, action level, model config, allowed tools, governance policies |
| user_context | TLO Gateway (extracted from JWT) | user_id, org_id, workspace_id, roles, permissions (sanitized -- no raw JWT or credentials) |
| input_prompt | User (for manual triggers) | Natural language instruction or question |
| trigger_context | Trigger Ingestion Service | Trigger type, source, payload (webhook data, metric values, cron context) |
| data_source_metadata | Backend + Schema Discovery | Connected data source IDs, names, types, table/column metadata |
| governance_policies | Backend (policies table) | Bound policy conditions and enforcement rules |
| conversation_history | Backend (execution logs from previous runs) | Long-term memory and accumulated learnings |
| tool_inventory | Agent version config | Filtered list of tools the agent is allowed to use |
System prompt sections built by the Context Builder:
| Section | Content |
|---|---|
| Role definition | "You are {agent.name}, a {agent.business_function} agent for {agent.domain}..." |
| Goal and instructions | The agent's instruction_set -- natural language behavioral rules and objectives |
| Available tools | JSON schema definitions of allowed tools in function-calling format |
| Data context | Schema metadata for connected data sources (table names, column types, relationships) |
| Governance rules | "Your action level is {level}. You must {constraints}. The following policies apply: {policies}" |
| Long-term memory | "From previous runs, you have learned: {accumulated_learnings}" |
| Trigger context | "This run was triggered by: {trigger_type} from {trigger_source}. Payload: {trigger_payload}" |
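The section templates above could be assembled roughly as follows. This is a sketch with a hypothetical `build_system_prompt` function; it omits the tool schema and data context sections for brevity, and the section ordering and separators are assumptions.

```python
from typing import Dict, List


def build_system_prompt(
    agent: Dict, policies: List[str], learnings: List[str], trigger: Dict
) -> str:
    policy_text = ", ".join(policies) if policies else "none"
    sections = [
        # Role definition
        f"You are {agent['name']}, a {agent['business_function']} agent "
        f"for {agent['domain']}.",
        # Goal and instructions
        agent["instructions"],
        # Governance rules
        f"Your action level is {agent['action_level']}. "
        f"The following policies apply: {policy_text}",
    ]
    if learnings:
        # Long-term memory is included only when there is something to report.
        sections.append(
            "From previous runs, you have learned: " + "; ".join(learnings)
        )
    # Trigger context
    sections.append(
        f"This run was triggered by: {trigger['trigger_type']} "
        f"from {trigger['trigger_source']}."
    )
    return "\n\n".join(sections)
```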
Step 2: Reasoning
Each reasoning turn sends the assembled system prompt, conversation history, and tool declarations to the LLM via RiverCore. All LLM responses are parsed as either tool calls (function-calling format) or text content. Tool call arguments are validated against Pydantic input schemas before execution. The LLM never produces free-form text that requires regex parsing -- all structured data flows through Pydantic models.
Step 3: Action Selection
After the LLM responds, the Agent Runtime Engine inspects the response:
| LLM Response Type | Action Taken |
|---|---|
| Tool call (reasoning) | Execute locally, feed result back as observation, continue loop |
| Tool call (execution) | Check governance, then route through TLO Gateway |
| Tool call (interaction) | Return interaction payload to Temporal for WebSocket delivery, pause loop |
| Text content (final answer) | Treat as final answer, proceed to finalization |
| Empty response (0 tokens) | Treat as error, retry with model escalation |
Step 4: Tool Execution
Tool call routing by category:
| Category | Destination | Examples |
|---|---|---|
| Reasoning | Local (within service process) | classify_intent, generate_query, search_catalog |
| Execution | TLO Gateway (ACL-validated HTTP) | execute_query, write_back, create_data_source |
| Interaction | Temporal workflow (WebSocket via TLO) | ask_user |
Per-tool timeout is 30 seconds (configurable). Transient failures are retried once with exponential backoff (100 ms, 200 ms). Permanent failures (403, 404, 422) are reported immediately as failed observations.
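A minimal sketch of the retry policy, assuming a hypothetical `ToolHTTPError` exception and `call_with_retry` wrapper. The delay schedule is a parameter; the default below retries once after 100 ms.

```python
import time
from typing import Callable, Dict, Sequence

PERMANENT_STATUS = {403, 404, 422}


class ToolHTTPError(Exception):
    """Hypothetical error carrying the HTTP status of a failed tool call."""

    def __init__(self, status: int):
        super().__init__(f"tool call failed with HTTP {status}")
        self.status = status


def call_with_retry(
    call: Callable[[], Dict],
    delays: Sequence[float] = (0.1,),
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Run call(); retry transient failures once per entry in `delays`."""
    attempt = 0
    while True:
        try:
            return {"status": "completed", "result": call()}
        except ToolHTTPError as exc:
            permanent = exc.status in PERMANENT_STATUS
            if permanent or attempt >= len(delays):
                # Reported as a failed observation rather than raised.
                return {
                    "status": "failed",
                    "error": str(exc),
                    "http_status": exc.status,
                }
            sleep(delays[attempt])
            attempt += 1
```

Injecting `sleep` keeps the wrapper testable without real delays.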
Step 5: Observation
The tool result (success or failure) is fed back into the LLM's conversation context:
[Tool Result] Tool '{tool_name}' returned: {json_result}
The LLM then reasons about the result and decides the next action: call another tool, produce a final answer, or request user input.
Step 6: Finalization
| Terminal Condition | Handling |
|---|---|
| Final answer | LLM produces text content with no tool call. Response Formatter builds ExecutionResponse with status: success. |
| Interaction | LLM calls ask_user. Loop pauses. Temporal workflow handles WebSocket delivery and waits for user response. |
| Approval required | Governance Checker returns APPROVAL_REQUIRED. Loop pauses. Temporal workflow manages approval gate. |
| Turn limit reached | Max turns exceeded. Response Formatter builds ExecutionResponse with partial results and status: max_turns_exceeded. |
| Token budget exceeded | Cumulative token usage exceeds budget. status: budget_exceeded. |
| Unrecoverable error | Error after retry exhausted. status: failed with error details. |
Full Loop Sequence
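The sequence can be sketched as a compact loop. This is an illustration under simplifying assumptions: the `llm` callable and `run_loop` function are hypothetical, the LLM response is a plain dict with either `tool_call` or `text`, and governance checks, token budgets, and model routing are left out.

```python
import json
from typing import Callable, Dict, List


def run_loop(
    llm: Callable[[List[Dict]], Dict],
    tools: Dict[str, Callable[..., Dict]],
    max_turns: int = 15,
) -> Dict:
    messages: List[Dict] = []
    steps: List[Dict] = []
    for turn in range(1, max_turns + 1):
        response = llm(messages)  # Reason
        call = response.get("tool_call")
        if call is None:
            # Text content with no tool call is the final answer.
            steps.append({"step_type": "final_answer", "turn": turn})
            return {"status": "success", "summary": response["text"], "steps": steps}
        if call["name"] == "ask_user":
            # Interaction pauses the loop; the caller resumes it later.
            return {
                "status": "awaiting_interaction",
                "interaction_request": call["arguments"],
                "steps": steps,
            }
        result = tools[call["name"]](**call["arguments"])  # Act
        steps.append({"step_type": "tool_call", "tool_name": call["name"], "turn": turn})
        messages.append({  # Observe
            "role": "tool",
            "content": f"[Tool Result] Tool '{call['name']}' returned: {json.dumps(result)}",
        })
    return {"status": "max_turns_exceeded", "steps": steps}
```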
Multi-LLM Routing
Model Categories
RiverCore organizes all models into four capability tiers:
| Tier | Selection Criteria | Example Models | Typical Latency |
|---|---|---|---|
| Fast | Intent classification, catalog search, simple formatting, status retrieval | GPT-4o-mini, Gemini Flash Lite, Claude Haiku 4.5, DeepSeek-V3 | 0.5-2s |
| Balanced | Governance checks, simple query generation, result explanation, data source management | GPT-4o, Gemini Flash, Claude Sonnet 4.6, DeepSeek-V3 | 1-4s |
| Reasoning | Complex multi-table joins, federated queries, error recovery, ambiguous prompts | o3, Gemini Pro, Claude Opus 4.7, DeepSeek-R1 | 3-15s |
| Coding | SQL/NoSQL generation, dialect-specific optimization, data transformation | GPT-4o, Claude Sonnet 4.6, DeepSeek-V3 | 1-5s |
Intent-to-Tier Mapping
After the initial classify_intent turn resolves the intent, subsequent turns use the intent-based tier:
| Intent | Tier | Max Turns | Expected Duration |
|---|---|---|---|
| PLATFORM_EXPLANATION | Fast | 3 | 2-4s |
| DATA_DISCOVERY | Fast | 4 | 2-4s |
| DATA_QUERY | Coding | 6 | 4-8s |
| DATA_SOURCE_MANAGEMENT | Balanced | 5 | 5-10s |
| ACTION_EXECUTION | Coding | 6 | 5-12s |
| GOVERNANCE_MANAGEMENT | Balanced | 4 | 4-8s |
| WORKSPACE_ADMIN | Fast | 5 | 2-8s |
| STORAGE_MANAGEMENT | Fast | 3 | 2-5s |
| OBSERVABILITY | Balanced | 4 | 4-8s |
Model Selection
Per-turn priority order:
| Priority | Condition | Tier Selected |
|---|---|---|
| 1 | No intent classified yet (first turn) | Fast |
| 2 | Previous turn failed with error | Reasoning (escalation for error recovery) |
| 3 | Query mode active | Coding |
| 4 | Intent classified | Intent-based mapping (see table above) |
| 5 | Default fallback | Balanced |
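The priority order reads naturally as a chain of early returns. A minimal sketch, assuming a hypothetical `select_tier` function and boolean flags for the error and query-mode conditions:

```python
from typing import Optional

# Intent-to-tier mapping copied from the table in this section.
INTENT_TIER = {
    "PLATFORM_EXPLANATION": "fast",
    "DATA_DISCOVERY": "fast",
    "DATA_QUERY": "coding",
    "DATA_SOURCE_MANAGEMENT": "balanced",
    "ACTION_EXECUTION": "coding",
    "GOVERNANCE_MANAGEMENT": "balanced",
    "WORKSPACE_ADMIN": "fast",
    "STORAGE_MANAGEMENT": "fast",
    "OBSERVABILITY": "balanced",
}


def select_tier(
    intent: Optional[str], previous_turn_failed: bool, query_mode: bool
) -> str:
    if intent is None:          # Priority 1: first turn, nothing classified yet
        return "fast"
    if previous_turn_failed:    # Priority 2: escalate for error recovery
        return "reasoning"
    if query_mode:              # Priority 3: query mode active
        return "coding"
    # Priority 4 with priority 5 as the fallback for unmapped intents.
    return INTENT_TIER.get(intent, "balanced")
```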
Provider Configuration
| Setting | Description | Example |
|---|---|---|
| provider_name | Provider identifier | gemini, openai, anthropic, deepseek |
| api_key | Authentication credential | (from secrets management) |
| base_url | API endpoint | https://generativelanguage.googleapis.com/v1beta |
| models | Map of tier to model ID | {"fast": "gemini-2.0-flash-lite", "balanced": "gemini-2.0-flash"} |
| priority | Provider selection priority (lower = preferred) | 1 |
| max_retries | Retries per request | 2 |
| timeout_seconds | Per-request timeout | 30 |
| rate_limit_rpm | Requests per minute limit | 60 |
| enabled | Whether this provider is active | true |
RiverCore tracks per-provider latency (P50, P90, P99) and error rate over rolling 5-minute windows. If error rate exceeds 30% or P90 latency exceeds 20 seconds, the provider is marked as degraded and deprioritized in selection. If all providers in a tier are degraded, RiverCore selects the least-degraded option.
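A minimal sketch of the degradation rule and fallback selection. The `ProviderHealth` and `pick_provider` names are hypothetical, and ranking "least degraded" by error rate and then P90 latency is an assumption, since the ranking is not defined here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProviderHealth:
    name: str
    priority: int         # lower = preferred
    error_rate: float     # fraction over the rolling 5-minute window
    p90_latency_s: float  # seconds over the same window

    @property
    def degraded(self) -> bool:
        return self.error_rate > 0.30 or self.p90_latency_s > 20.0


def pick_provider(providers: List[ProviderHealth]) -> ProviderHealth:
    healthy = [p for p in providers if not p.degraded]
    if healthy:
        return min(healthy, key=lambda p: p.priority)
    # All degraded. Assumption: rank by error rate, then P90 latency.
    return min(providers, key=lambda p: (p.error_rate, p.p90_latency_s))
```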
Token Budget Management
| Threshold | Behavior |
|---|---|
| 80% of budget consumed | Turn limit is reduced to force finalization |
| 100% of budget exceeded | Execution terminates with status: budget_exceeded |
| Default budget | 100,000 tokens per execution |
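The two thresholds can be checked after every turn. In this sketch the `apply_budget` function is hypothetical, and reducing the turn limit to one further turn at 80% is an assumption; the document states only that the limit is reduced.

```python
from typing import Tuple


def apply_budget(
    tokens_used: int, token_budget: int, current_turn: int, max_turns: int
) -> Tuple[str, int]:
    """Return (state, effective_max_turns) after a turn completes."""
    if tokens_used > token_budget:
        # Maps to status: budget_exceeded in the response.
        return "budget_exceeded", current_turn
    if tokens_used * 10 >= token_budget * 8:  # integer form of the 80% check
        # Assumption: allow one more turn so the agent can finalize.
        return "finalizing", min(max_turns, current_turn + 1)
    return "ok", max_turns
```

With the default budget of 100,000 tokens, the finalization threshold is reached at 80,000 tokens.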
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/execute | Run a full agent execution from initial context |
| POST | /api/v1/execute/continue | Resume an execution paused for interaction or approval |
| POST | /api/v1/generate-agent | Generate agent configuration from a natural language description |
| GET | /health | Liveness check |
| GET | /health/ready | Readiness check (verifies TLO Gateway and LLM provider reachability) |
Input Contract
The complete request body for POST /api/v1/execute.
Top-Level Fields
| Field | Type | Required | Description |
|---|---|---|---|
| execution_id | int | Yes | Unique identifier for this execution (from agent_executions table) |
| agent_config | object | Yes | Agent configuration snapshot; see below |
| user_context | object | Yes | Sanitized user identity and permissions |
| input_prompt | string | No | Natural language input (for manual triggers; null for automated triggers) |
| trigger_context | object | Yes | Trigger type, source, and payload |
| data_source_metadata | array | Yes | Connected data sources with schema metadata |
| conversation_history | array | Yes | Long-term memory from previous executions of this agent |
agent_config Object
| Field | Type | Required | Description |
|---|---|---|---|
| agent_id | uuid | Yes | The agent's unique identifier |
| name | string | Yes | Human-readable agent name |
| business_function | enum | Yes | customer_support, sales, finance, risk_compliance, data_analyst, operations, executive, custom |
| domain | string | Yes | Business domain tag |
| goal | string | Yes | High-level objective (natural language) |
| instructions | string | Yes | Detailed behavioral instructions (the instruction_set) |
| system_prompt | string | No | Optional custom system prompt override |
| action_level | enum | Yes | read_only, recommend, act_with_approval, automated |
| governance_level | enum | Yes | standard, strict, custom |
| model_config | object | Yes | Model selection parameters; see below |
| tools | array | Yes | Tool names this agent is allowed to invoke |
| governance_policies | array | Yes | Bound policy conditions and enforcement rules |
| approval_rules | object | Yes | Approval gate configuration |
| notification_config | json | No | Alert channels and trigger conditions |
model_config Object
| Field | Type | Required | Description |
|---|---|---|---|
| preferred_tier | enum | No | Override default tier selection |
| max_turns | int | Yes | Turn limit for this execution (default: 15) |
| token_budget | int | Yes | Maximum total tokens (default: 100,000) |
| timeout_seconds | int | Yes | Per-turn timeout (default: 30) |
approval_rules Object
| Field | Type | Required | Description |
|---|---|---|---|
| require_approval_for | array | Yes | Tool names that require human approval |
| auto_approve_conditions | array | No | Per-tool conditions for automatic approval |
| approver_roles | array | Yes | Roles authorized to approve (e.g., ["admin", "editor"]) |
| escalation_timeout_minutes | int | Yes | Time before approval request expires (default: 1440) |
user_context Object
| Field | Type | Required | Description |
|---|---|---|---|
| user_id | int | Yes | Numeric user ID (from JWT, sanitized) |
| org_id | int | Yes | Organization ID (tenant isolation) |
| workspace_id | int | Yes | Workspace scope |
| roles | array | Yes | User's roles (e.g., ["org_editor", "ws_analyst"]) |
| permissions | array | Yes | User's permissions (e.g., ["data_source:view", "data_source:query"]) |
| attributes | json | No | Additional user attributes (e.g., {"assigned_region": "West"}) |
trigger_context Object
| Field | Type | Required | Description |
|---|---|---|---|
| trigger_type | enum | Yes | manual, scheduled, event, api, threshold, workflow |
| trigger_source | string | Yes | Origin identifier (e.g., "cron:daily-8am", "webhook:zendesk") |
| trigger_payload | json | No | Data from the trigger event |
| triggered_by | uuid | No | User ID (manual/API triggers) or system ID (automated) |
| triggered_at | timestamp | Yes | Trigger timestamp (ISO 8601) |
conversation_history Entry
Each entry in conversation_history represents long-term memory from a prior execution of this agent:
{
  "execution_id": 1024,
  "completed_at": "2026-04-22T09:14:00Z",
  "summary": "Processed 42 overdue tickets. Escalated 3 to Tier 2.",
  "learnings": [
    "High-priority tickets from accounts over $10k should be escalated immediately.",
    "Draft response template 'refund_policy_v2' works best for billing disputes."
  ],
  "flagged_items": [
    "Account ACCT-8821 has unusual refund request frequency."
  ]
}
Full ExecutionRequest Example
{
  "execution_id": 9871,
  "agent_config": {
    "agent_id": "a7f3b2d4-1e5c-4f8a-9b6d-0c2e7f3a1d8b",
    "name": "L1 Support Specialist",
    "business_function": "customer_support",
    "domain": "Customer Success",
    "goal": "Resolve Tier 1 support tickets automatically using internal knowledge base and CRM data.",
    "instructions": "You are a support agent. Check ticket priority first. For billing disputes, verify the charge in Stripe before responding. Always draft a response before sending.",
    "action_level": "act_with_approval",
    "governance_level": "standard",
    "model_config": {
      "max_turns": 15,
      "token_budget": 100000,
      "timeout_seconds": 30
    },
    "tools": [
      "classify_intent", "check_governance", "generate_query",
      "search_catalog", "recommend_visualization", "explain_results",
      "execute_query", "write_back", "ask_user"
    ],
    "governance_policies": [
      {
        "policy_id": "f9a2c4d6-8b1e-4f7a-a3c5-2d4e6f8b1a3c",
        "name": "PII Export Limit",
        "scope": "workspace",
        "conditions": {"rows_gt": 10000, "table_classification": "pii"},
        "enforcement": "require_approval"
      }
    ],
    "approval_rules": {
      "require_approval_for": ["write_back"],
      "auto_approve_conditions": [
        {"tool": "write_back", "condition": "confidence_score > 0.95"}
      ],
      "approver_roles": ["admin", "editor"],
      "escalation_timeout_minutes": 1440
    }
  },
  "user_context": {
    "user_id": 4421,
    "org_id": 12,
    "workspace_id": 37,
    "roles": ["org_editor", "ws_analyst"],
    "permissions": ["data_source:view", "data_source:query", "agent:execute"]
  },
  "input_prompt": "Process all open high-priority tickets in Zendesk created in the last 24 hours.",
  "trigger_context": {
    "trigger_type": "manual",
    "trigger_source": "ui",
    "triggered_by": "4421",
    "triggered_at": "2026-05-10T08:00:00Z"
  },
  "data_source_metadata": [
    {
      "data_source_id": 14,
      "name": "Zendesk Production",
      "type": "zendesk",
      "status": "connected",
      "access_level": "read_write",
      "schemas": [
        {
          "table_name": "tickets",
          "columns": [
            {"column_name": "id", "data_type": "integer", "is_nullable": false, "description": "Ticket ID"},
            {"column_name": "priority", "data_type": "string", "is_nullable": false, "description": "low, normal, high, urgent"},
            {"column_name": "status", "data_type": "string", "is_nullable": false, "description": "open, pending, solved, closed"},
            {"column_name": "created_at", "data_type": "timestamp", "is_nullable": false, "description": "Ticket creation timestamp"}
          ]
        }
      ]
    }
  ],
  "conversation_history": [
    {
      "execution_id": 9840,
      "completed_at": "2026-05-09T08:04:22Z",
      "summary": "Processed 18 tickets. 14 resolved autonomously. 4 escalated.",
      "learnings": ["Tickets with tag 'billing' require Stripe charge verification before responding."],
      "flagged_items": []
    }
  ]
}
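A request body like the one above can be checked against the required fields and enum values listed in the tables. The service itself uses Pydantic models; this sketch uses only the standard library to stay dependency-free, and the `validate_request` name is hypothetical.

```python
from typing import Any, Dict, List

REQUIRED_TOP_LEVEL = (
    "execution_id", "agent_config", "user_context",
    "trigger_context", "data_source_metadata", "conversation_history",
)
ACTION_LEVELS = {"read_only", "recommend", "act_with_approval", "automated"}
TRIGGER_TYPES = {"manual", "scheduled", "event", "api", "threshold", "workflow"}


def validate_request(body: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the body passed."""
    errors = [
        f"missing field: {name}"
        for name in REQUIRED_TOP_LEVEL
        if name not in body
    ]
    agent = body.get("agent_config", {})
    if agent.get("action_level") not in ACTION_LEVELS:
        errors.append("agent_config.action_level is not a known value")
    trigger = body.get("trigger_context", {})
    if trigger.get("trigger_type") not in TRIGGER_TYPES:
        errors.append("trigger_context.trigger_type is not a known value")
    return errors
```

Only a subset of fields is checked; `input_prompt` is optional and is therefore not required here.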
Output Contract
The complete response schema for POST /api/v1/execute.
Top-Level Fields
| Field | Type | Always Present | Description |
|---|---|---|---|
| execution_id | int | Yes | Correlation ID matching the request |
| status | enum | Yes | success, failed, awaiting_approval, awaiting_interaction, max_turns_exceeded, budget_exceeded, timeout |
| result | object | Yes | Execution outcome (summary, actions taken, recommendations, artifacts) |
| steps | array | Yes | Complete step-by-step trace of the execution |
| approval_request | object | If status: awaiting_approval | Approval payload for the Temporal workflow |
| interaction_request | object | If status: awaiting_interaction | Interaction payload for WebSocket delivery |
| usage | object | Yes | Token usage, cost estimate, model breakdown |
| error | object | If status: failed | Error code, message, recoverability |
result Object
| Field | Type | Description |
|---|---|---|
| summary | string | Natural language summary of what the agent accomplished |
| actions_taken | array | Tool calls executed with their results (tool_name, arguments, result_summary, status) |
| recommendations | array | Suggested follow-up actions |
| output_artifacts | array | Generated reports, query results, or exported data references |
steps Entry
| Field | Type | Present When | Description |
|---|---|---|---|
| step_number | int | Always | Step index |
| step_type | enum | Always | reasoning, tool_call, observation, interaction, governance_check, error, final_answer |
| tool_name | string | Tool call steps | Tool name dispatched |
| tool_category | string | Tool call steps | reasoning, execution, or interaction |
| input | json | Tool call steps | Tool arguments or LLM input summary |
| output | json | Tool call steps | Tool result or LLM output summary |
| model_used | string | Reasoning steps | LLM model ID (e.g., "claude-sonnet-4-6") |
| model_tier | string | Reasoning steps | fast, balanced, reasoning, coding |
| provider | string | Reasoning steps | LLM provider name |
| tokens | object | Reasoning steps | {"input": int, "output": int} |
| duration_ms | int | Always | Step duration in milliseconds |
| status | string | Always | completed, failed, blocked, pending |
| governance_decision | string | Governance check steps | PROCEED, APPROVAL_REQUIRED, BLOCKED, SUGGEST_ONLY |
| error | string | Error steps | Error message |
approval_request Object
Present only when status: awaiting_approval.
{
  "tool_name": "write_back",
  "proposed_payload": {
    "data_source_id": 14,
    "table_name": "tickets",
    "operation": "update",
    "data": {"status": "solved", "comment": "Resolved via automated policy lookup."},
    "conditions": {"id": 98821}
  },
  "reasoning_summary": "Ticket #98821 matches the standard billing dispute resolution policy. The charge was verified in Stripe at $49.99, within the 30-day refund window.",
  "risk_context": "Write operation on tickets table. Requires workspace editor approval per governance policy.",
  "confidence_score": 0.94,
  "auto_approve_eligible": false
}
interaction_request Object
Present only when status: awaiting_interaction.
{
  "interaction_type": "clarification_request",
  "message": "I found 3 high-priority tickets matching your criteria. Should I process all 3 autonomously, or would you like to review each response before I send it?",
  "options": ["Process all autonomously", "Review each response first"],
  "required": true
}
usage Object
| Field | Type | Description |
|---|---|---|
| total_turns | int | Total reasoning turns executed |
| total_tokens | int | Total tokens consumed across all turns |
| cost_estimate | float | Estimated cost in USD |
| models_used | object | Per-model breakdown: provider, tier, input_tokens, output_tokens, turns, estimated_cost |
| execution_duration_ms | int | Total execution time in milliseconds |
error Object
Present only when status: failed.
| Field | Type | Description |
|---|---|---|
| code | string | AGENT_ERROR, TIMEOUT, INVALID_TOOL, VALIDATION_ERROR, LLM_ERROR, GOVERNANCE_BLOCKED, TURN_LIMIT_EXCEEDED, BUDGET_EXCEEDED, PROVIDER_UNAVAILABLE |
| message | string | Human-readable error message |
| recoverable | bool | Whether the error is transient and the execution can be retried |
| details | json | Additional error context |
Continuation Contract
Sent to POST /api/v1/execute/continue when the Temporal workflow resumes after an interaction or approval resolution.
| Field | Type | Required | Description |
|---|---|---|---|
| execution_id | int | Yes | Execution to resume |
| continuation_type | enum | Yes | interaction_response, approval_resolved |
| interaction_response | object | If continuation_type: interaction_response | {"user_response": json} |
| approval_resolution | object | If continuation_type: approval_resolved | See below |
| serialized_state | json | Yes | Full AgentContext state serialized when the loop paused (conversation history, turn results, tool results, detected intent, governance evaluation, timing metadata) |
approval_resolution Object
| Field | Type | Required | Description |
|---|---|---|---|
| status | enum | Yes | approved, rejected, edited_approved |
| modified_args | json | If edited_approved | Modified tool arguments |
| resolved_by | uuid | Yes | ID of the approver |
| resolution_comment | string | No | Approver comment |
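The conditional requirements in the two tables above can be expressed as a small validator. This is a standard-library sketch with a hypothetical `validate_continuation` function; the error strings are illustrative.

```python
from typing import Any, Dict, List

RESOLUTION_STATUSES = {"approved", "rejected", "edited_approved"}


def validate_continuation(body: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the body passed."""
    errors = [
        f"missing field: {name}"
        for name in ("execution_id", "continuation_type", "serialized_state")
        if name not in body
    ]
    kind = body.get("continuation_type")
    if kind == "interaction_response":
        if "interaction_response" not in body:
            errors.append("interaction_response is required for this continuation_type")
    elif kind == "approval_resolved":
        resolution = body.get("approval_resolution")
        if resolution is None:
            errors.append("approval_resolution is required for this continuation_type")
        else:
            status = resolution.get("status")
            if status not in RESOLUTION_STATUSES:
                errors.append("approval_resolution.status is not a known value")
            if status == "edited_approved" and "modified_args" not in resolution:
                errors.append("modified_args is required when status is edited_approved")
            if "resolved_by" not in resolution:
                errors.append("approval_resolution.resolved_by is required")
    elif kind is not None:
        errors.append("continuation_type is not a known value")
    return errors
```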
AI Generation Contract
POST /api/v1/generate-agent Request
Generates a complete agent configuration from a natural language description.
| Field | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Natural language description of the desired agent |
| user_context | object | Yes | user_id, org_id, workspace_id, roles, permissions |
| available_data_sources | array | Yes | Data sources the workspace has access to |
| available_tools | array | Yes | All tools the user's workspace can use |
| workspace_templates | array | No | Available agent templates for reference |
{
  "prompt": "Create an agent that monitors Zendesk tickets and drafts responses for high-priority issues",
  "user_context": {
    "user_id": 4421,
    "org_id": 12,
    "workspace_id": 37,
    "roles": ["org_editor"],
    "permissions": ["agent:create", "data_source:view"]
  },
  "available_data_sources": [
    {"data_source_id": 14, "name": "Zendesk Production", "type": "zendesk", "description": "Primary customer support desk"}
  ],
  "available_tools": ["classify_intent", "execute_query", "write_back", "ask_user"],
  "workspace_templates": []
}
POST /api/v1/generate-agent Response
| Field | Type | Description |
|---|---|---|
| agent_config | object | Complete suggested agent configuration (name, description, category, goal, instructions, action_level, suggested triggers, data sources, tools, policies) |
| reasoning | string | Full chain-of-thought explaining the AI's configuration choices |
| confidence_score | float | 0.0 to 1.0 confidence in the generated configuration |
| warnings | array | Concerns about the requested configuration |
{
  "agent_config": {
    "name": "Zendesk Tier 1 Support Agent",
    "description": "Monitors incoming Zendesk tickets and drafts structured responses for high-priority issues using CRM and knowledge base data.",
    "category": "customer_support",
    "goal": "Reduce first-response time on high-priority Zendesk tickets by automatically drafting and routing responses.",
    "instructions": "Check ticket priority on each run. For tickets with priority 'high' or 'urgent' created in the last 24 hours, query the CRM for account details and draft a response using the approved response templates. Always request approval before sending.",
    "action_level": "act_with_approval",
    "suggested_triggers": [
      {
        "trigger_type": "scheduled",
        "trigger_config": {"cron": "*/15 * * * *"},
        "reasoning": "15-minute polling matches typical SLA requirements for high-priority tickets."
      }
    ],
    "suggested_data_sources": [
      {
        "data_source_id": 14,
        "name": "Zendesk Production",
        "access_level": "read_write",
        "reasoning": "Required to read ticket data and post responses."
      }
    ],
    "suggested_tools": [
      {"tool_name": "execute_query", "reasoning": "Retrieve ticket and CRM data."},
      {"tool_name": "write_back", "reasoning": "Post drafted responses to Zendesk."}
    ],
    "suggested_policies": [
      {
        "name": "Require Approval Before Sending",
        "conditions": {"tool": "write_back"},
        "enforcement": "require_approval",
        "reasoning": "Customer-facing messages should be reviewed before delivery."
      }
    ]
  },
  "reasoning": "The user wants a reactive agent triggered on ticket creation. Act with Approval is appropriate because responses are customer-facing and carry reputational risk. A 15-minute schedule balances responsiveness with processing cost.",
  "confidence_score": 0.91,
  "warnings": [
    "Ensure the Zendesk Production data source has write access enabled before deployment."
  ]
}
Tools
The 17 tools available to River Agents divide into three categories. Reasoning tools execute locally within the river-agent process. Execution tools route through TLO Gateway for ACL validation. The interaction tool returns a payload to the Temporal workflow for WebSocket delivery.
Reasoning Tools
These tools execute inside the river-agent process and do not require TLO ACL validation.
classify_intent
| Property | Value |
|---|---|
| Category | Reasoning |
| Purpose | Classify the natural language prompt into a structured intent type |
| Target | Local (LLM call via RiverCore, Fast tier) |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|---|---|---|
| user_prompt | string | Yes | The user's natural language input |
| available_intents | array | No | Intent types to consider (defaults to all active intents) |
Output:
| Field | Type | Description |
|---|---|---|
| intent_type | string | Classified intent (e.g., DATA_QUERY, DATA_SOURCE_MANAGEMENT) |
| confidence | float | Classification confidence (0.0 to 1.0) |
| sub_intents | array | Decomposed sub-intents for compound prompts |
| entities | json | Extracted entities (table names, column names, dates, values) |
check_governance
| Property | Value |
|---|---|
| Category | Reasoning |
| Purpose | Evaluate RBAC, RLS, masking, budget, and action-level permissions |
| Target | Local (policy evaluation) |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|---|---|---|
| intent_type | string | Yes | The classified intent |
| data_source_ids | array | No | Data sources to check access for |
| requested_operation | string | Yes | Operation type (e.g., "query", "write", "delete") |
Output:
| Field | Type | Description |
|---|
| allowed | bool | Whether the operation is permitted |
| reason | string | Explanation of the decision |
| constraints | json | Applied constraints (RLS filters, masking rules, row limits) |
| warnings | array | Non-blocking warnings |
generate_query
| Property | Value |
|---|
| Category | Reasoning |
| Purpose | Generate SQL/NoSQL query from natural language with governance filters applied |
| Target | Local (LLM call via RiverCore, Coding tier) |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| user_prompt | string | Yes | The user's natural language query |
| intent_type | string | Yes | Classified intent |
| data_sources | array | Yes | Data source metadata with schemas |
| table_schemas | array | No | Specific table schemas to use |
| governance | json | No | Governance constraints to embed in the query |
Output:
| Field | Type | Description |
|---|
| query | string | The generated SQL/NoSQL query |
| dialect | string | Query dialect (e.g., "postgresql", "mongodb") |
| explanation | string | Explanation of the query logic |
| operations | array | Structured operation plan (for multi-step queries) |
| plan_type | string | "query", "mutation", or "federated" |
| data_source_ids | array | Data sources referenced |
search_catalog
| Property | Value |
|---|
| Category | Reasoning |
| Purpose | Find relevant tables and columns via vector similarity search |
| Target | Qdrant (via internal client) |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| query | string | Yes | Natural language search query |
| data_source_ids | array | No | Limit search to specific data sources |
| top_k | int | No | Number of results to return (default: 10) |
Output:
| Field | Type | Description |
|---|
| matches | array | Matching catalog entries with similarity scores |
| total_results | int | Total number of matches |
recommend_visualization
| Property | Value |
|---|
| Category | Reasoning |
| Purpose | Suggest charts and display formats for query results |
| Target | Local (LLM call via RiverCore, Fast tier) |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| columns | array | Yes | Result column names |
| column_types | array | Yes | Column data types |
| row_count | int | Yes | Number of result rows |
| user_prompt | string | No | Original query for context |
Output:
| Field | Type | Description |
|---|
| recommended_type | string | Chart type (e.g., "bar", "line", "pie", "table") |
| x_axis | string | Suggested X axis column |
| y_axis | string | Suggested Y axis column |
| title | string | Suggested chart title |
| config | json | Additional visualization configuration |
explain_results
| Property | Value |
|---|
| Category | Reasoning |
| Purpose | Score confidence and provide human-readable explanation of results |
| Target | Local (LLM call via RiverCore, Balanced tier) |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| user_prompt | string | Yes | Original user query |
| query_result_columns | array | Yes | Column names from the result |
| query_result_sample | array | Yes | Sample rows from the result |
| generated_query | string | No | The SQL/NoSQL query that produced the result |
Output:
| Field | Type | Description |
|---|
| explanation | string | Natural language explanation of the results |
| confidence | float | Confidence score (0.0 to 1.0) |
| suggestions | array | Follow-up query suggestions |
| data_quality_notes | array | Data quality observations |
Execution tools route through TLO Gateway for ACL validation. For each call, the river-agent service constructs the HTTP request with the required auth headers and receives the downstream service's response.
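As a rough illustration of that request construction, the sketch below builds (but does not send) an HTTP request using only the standard library. Several details are assumptions rather than documented behavior: that the gateway exposes each tool's target endpoint under the same path, that the governance token travels as a bearer token in the `Authorization` header, and the helper name `build_gateway_request` itself.

```python
import json
import urllib.request
from typing import Any, Optional


def build_gateway_request(
    gateway_url: str,
    method: str,
    endpoint: str,
    governance_token: str,
    body: Optional[dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build a request for an execution tool call routed via TLO Gateway.

    Assumptions (not confirmed by this reference): the path layout and the
    use of a bearer Authorization header for the governance token.
    """
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        url=gateway_url.rstrip("/") + endpoint,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {governance_token}",
            "Content-Type": "application/json",
        },
    )


# Example: an execute_query call. Sending it would use the per-tool timeout,
# e.g. urllib.request.urlopen(request, timeout=30).
request = build_gateway_request(
    "http://tlo-gateway:8001",  # hypothetical hostname
    "POST",
    "/api/v1/query/execute",
    "example-token",
    {"data_source_id": 1, "query": "SELECT 1", "max_rows": 1000},
)
```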
create_data_source
| Property | Value |
|---|
| Category | Execution |
| Required Permission | data_source:create |
| Target Service | Backend :8005 |
| Target Endpoint | POST /api/v1/data-sources |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| name | string | Yes | Data source display name |
| type | string | Yes | Connector type (e.g., "postgresql", "mongodb") |
| connection_config | json | Yes | Connection parameters (host, port, database) |
| credentials | json | Yes | Authentication credentials |
| description | string | No | Human-readable description |
Output:
| Field | Type | Description |
|---|
| success | bool | Whether the operation succeeded |
| data_source_id | int | ID of the created data source |
| message | string | Status message |
update_data_source
| Property | Value |
|---|
| Category | Execution |
| Required Permission | data_source:update |
| Target Service | Backend :8005 |
| Target Endpoint | PATCH /api/v1/data-sources/\{id\} |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| data_source_id | int | Yes | ID of the data source to update |
| updates | json | Yes | Fields to update |
Output: {"success": bool, "message": string}
delete_data_source
| Property | Value |
|---|
| Category | Execution |
| Required Permission | data_source:delete |
| Target Service | Backend :8005 |
| Target Endpoint | DELETE /api/v1/data-sources/\{id\} |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| data_source_id | int | Yes | ID of the data source to delete |
Output: {"success": bool, "message": string}
test_connection
| Property | Value |
|---|
| Category | Execution |
| Required Permission | data_source:view |
| Target Service | Backend :8005 |
| Target Endpoint | POST /api/v1/data-sources/\{id\}/test |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| data_source_id | int | Yes | ID of the data source to test |
Output:
| Field | Type | Description |
|---|
| success | bool | Whether the connection succeeded |
| latency_ms | int | Connection latency in milliseconds |
| message | string | Status message |
discover_schema
| Property | Value |
|---|
| Category | Execution |
| Required Permission | data_source:view |
| Target Service | Backend :8005 |
| Target Endpoint | POST /api/v1/data-sources/\{id\}/discover |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| data_source_id | int | Yes | ID of the data source |
Output:
| Field | Type | Description |
|---|
| success | bool | Whether discovery succeeded |
| tables | array | Discovered tables with column metadata |
| total_tables | int | Total number of tables discovered |
execute_query
| Property | Value |
|---|
| Category | Execution |
| Required Permission | data_source:query |
| Target Service | Data Orchestration :8002 |
| Target Endpoint | POST /api/v1/query/execute |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| data_source_id | int | Yes | Target data source |
| query | string | Yes | SQL/NoSQL query to execute |
| max_rows | int | No | Maximum rows to return (default: 1000) |
| parameters | json | No | Query parameters for parameterized queries |
Output:
| Field | Type | Description |
|---|
| columns | array | Column names |
| rows | array | Result rows |
| total_rows | int | Total rows returned |
| execution_time_ms | int | Query execution time in milliseconds |
apply_governance_policy
| Property | Value |
|---|
| Category | Execution |
| Required Permission | policy:create |
| Target Service | Backend :8005 |
| Target Endpoint | POST /api/v1/policies |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| name | string | Yes | Policy name |
| description | string | No | Policy description |
| scope | string | Yes | "organization" or "workspace" |
| conditions | json | Yes | Policy condition expressions |
| enforcement | string | Yes | "block", "require_approval", "warn", or "log_only" |
Output:
| Field | Type | Description |
|---|
| success | bool | Whether the policy was created |
| policy_id | uuid | ID of the created policy |
| message | string | Status message |
write_back
| Property | Value |
|---|
| Category | Execution |
| Required Permission | data_source:update + confirmation |
| Target Service | Data Orchestration :8002 |
| Target Endpoint | POST /api/v1/data/write-back |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| data_source_id | int | Yes | Target data source |
| table_name | string | Yes | Target table |
| operation | string | Yes | "insert", "update", or "delete" |
| data | json | Yes | Data to write |
| conditions | json | No | WHERE conditions (for update/delete) |
Output:
| Field | Type | Description |
|---|
| success | bool | Whether the write succeeded |
| rows_affected | int | Number of rows modified |
| message | string | Status message |
get_workspace_info
| Property | Value |
|---|
| Category | Execution |
| Required Permission | workspace:view |
| Target Service | Backend :8005 |
| Target Endpoint | GET /api/v1/workspaces/\{id\} |
| Timeout | 30s |
Input:
| Field | Type | Required | Description |
|---|
| workspace_id | int | No | Workspace ID (defaults to current workspace from user context) |
Output:
| Field | Type | Description |
|---|
| name | string | Workspace name |
| description | string | Workspace description |
| status | string | Workspace status |
| total_members | int | Number of members |
| settings | json | Workspace configuration |
get_storage_info
| Property | Value |
|---|
| Category | Execution |
| Required Permission | storage:view |
| Target Service | Storage Service :8003 |
| Target Endpoint | GET /api/v1/storage/usage |
| Timeout | 30s |
Input: Uses workspace context from user_context. No additional input fields.
Output:
| Field | Type | Description |
|---|
| total_size_bytes | int | Total storage used |
| quota_bytes | int | Storage quota |
| usage_percentage | float | Usage as percentage |
| file_count | int | Number of stored files |
| remaining_bytes | int | Available storage |
ask_user
| Property | Value |
|---|
| Category | Interaction |
| Required Permission | None |
| Target | WebSocket via TLO Gateway (delivered by Temporal workflow) |
| Timeout | N/A (waits for user response) |
Input:
| Field | Type | Required | Description |
|---|
| interaction_type | string | Yes | One of "clarification_request", "confirmation_request", "parameter_request", or "credential_request" |
| message | string | Yes | The question or prompt for the user |
| options | array | No | Selectable options (for multiple-choice) |
| required | bool | No | Whether a response is mandatory (default: true) |
Behavior: ask_user is intercepted by the Tool Executor before dispatch. Calling it causes the service to return status: awaiting_interaction immediately, with the full interaction payload embedded in the output. The Temporal workflow handles WebSocket delivery and waits for user response before re-invoking via POST /api/v1/execute/continue.
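The interception step can be pictured as a check that runs before any dispatch. This is a minimal sketch under stated assumptions: the function name `intercept_interaction` and the exact shape of the returned dictionary (an `interaction` key inside `output`) are hypothetical. Only the `awaiting_interaction` status, the input fields, and the default for `required` come from the description above.

```python
from typing import Any, Optional

INTERACTION_TYPES = {
    "clarification_request",
    "confirmation_request",
    "parameter_request",
    "credential_request",
}


def intercept_interaction(
    tool_name: str, arguments: dict[str, Any]
) -> Optional[dict[str, Any]]:
    """Return an early-exit response for ask_user, or None for other tools.

    None means "not intercepted": the caller dispatches the tool normally.
    """
    if tool_name != "ask_user":
        return None
    if arguments.get("interaction_type") not in INTERACTION_TYPES:
        raise ValueError("unknown interaction_type")
    interaction = {
        "interaction_type": arguments["interaction_type"],
        "message": arguments["message"],
        "options": arguments.get("options", []),
        "required": arguments.get("required", True),  # documented default
    }
    # Hypothetical response layout; the loop stops here and the Temporal
    # workflow later re-invokes via POST /api/v1/execute/continue.
    return {"status": "awaiting_interaction", "output": {"interaction": interaction}}
```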
Error Handling and Recovery
Error Categories
| Error Type | Retryable | river-agent Response |
|---|
| Tool timeout (exceeds 30s) | Yes (max 2 retries) | Observation with status: timeout |
| Transient failure (503, network) | Yes (max 2 retries, exponential backoff) | Observation with error details |
| ACL denial (403 from TLO Gateway) | No | status: failed, code: GOVERNANCE_BLOCKED |
| Not found (404) | No | Observation with error message |
| Validation error (422) | No | Observation with field-level error details |
| LLM provider timeout | Yes (auto-failover to next provider) | Retry via RiverCore fallback chain |
| All providers unavailable | No | status: failed, code: PROVIDER_UNAVAILABLE |
| Prompt injection detected in trigger payload | No (sanitize and continue) | Log security event; inject sanitized payload; continue execution |
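The retry rows of this table can be sketched as a small wrapper around a tool call. The names `ToolHTTPError` and `call_with_retry`, the 0.5-second base delay, and the shape of the returned observation dictionaries are assumptions made for illustration. The retry limit of 2, the retryable 503 case, and the `GOVERNANCE_BLOCKED` code for a 403 are taken from the table.

```python
import time
from typing import Any, Callable

MAX_RETRIES = 2             # "max 2 retries" from the table
RETRYABLE_STATUS = {503}    # transient failures


class ToolHTTPError(Exception):
    """Hypothetical error carrying the HTTP status of a failed tool call."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"tool call failed with HTTP {status_code}")
        self.status_code = status_code


def call_with_retry(
    call: Callable[[], Any],
    sleep: Callable[[float], None] = time.sleep,
    base_delay: float = 0.5,  # assumed value, not from the reference
) -> dict[str, Any]:
    """Run a tool call, retrying timeouts and 503s with exponential backoff."""
    error: dict[str, Any] = {}
    for attempt in range(MAX_RETRIES + 1):
        try:
            return {"status": "success", "result": call()}
        except TimeoutError:
            error = {"status": "timeout"}
        except ToolHTTPError as exc:
            if exc.status_code == 403:
                # ACL denial is never retried.
                return {"status": "failed", "code": "GOVERNANCE_BLOCKED"}
            error = {"status": "error", "http_status": exc.status_code}
            if exc.status_code not in RETRYABLE_STATUS:
                return error  # 404, 422: surface as an observation
        if attempt < MAX_RETRIES:
            sleep(base_delay * 2 ** attempt)
    return error
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays.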
Loop Guard Rails
| Condition | Behavior |
|---|
| Single tool error | Feed error observation to LLM; LLM decides next action |
| 2+ consecutive errors (same tool type) | Escalate model to Reasoning tier for error recovery |
| 3+ consecutive errors (interaction tools) | Allow up to 3 retries -- LLM often needs multiple replanning attempts after guardrail rejections |
| Same tool called 3+ times with identical arguments | Inject system notice: "You are in a loop. Stop calling this tool and proceed to the next step or provide a final answer." |
| No tool calls for 3 consecutive turns | Inject: "You appear to be repeating yourself. Please take action or conclude." |
| Empty LLM response (0 completion tokens) | Treat as error; retry with model escalation to Reasoning tier |
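Two of these guard rails, the identical-call check and the idle-turn check, are easy to express over a per-turn history. The sketch below is hypothetical: the history format (one entry per turn, `None` when the turn made no tool call) and the function name are invented for illustration, and it counts identical calls across the whole execution because the table does not say whether they must be consecutive. The two notice strings are quoted from the table.

```python
import json
from typing import Any, Optional

LOOP_NOTICE = (
    "You are in a loop. Stop calling this tool and proceed to the next step "
    "or provide a final answer."
)
IDLE_NOTICE = "You appear to be repeating yourself. Please take action or conclude."


def _fingerprint(call: dict[str, Any]) -> tuple[str, str]:
    # Canonical JSON so that argument key order does not matter.
    return call["tool"], json.dumps(call["args"], sort_keys=True)


def guard_rail_notice(turns: list[Optional[dict[str, Any]]]) -> Optional[str]:
    """Return the notice to inject before the next turn, if any."""
    last_three = turns[-3:]
    if len(last_three) == 3 and all(t is None for t in last_three):
        return IDLE_NOTICE
    calls = [t for t in turns if t is not None]
    if calls:
        latest = _fingerprint(calls[-1])
        if sum(1 for c in calls if _fingerprint(c) == latest) >= 3:
            return LOOP_NOTICE
    return None
```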
Execution-Level Errors
| Condition | Behavior |
|---|
| Turn limit reached | Force finalization with partial results; status: max_turns_exceeded |
| Token budget exceeded | Force finalization with partial results; status: budget_exceeded |
| Execution timeout | Terminate execution; status: timeout |
| Governance policy violation | Block action, log violation, feed "action blocked" observation to LLM |
| Approval timeout | Execution remains awaiting_approval until Temporal workflow handles expiration |
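The first three rows amount to a limit check at the top of each turn. A minimal sketch, assuming a hypothetical helper name and an assumed precedence (timeout first, then turn limit, then token budget) that the table does not specify:

```python
from typing import Optional


def forced_termination_status(
    turns_used: int,
    max_turns: int,
    tokens_used: int,
    token_budget: int,
    elapsed_seconds: float,
    timeout_seconds: float,
) -> Optional[str]:
    """Return the terminal status if a limit was hit, else None.

    The check order is an assumption; the status strings are from the table.
    """
    if elapsed_seconds >= timeout_seconds:
        return "timeout"
    if turns_used >= max_turns:
        return "max_turns_exceeded"
    if tokens_used > token_budget:
        return "budget_exceeded"
    return None
```

With the documented defaults, `max_turns` would be 15 and `token_budget` would be 100000 when `agent_config` does not override them.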
Governance Integration
Before every execution tool call, the Governance Checker evaluates in order:
Action-Level Decision Matrix
| Agent Action Level | Read Tool | Write Tool (in approval_rules) | Write Tool (not in approval_rules) |
|---|
| read_only | PROCEED | BLOCKED | BLOCKED |
| recommend | SUGGEST_ONLY | SUGGEST_ONLY | SUGGEST_ONLY |
| act_with_approval | PROCEED | APPROVAL_REQUIRED | PROCEED |
| automated | PROCEED | PROCEED | PROCEED |
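The matrix translates directly into a lookup. In this sketch the dictionary and function names are hypothetical; the action levels and decisions are copied from the table.

```python
# Columns: (read tool, write tool in approval_rules, write tool not in approval_rules)
DECISION_MATRIX: dict[str, tuple[str, str, str]] = {
    "read_only": ("PROCEED", "BLOCKED", "BLOCKED"),
    "recommend": ("SUGGEST_ONLY", "SUGGEST_ONLY", "SUGGEST_ONLY"),
    "act_with_approval": ("PROCEED", "APPROVAL_REQUIRED", "PROCEED"),
    "automated": ("PROCEED", "PROCEED", "PROCEED"),
}


def action_level_decision(
    action_level: str, is_write_tool: bool, in_approval_rules: bool = False
) -> str:
    """Look up the decision for a tool call at the given agent action level."""
    read, write_with_rule, write_without_rule = DECISION_MATRIX[action_level]
    if not is_write_tool:
        return read
    return write_with_rule if in_approval_rules else write_without_rule
```

For example, `action_level_decision("act_with_approval", True, True)` yields `APPROVAL_REQUIRED`, which is the case that starts the approval gate flow.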
Policy Types Evaluated
| Policy Type | Evaluation Point | Example Condition | Enforcement Options |
|---|
| Budget Policy | Before LLM call and before tool execution | cost.tokens > 100000 | warn, block |
| Rate Limit Policy | Before tool execution | agent.tool_calls_this_hour > 100 | block, log_only |
| Data Export Limit | After query execution, before returning results | rows > 10000 AND table.classification == 'pii' | block, require_approval |
| Time Window Policy | Before write tool execution | time.hour >= 18 OR time.hour < 6 | block, require_approval |
| Tool Restriction Policy | Before tool dispatch | tool.name IN ['delete_data_source', 'write_back'] | block |
| Content Policy | After LLM generates tool arguments | contains_pii(args.message) == true | require_approval, block |
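To make the example conditions concrete, the sketch below restates four of them as plain Python predicates. This is not the policy engine's actual expression language or evaluation code, which this reference does not describe. The function name and argument list are hypothetical, the thresholds are the example values from the table, and the function is assumed to be called before a write tool executes.

```python
from datetime import datetime


def violated_policies(
    tool_name: str,
    now: datetime,
    tool_calls_this_hour: int,
    tokens_used: int,
) -> list[str]:
    """Return the policy types whose example condition matches.

    Mirrors the example conditions only; real policies are configured
    per organization or workspace and carry their own enforcement option.
    """
    checks = {
        "Budget Policy": tokens_used > 100000,
        "Rate Limit Policy": tool_calls_this_hour > 100,
        "Time Window Policy": now.hour >= 18 or now.hour < 6,
        "Tool Restriction Policy": tool_name in {"delete_data_source", "write_back"},
    }
    return [policy for policy, matched in checks.items() if matched]
```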
Approval Gate Flow
When the Governance Checker returns APPROVAL_REQUIRED:
- The river-agent service constructs an approval_request payload with tool_name, proposed_payload, reasoning_summary, risk_context, and confidence_score.
- The service returns an ExecutionResponse with status: awaiting_approval and the approval_request embedded.
- The Temporal workflow serializes the full AgentContext into the context_snapshot field of the ApprovalRequest record created by the backend.
- The Temporal river_agent_execution_workflow hibernates via workflow.wait_condition(), consuming no resources while waiting.
- On resolution (approved/rejected/edited), the backend sends an approval_resolution Temporal signal.
- The workflow resumes and re-invokes river-agent via POST /api/v1/execute/continue with the serialized state and resolution outcome.
- river-agent restores context, feeds the resolution as an observation, and continues the loop.
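The river-agent side of this flow (the first two steps and the last one) can be sketched with plain dictionaries. The two function names and the layout of the observation are assumptions. The payload field names, the `awaiting_approval` status, and the three resolution outcomes come from the steps above.

```python
from typing import Any, Optional

RESOLUTION_OUTCOMES = {"approved", "rejected", "edited"}


def build_awaiting_approval(
    tool_name: str,
    proposed_payload: dict[str, Any],
    reasoning_summary: str,
    risk_context: dict[str, Any],
    confidence_score: float,
) -> dict[str, Any]:
    """Build the response returned when a tool call needs approval."""
    return {
        "status": "awaiting_approval",
        "approval_request": {
            "tool_name": tool_name,
            "proposed_payload": proposed_payload,
            "reasoning_summary": reasoning_summary,
            "risk_context": risk_context,
            "confidence_score": confidence_score,
        },
    }


def resolution_observation(
    outcome: str, edited_payload: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Turn a resolution into an observation for the resumed loop.

    The observation layout is hypothetical.
    """
    if outcome not in RESOLUTION_OUTCOMES:
        raise ValueError(f"unknown resolution outcome: {outcome}")
    observation: dict[str, Any] = {"type": "approval_resolution", "outcome": outcome}
    if outcome == "edited":
        observation["payload"] = edited_payload
    return observation
```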
Governance Token Lifetime
The governance_token is issued by the Governance Service when the backend initiates the Temporal workflow. It is included in every turn invocation by the Temporal activity. If the token expires during a long-running execution, the backend re-issues a fresh token before the next invocation. The river-agent service never requests its own governance token.
Observability and Telemetry
OpenTelemetry Spans
| Span | Attributes | Notes |
|---|
| river_agent.execution | execution_id, agent_id, action_level | Root span per POST /api/v1/execute |
| river_agent.turn | execution_id, turn_number, model_used, model_tier | Child span per reasoning turn |
| river_agent.reasoning | provider, tokens_in, tokens_out, latency_ms | Child span of turn |
| river_agent.governance_check | tool_name, decision, policy_matched | Child span per governance evaluation |
| river_agent.tool_dispatch | tool_name, category, status, latency_ms | Child span per tool call |
| river_agent.tool_retry | tool_name, attempt, error_code | Child span if retry occurs |
Prometheus Metrics
| Metric | Type | Labels |
|---|
| river_agent_executions_total | Counter | status, action_level, agent_id |
| river_agent_turns_total | Counter | status, model_tier, agent_id |
| river_agent_tool_calls_total | Counter | tool_name, category, status |
| river_agent_tool_latency_ms | Histogram | tool_name, category |
| river_agent_llm_latency_ms | Histogram | tier, provider |
| river_agent_tokens_consumed_total | Counter | tier, agent_id, provider |
| river_agent_governance_decisions_total | Counter | decision, agent_id, policy_type |
| river_agent_cost_estimate_usd_total | Counter | agent_id, provider |
Each turn emits a structured JSON log at INFO level, forwarded to the backend and written to agent_logs:
{
"level": "INFO",
"service": "river-agent",
"execution_id": 9871,
"turn_number": 2,
"model_used": "claude-sonnet-4-6",
"model_tier": "balanced",
"provider": "anthropic",
"tool_called": "execute_query",
"tool_category": "execution",
"tool_status": "success",
"governance_decision": "PROCEED",
"tokens_in": 2140,
"tokens_out": 145,
"latency_ms": 1380,
"timestamp": "2026-05-10T08:04:22.421Z"
}
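A log line of this shape can be produced with the standard library alone. The sketch below is one possible way to do it, not the service's actual logging setup: the helper name `format_turn_log` is hypothetical, and the timestamp format simply mirrors the example above (UTC, millisecond precision, `Z` suffix).

```python
import json
import logging
from datetime import datetime, timezone
from typing import Any

logger = logging.getLogger("river-agent")


def format_turn_log(**fields: Any) -> str:
    """Serialize one per-turn log record as a single JSON line."""
    timestamp = (
        datetime.now(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    record = {"level": "INFO", "service": "river-agent", **fields, "timestamp": timestamp}
    return json.dumps(record)


def log_turn(**fields: Any) -> None:
    # Emits the JSON line at INFO; forwarding to the backend is out of scope here.
    logger.info(format_turn_log(**fields))
```

A call such as `log_turn(execution_id=9871, turn_number=2, tool_called="execute_query", tool_status="success")` would emit one line per turn.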
Deployment Configuration
Environment Variables
| Variable | Required | Default | Description |
|---|
| TLO_GATEWAY_URL | Yes | -- | TLO Gateway URL for all tool dispatch and health readiness checks |
| RIVERCORE_CONFIG_PATH | Yes | -- | Path to the provider configuration file (YAML) |
| MAX_EXECUTION_CONCURRENCY | No | 10 | Maximum concurrent agentic loops per replica |
| DEFAULT_MAX_TURNS | No | 15 | Default turn limit when not specified in agent_config |
| DEFAULT_TOKEN_BUDGET | No | 100000 | Default token budget per execution |
| DEFAULT_LLM_TIMEOUT_SECONDS | No | 30 | LLM inference timeout when not specified in agent_config |
| DEFAULT_TOOL_TIMEOUT_SECONDS | No | 30 | Per-tool call timeout |
| PROMPT_INJECTION_DETECTION_ENABLED | No | true | Enable trigger payload injection screening |
| LOG_LEVEL | No | INFO | Structured log verbosity |
| OTEL_EXPORTER_OTLP_ENDPOINT | No | -- | OpenTelemetry collector endpoint |
| METRICS_PORT | No | 9090 | Prometheus metrics scrape port |
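One way to load these variables at startup is a frozen dataclass populated from the environment. The `Settings` class, the `load_settings` function, and the accepted spellings for the boolean flag are assumptions made for this sketch. The variable names, required flags, and defaults are those in the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = ("TLO_GATEWAY_URL", "RIVERCORE_CONFIG_PATH")


@dataclass(frozen=True)
class Settings:
    tlo_gateway_url: str
    rivercore_config_path: str
    max_execution_concurrency: int = 10
    default_max_turns: int = 15
    default_token_budget: int = 100000
    default_llm_timeout_seconds: int = 30
    default_tool_timeout_seconds: int = 30
    prompt_injection_detection_enabled: bool = True
    log_level: str = "INFO"
    otel_exporter_otlp_endpoint: Optional[str] = None
    metrics_port: int = 9090


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from the environment, failing fast on missing values."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required environment variables: {missing}")
    # Assumed truthy spellings for the boolean flag.
    enabled = env.get("PROMPT_INJECTION_DETECTION_ENABLED", "true").strip().lower()
    return Settings(
        tlo_gateway_url=env["TLO_GATEWAY_URL"],
        rivercore_config_path=env["RIVERCORE_CONFIG_PATH"],
        max_execution_concurrency=int(env.get("MAX_EXECUTION_CONCURRENCY", "10")),
        default_max_turns=int(env.get("DEFAULT_MAX_TURNS", "15")),
        default_token_budget=int(env.get("DEFAULT_TOKEN_BUDGET", "100000")),
        default_llm_timeout_seconds=int(env.get("DEFAULT_LLM_TIMEOUT_SECONDS", "30")),
        default_tool_timeout_seconds=int(env.get("DEFAULT_TOOL_TIMEOUT_SECONDS", "30")),
        prompt_injection_detection_enabled=enabled in {"1", "true", "yes"},
        log_level=env.get("LOG_LEVEL", "INFO"),
        otel_exporter_otlp_endpoint=env.get("OTEL_EXPORTER_OTLP_ENDPOINT"),
        metrics_port=int(env.get("METRICS_PORT", "9090")),
    )
```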
Resource Requirements
| Resource | Minimum | Recommended |
|---|
| CPU | 2 vCPU | 4 vCPU |
| Memory | 2 GB | 4 GB |
| Replicas | 2 (HA minimum) | Auto-scale 2-10 based on river_agent_executions_total rate |
Health Check Endpoints
| Endpoint | Status Codes | Purpose |
|---|
| GET /health | 200 | Liveness check -- returns 200 OK if the process is running |
| GET /health/ready | 200, 503 | Readiness check -- returns 200 OK only when TLO Gateway is reachable and at least one LLM provider is available |
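The readiness rule reduces to a two-condition check. In this sketch the function name and its inputs are hypothetical; how the service actually probes TLO Gateway and the LLM providers is not covered here.

```python
def readiness_status_code(gateway_reachable: bool, available_providers: int) -> int:
    """Return 200 when the service can take traffic, otherwise 503."""
    if gateway_reachable and available_providers >= 1:
        return 200
    return 503
```

Liveness, by contrast, does not depend on either condition: `GET /health` returns 200 whenever the process is running.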
mTLS Configuration
Communication between TLO Gateway (:8001) and river-agent (:8007) uses mutual TLS. TLO Gateway presents a client certificate issued by the internal CA. river-agent validates the certificate before accepting any request. This ensures only TLO Gateway can call river-agent, even within the internal network, providing an additional layer beyond network-level IP allowlisting.
Scaling Behavior
river-agent is fully stateless. Any replica handles any request. Horizontal scaling is safe at any replica count. Because the agentic loop runs entirely within a single invocation (from ExecutionRequest to ExecutionResponse), load balancing across replicas does not require session affinity. The Temporal activity that calls river-agent handles retry logic at the workflow level -- if a replica fails mid-execution, the Temporal activity retries the HTTP call, and a new replica picks up the full ExecutionRequest from the beginning of the current loop segment.
- TLO Gateway -- Entry point that authenticates requests, enforces ACL, and routes execution tool calls from river-agent to downstream services
- Backend API -- Owns all agent lifecycle state, logs, approvals, and triggers that river-agent reads and writes through tool calls
- Governance and Safety -- Defines the policy conditions, action levels, and audit framework that the Governance Checker enforces inline during execution