AI Tools and Technologies
Page Outline
- Overview
- Use Cases
- RiverCore: Multi-Provider Control
- RiverPlan: Execution Planning
- RiverGuard: Governance Intelligence
- RiverSemantic: Catalog Intelligence
- RiverDecide: Decision Engine
- RiverOptimize: Performance Strategy
- RiverViz: Visualization and Rendering
- RiverLearn: Adaptive Learning
- RiverObserve: Operational Monitoring
- Infrastructure Specifications
- Limitations and Constraints
- Related
This document provides a comprehensive technical reference for the RiverGen Intelligence Stack, including detailed sub-service inventories and provider-routing specifications.
Overview
The RiverGen Intelligence layer is a modular architecture composed of domain-specific agents and stateless AI services. This separation allows for high-performance reasoning while maintaining a provider-agnostic infrastructure.
The stack follows the primary engineering rule: Agents coordinate reasoning workflows, AI Services provide specialized intelligence, and Compute layers execute the final physical operations.

Use Cases
The tools within the Intelligence Stack are designed to support a broad range of automated data engineering and analytics tasks.
- Cross-Source Federation: Planning and executing join operations across SQL, NoSQL, and Cloud Data Warehouses.
- Resilient AI Execution: Implementing multi-provider fallback chains to ensure execution availability during LLM outages.
- Automated Cataloging: Using vector-based semantic search to discover and map technical assets to business terminology.
- Governed Reasoning: Ensuring that every turn in an agentic loop is subject to policy-aware validation and safety gates.
RiverCore: Multi-Provider Control
RiverCore is the foundational service that manages all Large Language Model (LLM) provider interactions. It abstracts the underlying complexity of specific SDKs into a unified platform interface.

Model Capability Categories
To optimize performance and cost, RiverCore classifies models into four capability categories. The system selects the best available model for each specific turn in an agentic loop.
- Fast: Quick classification and pattern matching (e.g., Gemini Flash Lite, GPT-4o-mini).
- Balanced: Standard query generation and structured reasoning (e.g., Gemini Flash, Claude Sonnet).
- Reasoning: Complex joins, federation, and error recovery (e.g., Gemini Pro, o3, Claude Opus).
- Coding: Specialized SQL/NoSQL generation and dialect-specific optimization (e.g., GPT-4o, DeepSeek-V3).
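
The per-turn selection described above can be sketched as a lookup from task type to capability category. This is a minimal illustration, not RiverCore's actual router: the task labels, the `select_capability` function, and the choice of `BALANCED` as the default for unknown tasks are all assumptions made for the example.

```python
from enum import Enum


class Capability(Enum):
    """The four capability categories listed above."""
    FAST = "fast"
    BALANCED = "balanced"
    REASONING = "reasoning"
    CODING = "coding"


# Hypothetical task labels; the real task taxonomy is not given in this document.
TASK_TO_CAPABILITY = {
    "classify_intent": Capability.FAST,
    "generate_query": Capability.BALANCED,
    "plan_federation": Capability.REASONING,
    "recover_from_error": Capability.REASONING,
    "optimize_sql_dialect": Capability.CODING,
}


def select_capability(task: str) -> Capability:
    """Pick a category for one turn; unknown tasks fall back to BALANCED (assumption)."""
    return TASK_TO_CAPABILITY.get(task, Capability.BALANCED)


# Each turn of a loop is routed independently.
turns = ["classify_intent", "generate_query", "plan_federation"]
print([select_capability(t).value for t in turns])
```

Routing each turn separately means a single loop can use a cheap model for classification and a stronger one only for the turns that need it.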
RiverCore Sub-Service Inventory
| Sub-Service | Description | Status |
|---|---|---|
| Provider Registry | Manages LLM provider registration and health checks. | [ACTIVE] |
| Complexity Router | Performs per-turn model selection based on task tier. | [ACTIVE] |
| Tool Call Adapter | Normalizes function-calling formats across providers. | [ACTIVE] |
| Output Normalizer | Ensures canonical JSON responses regardless of model. | [ACTIVE] |
| Fallback Controller | Manages automatic provider failover within a category. | [ACTIVE] |
| Cost Metering | Tracks token usage and operational costs per tenant. | [ACTIVE] |
| Latency Telemetry | Monitors end-to-end response times for AI calls. | [ACTIVE] |
| Policy Model Selector | Selects models based on sensitivity and tenant rules. | [PHASE 2] |
| Prompt Library | Centralized management of specialized prompt templates. | [PLANNED] |
| Context Manager | Handles chunking and packing for large input contexts. | [PLANNED] |
| Retrieval Inserter | Injects top-K semantic results into the reasoner. | [PLANNED] |
| Caching Layer | Deduplicates identical requests to minimize cost. | [PLANNED] |
| Retry Controller | Manages provider-specific exponential backoff logic. | [PLANNED] |
| Safety Filter | Prevents PII leakage and blocks unsafe actions. | [PLANNED] |
| A/B Routing | Facilitates model comparison and quality experiments. | [PHASE 2] |
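
The Fallback Controller row describes automatic failover within a category. One way to picture that behavior is an ordered chain that is walked until a provider succeeds. The sketch below is illustrative only: `ProviderError`, `call_with_fallback`, and the stub provider callables are hypothetical, and the chain reuses the example model names from the Reasoning category.

```python
from typing import Callable, Dict, List, Tuple


class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails."""


def call_with_fallback(
    chain: List[str],
    providers: Dict[str, Callable[[str], str]],
    prompt: str,
) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, response) from the first success."""
    errors = []
    for name in chain:
        try:
            return name, providers[name](prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")  # remember the failure, then try the next one
    raise ProviderError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for real SDK calls.
def unavailable(prompt: str) -> str:
    raise ProviderError("503 service unavailable")


def echo(prompt: str) -> str:
    return f"answer to: {prompt}"


providers = {"gemini-pro": unavailable, "o3": echo, "claude-opus": echo}
reasoning_chain = ["gemini-pro", "o3", "claude-opus"]

used, answer = call_with_fallback(reasoning_chain, providers, "plan the join")
print(used, answer)
```

Keeping the chain inside one category means a failover does not silently downgrade a reasoning turn to a fast-tier model.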
RiverPlan: Execution Planning
RiverPlan is the engine responsible for converting natural language instructions into precise execution steps. It handles the decomposition of complex goals into sequential tool calls.
| Sub-Service | Description | Status |
|---|---|---|
| Intent Classifier | Maps instructions to structured intent categories. | [PLANNED] |
| Prompt Parser | Processes natural language and SPL keyword hints. | [PLANNED] |
| Context Assembler | Merges schemas, governance, and user role metadata. | [ACTIVE] |
| Source Selector | Identifies which systems are required for execution. | [PLANNED] |
| Query Generator | Performs NL2SQL translation for target dialects. | [PLANNED] |
| Plan Normalizer | Standardizes plans across all data source types. | [PLANNED] |
| Plan Validator | Validates plans against schema and logic constraints. | [PLANNED] |
| Plan Repairer | Automatically fixes invalid fields in generated plans. | [PLANNED] |
| Plan Decomposer | Splits complex multi-source plans into discrete stages. | [PLANNED] |
| Federation Builder | Orchestrates cross-source join order and compute. | [PLANNED] |
| Write Plan Builder | Builds safe patterns for state-changing operations. | [PLANNED] |
| Materialization | Manages views, snapshots, and scheduled exports. | [PLANNED] |
| Hint Generator | Injects pushdown and partition hints into queries. | [PLANNED] |
| Plan Summarizer | Generates human-readable execution summaries. | [PLANNED] |
| Plan Explainer | Provides step-by-step technical reasoning details. | [PLANNED] |
| Plan Diff | Compares current execution against previous versions. | [PLANNED] |
RiverGuard: Governance Intelligence
RiverGuard ensures that every proposed action complies with the organization's security and regulatory policies. It injects constraints directly into the reasoning context.
| Sub-Service | Description | Status |
|---|---|---|
| Identity Resolver | Maps user claims to internal roles and workspaces. | [ACTIVE] |
| RBAC Evaluator | Validates permissions for specific tool executions. | [ACTIVE] |
| RLS Engine | Generates dynamic row-level security filters. | [ACTIVE] |
| Masking Engine | Applies obfuscation rules to sensitive columns. | [ACTIVE] |
| Sensitivity Classify | Identifies PII/PHI/PCI patterns in datasets. | [PLANNED] |
| Approval Gate | Manages human-in-the-loop gates for write actions. | [PLANNED] |
| Write Guard | Enforces safety rules for update and delete patterns. | [PLANNED] |
| Policy Compiler | Converts definitions into enforceable constraints. | [ACTIVE] |
| Policy Injector | Merges constraints into the active execution plan. | [ACTIVE] |
| Exception Handler | Manages break-glass flows for emergency access. | [PLANNED] |
| Quota Guard | Enforces spend caps and compute resource limits. | [PLANNED] |
| Connector Guard | Limits allowed operations per individual connector. | [PLANNED] |
| Audit Builder | Generates standardized audit events for all actions. | [PLANNED] |
| Compliance Reporter | Produces SOC2 and GDPR-ready audit exports. | [PLANNED] |
| Explanation Gen | Explains why access was permitted or blocked. | [PLANNED] |
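
To make the active RiverGuard pieces concrete, the sketch below combines an RBAC check, a row-level filter, and column masking on in-memory rows. It is a simplified illustration under stated assumptions: the `Policy` shape, the function names, the `"***"` mask value, and the sample data are all hypothetical, and a real RLS Engine would more likely emit filters for the source system than filter rows in Python.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List


@dataclass
class Policy:
    """Hypothetical compiled policy for one role."""
    allowed_tools: FrozenSet[str]   # RBAC: tools this role may execute
    row_filter: Dict[str, Any]      # RLS: column -> required value
    masked_columns: FrozenSet[str]  # masking: columns to obfuscate


def check_rbac(policy: Policy, tool: str) -> bool:
    """Return True if the role is permitted to run the given tool."""
    return tool in policy.allowed_tools


def apply_policy(policy: Policy, rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop rows that fail the row filter, then mask sensitive columns."""
    visible = []
    for row in rows:
        if all(row.get(col) == val for col, val in policy.row_filter.items()):
            visible.append(
                {k: ("***" if k in policy.masked_columns else v) for k, v in row.items()}
            )
    return visible


analyst = Policy(
    allowed_tools=frozenset({"run_select"}),
    row_filter={"region": "EU"},
    masked_columns=frozenset({"email"}),
)
rows = [
    {"id": 1, "region": "EU", "email": "a@example.com"},
    {"id": 2, "region": "US", "email": "b@example.com"},
]

print(check_rbac(analyst, "run_delete"))
print(apply_policy(analyst, rows))
```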
RiverSemantic: Catalog Intelligence
RiverSemantic provides the mapping between business terminology and technical data assets. It uses vector search to identify relevant tables and columns during the reasoning loop.
| Sub-Service | Description | Status |
|---|---|---|
| Ingestion | Imports schemas and metadata from all connectors. | [ACTIVE] |
| Embedding Builder | Generates vectors for all table and column names. | [ACTIVE] |
| Semantic Retriever | Performs fast top-K matching using Qdrant. | [ACTIVE] |
| Glossary Manager | Manages business terms and their relationships. | [PLANNED] |
| Term-to-Field | Maps glossary terms to technical schema fields. | [PLANNED] |
| Entity Resolution | Builds a unified identity graph across sources. | [PLANNED] |
| Join Recommender | Identifies optimal join keys based on data similarity. | [PLANNED] |
| Change Detector | Identifies drift in source system schemas. | [PLANNED] |
| Freshness Tracker | Monitors data staleness using watermarking. | [PLANNED] |
| Quality Engine | Tracks completeness and null rate signals. | [PLANNED] |
| Relationship Infer | Automatically detects table-level relationships. | [PLANNED] |
| Sample Profiler | Prepares statistics and distributions for profiling. | [PLANNED] |
| Lineage Store | Tracks structural lineage across the ecosystem. | [PLANNED] |
| Context Packager | Prepares enriched payloads for the AI model. | [ACTIVE] |
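
The Semantic Retriever performs top-K matching in Qdrant. The idea behind that step can be shown without a vector database: score every catalog embedding against the query embedding and keep the K best. The sketch below is a brute-force, in-memory stand-in for illustration only; the three-dimensional vectors and asset names are invented, and real embeddings would come from the Embedding Builder.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: List[float], catalog: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the names of the k catalog entries most similar to the query."""
    ranked = sorted(catalog.items(), key=lambda item: cosine(query, item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy embeddings for three hypothetical columns.
catalog = {
    "sales.orders.total_amount": [0.9, 0.1, 0.0],
    "crm.customers.email": [0.0, 0.2, 0.9],
    "finance.invoices.amount_due": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stand-in embedding for a phrase such as "revenue"

print(top_k(query, catalog, k=2))
```

A vector store replaces the full sort with an approximate index, which is what keeps retrieval fast as the catalog grows.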
RiverDecide: Decision Engine
RiverDecide evaluates data streams against trained ML models to automate platform operations and alerts.
| Sub-Service | Description | Status |
|---|---|---|
| Workflow Builder | Creates decision graphs for automated actions. | [PLANNED] |
| Rule Engine | Manages business thresholds and rule logic. | [PLANNED] |
| Recommend Engine | Suggests next best actions based on model output. | [PLANNED] |
| Impact Estimator | Estimates ROI and risk for proposed decisions. | [PLANNED] |
| Simulation Engine | Performs "what-if" backtesting on decision models. | [PLANNED] |
| Counterfactual | Analyzes alternative outcomes for past decisions. | [PLANNED] |
| Confidence Scorer | Assigns certainty scores to automated actions. | [PLANNED] |
| Approval Routing | Routes decisions to the appropriate stakeholders. | [PLANNED] |
| Action Selector | Targets policy-compliant actions for execution. | [PLANNED] |
| Decision Explainer | Provides a narrative for automated conclusions. | [PLANNED] |
| Outcome Tracker | Measures the effectiveness of live decisions. | [PLANNED] |
| Experiment Engine | Manages A/B tests for decision logic versions. | [PLANNED] |
| Promote Lifecycle | Manages versioning and promotion for decision workflows. | [PLANNED] |
| Decision Registry | Central storage for all active decision assets. | [PLANNED] |
RiverOptimize: Performance Strategy
RiverOptimize identifies the most efficient execution path for each plan to minimize cost and latency.
| Sub-Service | Description | Status |
|---|---|---|
| Routing Controller | Decides between pushdown and internal compute. | [PLANNED] |
| Join Optimizer | Determines the placement of cross-source joins. | [PLANNED] |
| Staging Strategy | Manages intermediate results for large federation. | [PLANNED] |
| Layout Advisor | Advises on partitioning and clustering strategies. | [PLANNED] |
| Predicate Advisor | Optimizes filters for source-side pushdown. | [PLANNED] |
| Cost Estimator | Provides pre-execution cost predictions. | [PLANNED] |
| Latency Estimator | Predicts total runtime for complex workflows. | [PLANNED] |
| Query Rewriter | Optimizes SQL and API calls for performance. | [PLANNED] |
| Workload Shaper | Manages batching and request rate limits. | [PLANNED] |
| Cache Policy | Defines TTL and storage rules for AI caching. | [PLANNED] |
| Concurrency Guard | Limits concurrent loops per workspace/tenant. | [PLANNED] |
| Adaptive Retry | Provides source-aware retry and backoff logic. | [PLANNED] |
| Spill Strategy | Manages memory strategy for heavy aggregations. | [PLANNED] |
| Telemetry Analyzer | Learns from past runs to tune future plans. | [PLANNED] |
| Policy Learner | Auto-tunes routing based on historical data. | [PLANNED] |
RiverViz: Visualization and Rendering
RiverViz provides the components for rendering query results and architectural explanations.
| Sub-Service | Description | Status |
|---|---|---|
| Chart Recommender | Matches charts to specific data distributions. | [PLANNED] |
| Bar/Line/Area | Standard categorical and time-series charts. | [PLANNED] |
| Table Renderer | Dynamic tables with sorting and pagination. | [PLANNED] |
| Pivot Visualizer | Pivot tables and hierarchical aggregations. | [PLANNED] |
| Trend Analyzer | Visuals for seasonality and trend detection. | [PLANNED] |
| Outlier Visuals | Specialized charts for anomaly identification. | [PLANNED] |
| Correlation Map | Heatmaps and pairwise correlation matrices. | [PLANNED] |
| Map Visualizer | Geospatial data rendering on interactive maps. | [PLANNED] |
| Lineage Renderer | Graph view of data and prompt relationships. | [PLANNED] |
| Plan Graph | DAG visualization of execution workflow stages. | [PLANNED] |
| Policy Overlay | Displays masking and RLS rules in context. | [PLANNED] |
| Decision Graph | Flow diagrams for Decision Intelligence runs. | [PLANNED] |
| Status Timeline | Real-time progress and dependency visualization. | [PLANNED] |
| Training Visuals | Metric and performance charts for Model Studio. | [PLANNED] |
| Compare Visuals | Side-by-side Champion vs. Challenger charts. | [PLANNED] |
| Drift Visuals | Visualizing data and prediction drift events. | [PLANNED] |
| Resource Charts | GPU, CPU, and Memory usage visualization. | [PLANNED] |
| Explain Panel | Narrative explanations for system decisions. | [PLANNED] |
| Report Builder | Assembly of charts into exportable reports. | [PLANNED] |
| Data Exporters | PNG, PDF, and CSV/JSON export modules. | [PLANNED] |
RiverLearn: Adaptive Learning
RiverLearn captures and analyzes execution outcomes to improve system reasoning over time.
| Sub-Service | Description | Status |
|---|---|---|
| Memory Store | Persists plans, outcomes, and error signatures. | [PLANNED] |
| Feedback Loop | Captures user thumbs-up/down and corrections. | [PLANNED] |
| Pattern Extractor | Identifies successful multi-step workflow plans. | [PLANNED] |
| Quality Scorer | Scores plans based on validity and efficiency. | [PLANNED] |
| Routing Learner | Tunes pushdown decisions using past performance. | [PLANNED] |
| Staging Learner | Optimizes intermediate data handling strategies. | [PLANNED] |
| Semantic Reinforce | Improves schema matching using user corrections. | [PLANNED] |
| Governance Learner | Tracks and learns from policy violation patterns. | [PLANNED] |
| Automation Learner | Tunes retry logic based on success signatures. | [PLANNED] |
| Model Learner | Correlates training data changes with accuracy. | [PLANNED] |
| Memory Profiles | Org-specific reasoning and dictionary profiles. | [PLANNED] |
| Test Gen | Converts failures into validated test cases. | [PLANNED] |
RiverObserve: Operational Monitoring
RiverObserve provides platform-wide observability and operational health tracking.
| Sub-Service | Description | Status |
|---|---|---|
| Trace Store | Full request/response logging across services. | [PHASE 1] |
| Metrics Aggregator | Real-time SLI and SLO tracking for AI calls. | [PLANNED] |
| Alerting Engine | Notification engine for system incidents. | [PLANNED] |
| Health Monitor | Tracks data source and LLM provider health. | [PHASE 1] |
| Job Monitor | Monitoring for Temporal workflows and queues. | [PHASE 1] |
| Audit Search | Indexed search for action and audit history. | [PLANNED] |
| Cost Aggregator | Token and compute spend tracking by tenant. | [PLANNED] |
| SLA Tracker | Monitors response time and data freshness goals. | [PLANNED] |
| Incident Correlate | Identifies root causes from cross-service logs. | [PLANNED] |
Infrastructure Specifications
The Intelligence Stack relies on several high-performance infrastructure components to maintain its reasoning capabilities.
| Component | Role | Technology |
|---|---|---|
| Vector Store | Semantic catalog search | Qdrant (Port 6333) |
| Workflow Engine | Reasoning orchestration | Temporal (Port 7233) |
| Relational DB | Metadata persistence | PostgreSQL (Port 5433) |
| Document Store | Connector configuration | MongoDB (Port 27017) |
| Cache Layer | Session and turn state | Redis (Port 6381) |
| Object Storage | File and artifact storage | MinIO/S3 (Port 9002) |
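
For local tooling, the table above can be captured as a small configuration map. This is a sketch rather than the platform's actual configuration format: the `Endpoint` type, the dictionary keys, and the `localhost` host are assumptions. Note that several of the listed ports differ from the usual upstream defaults (PostgreSQL 5432, Redis 6379, MinIO 9000), so client defaults should not be relied on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    role: str
    technology: str
    port: int
    host: str = "localhost"  # assumption: the document does not specify hosts

    @property
    def address(self) -> str:
        """Return the endpoint as host:port."""
        return f"{self.host}:{self.port}"


# Ports copied from the Infrastructure Specifications table.
INFRA = {
    "vector_store": Endpoint("Semantic catalog search", "Qdrant", 6333),
    "workflow_engine": Endpoint("Reasoning orchestration", "Temporal", 7233),
    "relational_db": Endpoint("Metadata persistence", "PostgreSQL", 5433),
    "document_store": Endpoint("Connector configuration", "MongoDB", 27017),
    "cache_layer": Endpoint("Session and turn state", "Redis", 6381),
    "object_storage": Endpoint("File and artifact storage", "MinIO/S3", 9002),
}

for name, endpoint in INFRA.items():
    print(f"{name}: {endpoint.technology} at {endpoint.address}")
```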
LLM Implementation Details
PSA interacts with multiple Large Language Models via the RiverCore abstraction layer.
| Aspect | Specification |
|---|---|
| Orchestration | Agentic loop with tool-calling support. |
| Output Type | Strict JSON validation via Pydantic. |
| Model Selection | Per-turn complexity-based routing. |
| Fallback | Provider fallback chain per category. |
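
The Output Type row specifies strict JSON validation via Pydantic. To stay dependency-free, the sketch below swaps in the standard library (`json` plus manual type checks) to show the same principle: a model response is accepted only if it parses and matches the expected shape exactly. The `ToolCall` schema and its field names are hypothetical; the actual response schema is not given in this document.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical canonical shape for one tool-calling turn."""
    tool: str
    arguments: Dict[str, Any]


def parse_tool_call(raw: str) -> ToolCall:
    """Parse a model response strictly; raise ValueError on any deviation."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or set(data) != {"tool", "arguments"}:
        raise ValueError("expected an object with exactly the keys 'tool' and 'arguments'")
    if not isinstance(data["tool"], str):
        raise ValueError("'tool' must be a string")
    if not isinstance(data["arguments"], dict):
        raise ValueError("'arguments' must be an object")
    return ToolCall(tool=data["tool"], arguments=data["arguments"])


call = parse_tool_call('{"tool": "run_query", "arguments": {"sql": "SELECT 1"}}')
print(call.tool, call.arguments)
```

Rejecting extra keys and wrong types at the boundary keeps malformed model output from reaching the execution layer, whichever validation library is used.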

Limitations and Constraints
Developers must build within the following technical constraints for the current stack version.
- Reasoning Overhead: Distributed tool calls add at least 1.2 s of latency per turn.
- Context Boundaries: Model reasoning is restricted by the provider's token window (typically 128k tokens).
- Concurrency Guard: Global limit of 50 concurrent loops per instance.
- Provider SLA: Total execution time is subject to the availability of external LLM APIs.
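
Two of these constraints translate directly into code: the per-turn overhead gives a lower bound on loop latency, and the concurrency limit can be respected client-side with a semaphore. The sketch below is an assumption-laden illustration, not the platform's Concurrency Guard: the function names are hypothetical, and `asyncio.sleep(0)` stands in for a real reasoning loop.

```python
import asyncio

MAX_CONCURRENT_LOOPS = 50   # from the Concurrency Guard limit above
MIN_TURN_LATENCY_S = 1.2    # from the Reasoning Overhead limit above


def min_loop_latency(turns: int) -> float:
    """Lower bound on loop latency in seconds, ignoring model and source time."""
    return turns * MIN_TURN_LATENCY_S


async def run_loops(n_loops: int, limit: int = MAX_CONCURRENT_LOOPS) -> int:
    """Run n_loops placeholder loops under a semaphore; return the peak concurrency seen."""
    semaphore = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def one_loop() -> None:
        nonlocal active, peak
        async with semaphore:           # waits here once `limit` loops are running
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0)      # placeholder for the actual reasoning loop
            active -= 1

    await asyncio.gather(*(one_loop() for _ in range(n_loops)))
    return peak


print(min_loop_latency(5))
peak = asyncio.run(run_loops(120))
print(peak <= MAX_CONCURRENT_LOOPS)
```

For example, a five-turn loop cannot finish in under roughly six seconds, which is worth factoring into any user-facing timeout.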