AI Tools and Technologies

Page Outline

Overview
Use Cases
RiverCore: Multi-Provider Control
- Model Capability Categories
- RiverCore Sub-Service Inventory
RiverPlan: Execution Planning
RiverGuard: Governance Intelligence
RiverSemantic: Catalog Intelligence
RiverDecide: Decision Engine
RiverOptimize: Performance Strategy
RiverViz: Visualization and Rendering
RiverLearn: Adaptive Learning
RiverObserve: Operational Monitoring
Infrastructure Specifications
- LLM Implementation Details
Limitations and Constraints
Related

This document provides a comprehensive technical reference for the RiverGen Intelligence Stack, including detailed sub-service inventories and provider-routing specifications.

Overview

The RiverGen Intelligence layer is a modular architecture composed of domain-specific agents and stateless AI services. This separation allows for high-performance reasoning while maintaining a provider-agnostic infrastructure.

The stack follows the primary engineering rule: Agents coordinate reasoning workflows, AI Services provide specialized intelligence, and Compute layers execute the final physical operations.

Intelligence Stack Overview

Use Cases

The tools within the Intelligence Stack are designed to support a broad range of automated data engineering and analytics tasks.

Cross-Source Federation: Planning and executing join operations across SQL, NoSQL, and Cloud Data Warehouses.
Resilient AI Execution: Implementing multi-provider fallback chains to ensure execution availability during LLM outages.
Automated Cataloging: Using vector-based semantic search to discover and map technical assets to business terminology.
Governed Reasoning: Ensuring that every turn in an agentic loop is subject to policy-aware validation and safety gates.

RiverCore: Multi-Provider Control

RiverCore is the foundational service that manages all Large Language Model (LLM) provider interactions. It abstracts the underlying complexity of specific SDKs into a unified platform interface.

![Routing Decision Flow](/img/AI-documentation/psa/routing decision flow.png)

Model Capability Categories

To optimize performance and cost, RiverCore classifies models into four capability categories. The system selects the best available model for each specific turn in an agentic loop.

Fast: Quick classification and pattern matching (e.g., Gemini Flash Lite, GPT-4o-mini).
Balanced: Standard query generation and structured reasoning (e.g., Gemini Flash, Claude Sonnet).
Reasoning: Complex joins, federation, and error recovery (e.g., Gemini Pro, o3, Claude Opus).
Coding: Specialized SQL/NoSQL generation and dialect-specific optimization (e.g., GPT-4o, DeepSeek-V3).
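The category scheme above can be pictured as a two-step lookup: map the turn's task to a category, then pick the first available model in that category. The following is a minimal sketch under stated assumptions: the task labels, the `select_model` function, the default of `BALANCED` for unknown tasks, and the preference order within each category are all hypothetical. Only the category names and example models come from the list above.

```python
from enum import Enum


class Category(str, Enum):
    FAST = "fast"
    BALANCED = "balanced"
    REASONING = "reasoning"
    CODING = "coding"


# Hypothetical task labels mapped to the four categories described above.
TASK_TO_CATEGORY = {
    "classification": Category.FAST,
    "query_generation": Category.BALANCED,
    "federation": Category.REASONING,
    "error_recovery": Category.REASONING,
    "sql_generation": Category.CODING,
}

# Example models per category from the list above; the ordering is illustrative.
CATEGORY_MODELS = {
    Category.FAST: ["Gemini Flash Lite", "GPT-4o-mini"],
    Category.BALANCED: ["Gemini Flash", "Claude Sonnet"],
    Category.REASONING: ["Gemini Pro", "o3", "Claude Opus"],
    Category.CODING: ["GPT-4o", "DeepSeek-V3"],
}


def select_model(task: str, available: set[str]) -> str:
    """Return the first available model in the category assigned to this task."""
    # Assumption: unknown tasks fall back to the Balanced category.
    category = TASK_TO_CATEGORY.get(task, Category.BALANCED)
    for model in CATEGORY_MODELS[category]:
        if model in available:
            return model
    raise LookupError(f"no available model in category {category.value}")
```

Calling `select_model("classification", {"GPT-4o-mini"})` skips the unavailable first choice and returns the second model in the Fast category.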

RiverCore Sub-Service Inventory

Sub-Service	Description	Status
Provider Registry	Manages LLM provider registration and health checks.	[ACTIVE]
Complexity Router	Performs per-turn model selection based on task tier.	[ACTIVE]
Tool Call Adapter	Normalizes function-calling formats across providers.	[ACTIVE]
Output Normalizer	Ensures canonical JSON responses regardless of model.	[ACTIVE]
Fallback Controller	Manages automatic provider failover within a category.	[ACTIVE]
Cost Metering	Tracks token usage and operational costs per tenant.	[ACTIVE]
Latency Telemetry	Monitors end-to-end response times for AI calls.	[ACTIVE]
Policy Model Selector	Selects models based on sensitivity and tenant rules.	[PHASE 2]
Prompt Library	Centralized management of specialized prompt templates.	[PLANNED]
Context Manager	Handles chunking and packing for large input contexts.	[PLANNED]
Retrieval Inserter	Injects top-K semantic results into the reasoner.	[PLANNED]
Caching Layer	Deduplicates identical requests to minimize cost.	[PLANNED]
Retry Controller	Manages provider-specific exponential backoff logic.	[PLANNED]
Safety Filter	Prevents PII leakage and blocks unsafe actions.	[PLANNED]
A/B Routing	Facilitates model comparison and quality experiments.	[PHASE 2]
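The Fallback Controller row describes automatic failover within a category. One way to read that behavior is as an ordered chain: try each provider in turn and return the first successful response. This is a sketch only; `ProviderError`, `call_with_fallback`, and the `(name, callable)` pair format are hypothetical names chosen for illustration, not RiverCore's interface.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error type raised when a provider call fails."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Record the failure and move on to the next provider in the chain.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))
```

Returning the provider name alongside the response keeps the failover visible to callers, which matters for the cost and latency tracking listed in the same table.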

RiverPlan: Execution Planning

RiverPlan is the engine responsible for converting natural language instructions into precise execution steps. It handles the decomposition of complex goals into sequential tool calls.

Sub-Service	Description	Status
Intent Classifier	Maps instructions to structured intent categories.	[PLANNED]
Prompt Parser	Processes natural language and SPL keyword hints.	[PLANNED]
Context Assembler	Merges schemas, governance, and user role metadata.	[ACTIVE]
Source Selector	Identifies which systems are required for execution.	[PLANNED]
Query Generator	Performs NL2SQL translation for target dialects.	[PLANNED]
Plan Normalizer	Standardizes plans across all data source types.	[PLANNED]
Plan Validator	Validates plans against schema and logic constraints.	[PLANNED]
Plan Repairer	Automatically fixes invalid fields in generated plans.	[PLANNED]
Plan Decomposer	Splits complex multi-source plans into discrete stages.	[PLANNED]
Federation Builder	Orchestrates cross-source join order and compute.	[PLANNED]
Write Plan Builder	Builds safe patterns for state-changing operations.	[PLANNED]
Materialization	Manages views, snapshots, and scheduled exports.	[PLANNED]
Hint Generator	Injects pushdown and partition hints into queries.	[PLANNED]
Plan Summarizer	Generates human-readable execution summaries.	[PLANNED]
Plan Explainer	Provides step-by-step technical reasoning details.	[PLANNED]
Plan Diff	Compares current execution against previous versions.	[PLANNED]

RiverGuard: Governance Intelligence

RiverGuard ensures that every proposed action complies with the organization's security and regulatory policies. It injects constraints directly into the reasoning context.

Sub-Service	Description	Status
Identity Resolver	Maps user claims to internal roles and workspaces.	[ACTIVE]
RBAC Evaluator	Validates permissions for specific tool executions.	[ACTIVE]
RLS Engine	Generates dynamic row-level security filters.	[ACTIVE]
Masking Engine	Applies obfuscation rules to sensitive columns.	[ACTIVE]
Sensitivity Classify	Identifies PII/PHI/PCI patterns in datasets.	[PLANNED]
Approval Gate	Manages human-in-the-loop gates for write actions.	[PLANNED]
Write Guard	Enforces safety rules for update and delete patterns.	[PLANNED]
Policy Compiler	Converts definitions into enforceable constraints.	[ACTIVE]
Policy Injector	Merges constraints into the active execution plan.	[ACTIVE]
Exception Handler	Manages break-glass flows for emergency access.	[PLANNED]
Quota Guard	Enforces spend caps and compute resource limits.	[PLANNED]
Connector Guard	Limits allowed operations per individual connector.	[PLANNED]
Audit Builder	Generates standardized audit events for all actions.	[PLANNED]
Compliance Reporter	Produces SOC2 and GDPR-ready audit exports.	[PLANNED]
Explanation Gen	Explains why access was permitted or blocked.	[PLANNED]
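To make the RLS Engine and Masking Engine rows more concrete, the sketch below builds a parameterized row filter from a role's permitted values and masks all but the last characters of a sensitive string. Everything here is an assumption for illustration: the function names, the `%s` placeholder style, the deny-all behavior for an empty value list, and the "no rules means no restriction" default are not taken from RiverGuard.

```python
import re

# Only plain identifiers are accepted as column names, since they are
# interpolated into the SQL text rather than passed as parameters.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_rls_filter(allowed: dict[str, list[str]]) -> tuple[str, list[str]]:
    """Return a (predicate, params) pair restricting rows to the allowed values."""
    clauses, params = [], []
    for column, values in sorted(allowed.items()):
        if not _IDENT.match(column):
            raise ValueError(f"unsafe column name: {column!r}")
        if not values:
            # Assumption: an empty allow-list denies every row.
            return "1 = 0", []
        placeholders = ", ".join(["%s"] * len(values))
        clauses.append(f"{column} IN ({placeholders})")
        params.extend(values)
    # Assumption: no rules at all means no restriction.
    return (" AND ".join(clauses) or "1 = 1"), params


def mask_value(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```

Keeping values in a separate parameter list, rather than formatting them into the predicate, avoids SQL injection through policy data.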

RiverSemantic: Catalog Intelligence

RiverSemantic provides the mapping between business terminology and technical data assets. It uses vector search to identify relevant tables and columns during the reasoning loop.

Sub-Service	Description	Status
Ingestion	Imports schemas and metadata from all connectors.	[ACTIVE]
Embedding Builder	Generates vectors for all table and column names.	[ACTIVE]
Semantic Retriever	Performs fast top-K matching using Qdrant.	[ACTIVE]
Glossary Manager	Manages business terms and their relationships.	[PLANNED]
Term-to-Field	Maps glossary terms to technical schema fields.	[PLANNED]
Entity Resolution	Builds a unified identity graph across sources.	[PLANNED]
Join Recommender	Identifies optimal join keys based on data similarity.	[PLANNED]
Change Detector	Identifies drift in source system schemas.	[PLANNED]
Freshness Tracker	Monitors data staleness using watermarking.	[PLANNED]
Quality Engine	Tracks completeness and null rate signals.	[PLANNED]
Relationship Infer	Automatically detects table-level relationships.	[PLANNED]
Sample Profiler	Prepares statistics and distributions for profiling.	[PLANNED]
Lineage Store	Tracks structural lineage across the ecosystem.	[PLANNED]
Context Packager	Prepares enriched payloads for the AI model.	[ACTIVE]
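The Semantic Retriever row refers to top-K matching over embeddings. The platform uses Qdrant for this; the sketch below deliberately swaps in a small in-memory cosine-similarity search so the ranking idea can be read without a running vector store. The `cosine` and `top_k` helpers and the toy two-dimensional vectors are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: list[float], index: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the names of the k indexed vectors most similar to the query."""
    scored = sorted(
        ((cosine(query, vector), name) for name, vector in index.items()),
        reverse=True,
    )
    return [name for _, name in scored[:k]]


# Toy embeddings for three column names (illustrative values only).
catalog = {
    "orders.total": [1.0, 0.0],
    "orders.amount": [0.9, 0.1],
    "users.email": [0.0, 1.0],
}
matches = top_k([1.0, 0.0], catalog, k=2)
```

With these toy vectors, `matches` holds the two order-related columns, ranked ahead of the unrelated email column.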

RiverDecide: Decision Engine

RiverDecide evaluates data streams against trained ML models to automate platform operations and alerts.

Sub-Service	Description	Status
Workflow Builder	Creates decision graphs for automated actions.	[PLANNED]
Rule Engine	Manages business thresholds and rule logic.	[PLANNED]
Recommend Engine	Suggests next best actions based on model output.	[PLANNED]
Impact Estimator	Estimates ROI and risk for proposed decisions.	[PLANNED]
Simulation Engine	Performs "what-if" backtesting on decision models.	[PLANNED]
Counterfactual	Analyzes alternative outcomes for past decisions.	[PLANNED]
Confidence Scorer	Assigns certainty scores to automated actions.	[PLANNED]
Approval Routing	Routes decisions to the appropriate stakeholders.	[PLANNED]
Action Selector	Targets policy-compliant actions for execution.	[PLANNED]
Decision Explainer	Provides a narrative for automated conclusions.	[PLANNED]
Outcome Tracker	Measures the effectiveness of live decisions.	[PLANNED]
Experiment Engine	Manages A/B tests for decision logic versions.	[PLANNED]
Promote Lifecycle	Manages versioning and promotion for decision workflows.	[PLANNED]
Decision Registry	Central storage for all active decision assets.	[PLANNED]

RiverOptimize: Performance Strategy

RiverOptimize identifies the most efficient execution path for each plan to minimize cost and latency.

Sub-Service	Description	Status
Routing Controller	Decides between pushdown and internal compute.	[PLANNED]
Join Optimizer	Determines the placement of cross-source joins.	[PLANNED]
Staging Strategy	Manages intermediate results for large federation.	[PLANNED]
Layout Advisor	Advises on partitioning and clustering strategies.	[PLANNED]
Predicate Advisor	Optimizes filters for source-side pushdown.	[PLANNED]
Cost Estimator	Provides pre-execution cost predictions.	[PLANNED]
Latency Estimator	Predicts total runtime for complex workflows.	[PLANNED]
Query Rewriter	Optimizes SQL and API calls for performance.	[PLANNED]
Workload Shaper	Manages batching and request rate limits.	[PLANNED]
Cache Policy	Defines TTL and storage rules for AI caching.	[PLANNED]
Concurrency Guard	Limits concurrent loops per workspace/tenant.	[PLANNED]
Adaptive Retry	Provides source-aware retry and backoff logic.	[PLANNED]
Spill Strategy	Manages memory strategy for heavy aggregations.	[PLANNED]
Telemetry Analyzer	Learns from past runs to tune future plans.	[PLANNED]
Policy Learner	Auto-tunes routing based on historical data.	[PLANNED]
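The Adaptive Retry row mentions retry and backoff logic. Since that sub-service is still planned, the following is only a generic sketch of capped exponential backoff with jitter, not the source-aware design the table describes. The function names, the default base of 0.5 seconds, and the 30-second cap are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(count: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Delays that double each step, starting at `base` and never exceeding `cap`."""
    return [min(cap, base * 2**i) for i in range(count)]


def retry(
    call: Callable[[], T],
    attempts: int = 4,
    base: float = 0.5,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call `call` up to `attempts` times, sleeping a jittered delay between failures."""
    delays = backoff_delays(attempts - 1, base, cap)
    for i in range(attempts):
        try:
            return call()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts: surface the last error
            # Jitter: scale the delay by a random factor in [0, 1).
            sleep(delays[i] * rng())
    raise ValueError("attempts must be at least 1")
```

The `sleep` and `rng` parameters are injectable so the schedule can be tested without real waiting.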

RiverViz: Visualization and Rendering

RiverViz provides the components for rendering query results and architectural explanations.

Sub-Service	Description	Status
Chart Recommender	Matches charts to specific data distributions.	[PLANNED]
Bar/Line/Area	Standard categorical and time-series charts.	[PLANNED]
Table Renderer	Dynamic tables with sorting and pagination.	[PLANNED]
Pivot Visualizer	Pivot tables and hierarchical aggregations.	[PLANNED]
Trend Analyzer	Visuals for seasonality and trend detection.	[PLANNED]
Outlier Visuals	Specialized charts for anomaly identification.	[PLANNED]
Correlation Map	Heatmaps and pairwise correlation matrices.	[PLANNED]
Map Visualizer	Geospatial data rendering on interactive maps.	[PLANNED]
Lineage Renderer	Graph view of data and prompt relationships.	[PLANNED]
Plan Graph	DAG visualization of execution workflow stages.	[PLANNED]
Policy Overlay	Displays masking and RLS rules in context.	[PLANNED]
Decision Graph	Flow diagrams for Decision Intelligence runs.	[PLANNED]
Status Timeline	Real-time progress and dependency visualization.	[PLANNED]
Training Visuals	Metric and performance charts for Model Studio.	[PLANNED]
Compare Visuals	Side-by-side Champion vs Challenger charts.	[PLANNED]
Drift Visuals	Visualizing data and prediction drift events.	[PLANNED]
Resource Charts	GPU, CPU, and Memory usage visualization.	[PLANNED]
Explain Panel	Narrative explanations for system decisions.	[PLANNED]
Report Builder	Assembly of charts into exportable reports.	[PLANNED]
Data Exporters	PNG, PDF, and CSV/JSON export modules.	[PLANNED]

RiverLearn: Adaptive Learning

RiverLearn captures and analyzes execution outcomes to improve system reasoning over time.

Sub-Service	Description	Status
Memory Store	Persists plans, outcomes, and error signatures.	[PLANNED]
Feedback Loop	Captures user thumbs-up/down and corrections.	[PLANNED]
Pattern Extractor	Identifies successful multi-step workflow plans.	[PLANNED]
Quality Scorer	Scores plans based on validity and efficiency.	[PLANNED]
Routing Learner	Tunes pushdown decisions using past performance.	[PLANNED]
Staging Learner	Optimizes intermediate data handling strategies.	[PLANNED]
Semantic Reinforce	Improves schema matching using user corrections.	[PLANNED]
Governance Learner	Tracks and learns from policy violation patterns.	[PLANNED]
Automation Learner	Tunes retry logic based on success signatures.	[PLANNED]
Model Learner	Correlates training data changes with accuracy.	[PLANNED]
Memory Profiles	Org-specific reasoning and dictionary profiles.	[PLANNED]
Test Gen	Converts failures into validated test cases.	[PLANNED]

RiverObserve: Operational Monitoring

RiverObserve provides platform-wide observability and operational health tracking.

Sub-Service	Description	Status
Trace Store	Full request/response logging across services.	[PHASE 1]
Metrics Aggregator	Real-time SLI and SLO tracking for AI calls.	[PLANNED]
Alerting Engine	Notification engine for system incidents.	[PLANNED]
Health Monitor	Tracks data source and LLM provider health.	[PHASE 1]
Job Monitor	Monitoring for Temporal workflows and queues.	[PHASE 1]
Audit Search	Indexed search for action and audit history.	[PLANNED]
Cost Aggregator	Token and compute spend tracking by tenant.	[PLANNED]
SLA Tracker	Monitors response time and data freshness goals.	[PLANNED]
Incident Correlate	Identifies root causes from cross-service logs.	[PLANNED]

Infrastructure Specifications

The Intelligence Stack relies on several high-performance infrastructure components to maintain its reasoning capabilities.

Component	Role	Technology
Vector Store	Semantic catalog search	Qdrant (Port 6333)
Workflow Engine	Reasoning orchestration	Temporal (Port 7233)
Relational DB	Metadata persistence	PostgreSQL (Port 5433)
Document Store	Connector configuration	MongoDB (Port 27017)
Cache Layer	Session and turn state	Redis (Port 6381)
Object Storage	File and artifact storage	MinIO/S3 (Port 9002)
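For local development, the ports in the table above could be collected into a single lookup. This is a sketch under assumptions: the `Endpoint` type, the service keys, and the `localhost` host are hypothetical; only the port numbers come from the table.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


# Ports are taken from the table above; "localhost" is an assumption.
SERVICES = {
    "qdrant": Endpoint("localhost", 6333),
    "temporal": Endpoint("localhost", 7233),
    "postgres": Endpoint("localhost", 5433),
    "mongodb": Endpoint("localhost", 27017),
    "redis": Endpoint("localhost", 6381),
    "minio": Endpoint("localhost", 9002),
}


def address(name: str) -> str:
    """Return the host:port string for a named service."""
    endpoint = SERVICES[name]
    return f"{endpoint.host}:{endpoint.port}"


def is_reachable(name: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the service can be opened."""
    endpoint = SERVICES[name]
    try:
        with socket.create_connection((endpoint.host, endpoint.port), timeout=timeout):
            return True
    except OSError:
        return False
```

Note that the PostgreSQL, Redis, and MinIO ports listed here differ from those products' usual defaults (5432, 6379, and 9000), so client configuration has to set them explicitly.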

LLM Implementation Details

PSA interacts with multiple Large Language Models via the RiverCore abstraction layer.

Logic	Specification
Orchestration	Agentic loop with tool-calling support.
Output Type	Strict JSON validation via Pydantic.
Model Selection	Per-turn complexity-based routing.
Fallback	Provider fallback chain per category.
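The Output Type row states that responses are validated strictly via Pydantic. To stay dependency-free, the sketch below substitutes a hand-written standard-library check for Pydantic; it illustrates what "strict" can mean here (malformed JSON, unknown fields, missing fields, and wrong types are all rejected). The `TOOL_CALL_SCHEMA` shape and the `parse_tool_call` function are hypothetical and do not reflect the platform's actual models.

```python
import json

# Hypothetical schema for one tool-call turn: field name -> required type.
TOOL_CALL_SCHEMA = {"tool": str, "arguments": dict}


def parse_tool_call(raw: str) -> dict:
    """Parse model output and reject anything that is not exactly the expected shape."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    unknown = set(data) - set(TOOL_CALL_SCHEMA)
    missing = set(TOOL_CALL_SCHEMA) - set(data)
    if unknown or missing:
        raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
    for field, expected in TOOL_CALL_SCHEMA.items():
        if not isinstance(data[field], expected):
            raise ValueError(f"{field} must be of type {expected.__name__}")
    return data
```

Rejecting unknown fields, rather than ignoring them, is what lets downstream services rely on a canonical shape regardless of which model produced the output.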

![Agent Design Loop](/img/AI-documentation/psa/agent Loop.png)

Limitations and Constraints

Developers must build within the following technical constraints for the current stack version.

Reasoning Overhead: Distributed tool calls add at least 1.2 s of latency per turn.
Context Boundaries: Model reasoning is restricted by the provider's token window (typically 128k tokens).
Concurrency Guard: Global limit of 50 concurrent loops per instance.
Provider SLA: Total execution time is subject to the availability of external LLM APIs.
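Two of the constraints above lend themselves to a short illustration: the per-turn latency floor accumulates linearly with the number of turns, and the concurrency limit can be modeled with a semaphore. This is a minimal sketch; `min_overhead_seconds`, `run_all`, and the use of `asyncio` are assumptions for illustration, and only the 1.2 s and 50-loop figures come from the list above.

```python
import asyncio
from typing import Awaitable, Callable

PER_TURN_OVERHEAD_S = 1.2   # minimum latency per turn, from the list above
MAX_CONCURRENT_LOOPS = 50   # global limit per instance, from the list above


def min_overhead_seconds(turns: int) -> float:
    """Lower bound on added latency for a loop with the given number of turns."""
    return PER_TURN_OVERHEAD_S * turns


async def run_all(
    jobs: list[Callable[[], Awaitable[object]]],
    limit: int = MAX_CONCURRENT_LOOPS,
) -> list[object]:
    """Run all jobs, allowing at most `limit` to execute at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable[object]]) -> object:
        async with semaphore:
            return await job()

    # gather preserves the order of the input jobs in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

For example, a five-turn loop carries at least about 6 seconds of tool-call overhead before any model time is counted.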
