Discover Schema

Sprint 3

Discovers and caches the database schema for a data source. Discovery runs asynchronously in a thread pool executor and can take a while for large databases.

Endpoint

POST /api/v1/data-sources/{data_source_id}/discover-schema

Headers

| Header | Required | Description |
|---|---|---|
| Authorization | Yes | Bearer <access_token> |

Path Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| data_source_id | integer | Yes | Data source ID |

Query Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| schema_name | string | No | Specific schema to discover (if not provided, discovers all schemas) |
| refresh | boolean | No | Force refresh even if schema is already cached (default: false) |
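
As a minimal sketch of how the two optional query parameters combine, the helper below builds the request URL and leaves out any parameter that is not set. The function name `build_discover_url` is hypothetical, the base URL is copied from the curl example on this page, and sending the boolean as lowercase `true` is an assumption based on that same example.

```python
from typing import Optional
from urllib.parse import urlencode

# Base URL copied from the curl example on this page.
BASE_URL = "https://api.rivergen.com/api/v1"


def build_discover_url(data_source_id: int,
                       schema_name: Optional[str] = None,
                       refresh: bool = False) -> str:
    """Build the discover-schema URL, omitting parameters left at their defaults."""
    params = {}
    if schema_name is not None:
        params["schema_name"] = schema_name
    if refresh:
        # Assumption: the API accepts lowercase "true", as in the curl example.
        params["refresh"] = "true"
    url = f"{BASE_URL}/data-sources/{data_source_id}/discover-schema"
    # Only append "?" when there is at least one query parameter.
    return f"{url}?{urlencode(params)}" if params else url
```

Calling `build_discover_url(1)` yields a URL with no query string, which per the table above asks the API to discover all schemas and reuse any cached result.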

Response

Success (200)

{
  "success": true,
  "data": {
    "data_source_id": 1,
    "schemas_discovered": [
      {
        "schema_name": "public",
        "tables_count": 25,
        "views_count": 5
      },
      {
        "schema_name": "analytics",
        "tables_count": 10,
        "views_count": 2
      }
    ],
    "total_tables": 35,
    "total_columns": 450,
    "discovered_at": "2024-12-01T11:00:00Z",
    "discovery_duration_ms": 5234
  },
  "message": "Schema discovery completed"
}
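
A client might reduce this payload to a few statistics. The sketch below is one way to do that, using only the field names shown in the sample response. The function name `summarize_discovery` is hypothetical. The check that `total_tables` equals the sum of the per-schema `tables_count` values holds for the sample above (25 + 10 = 35) but is an observation about that sample, not a documented guarantee.

```python
import json


def summarize_discovery(body: str) -> dict:
    """Parse a discover-schema response body and return a small summary."""
    payload = json.loads(body)
    if not payload.get("success"):
        # Assumption: failed responses also carry a "message" field.
        raise ValueError(payload.get("message", "schema discovery failed"))
    data = payload["data"]
    # Map each schema name to its table count.
    per_schema = {
        s["schema_name"]: s["tables_count"] for s in data["schemas_discovered"]
    }
    return {
        "schemas": per_schema,
        "total_tables": data["total_tables"],
        # True for the sample response; not a documented invariant.
        "tables_match": sum(per_schema.values()) == data["total_tables"],
        "duration_seconds": data["discovery_duration_ms"] / 1000,
    }
```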

Error Codes

| Status | Code | Description |
|---|---|---|
| 401 | UNAUTHORIZED | Invalid or missing authentication token |
| 403 | FORBIDDEN | User is not a member of any organization |
| 404 | NOT_FOUND | Data source not found |
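
One way a client could surface these errors is to map each documented status to an exception. This is a sketch only: `DiscoverSchemaError` and `raise_for_status` are hypothetical names, and the fallback for undocumented statuses is an assumption, since this page lists only the three codes above.

```python
class DiscoverSchemaError(Exception):
    """Raised when the discover-schema endpoint returns a non-200 status."""

    def __init__(self, status: int, code: str, description: str):
        super().__init__(f"{status} {code}: {description}")
        self.status = status
        self.code = code


# Status, code, and description copied from the error table on this page.
DOCUMENTED_ERRORS = {
    401: ("UNAUTHORIZED", "Invalid or missing authentication token"),
    403: ("FORBIDDEN", "User is not a member of any organization"),
    404: ("NOT_FOUND", "Data source not found"),
}


def raise_for_status(status: int) -> None:
    """Return silently on 200; otherwise raise DiscoverSchemaError."""
    if status == 200:
        return
    # Assumption: any status not in the table is reported generically.
    code, description = DOCUMENTED_ERRORS.get(
        status, ("UNKNOWN", "Unexpected status")
    )
    raise DiscoverSchemaError(status, code, description)
```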

Features

  • Async operation (runs in thread pool executor)
  • Discovers all schemas or a specific schema
  • Caches discovered schema for performance
  • Force refresh option
  • Returns discovery statistics
  • May take time for large databases

Important Notes

  • This operation is asynchronous and may take time for large databases
  • Schema discovery runs in a thread pool executor to prevent blocking the event loop
  • Discovered schema is cached and used for query generation
  • Use refresh=true to force re-discovery even if schema is already cached
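
The notes above describe a common pattern: blocking work is handed to a thread pool so the event loop stays responsive, and results are cached unless a refresh is requested. The following is a generic illustration of that pattern in Python's `asyncio`, not the service's actual implementation. All names (`introspect_blocking`, `discover_schema`, the in-memory `_schema_cache`) are hypothetical, and the sleep stands in for real database introspection.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool and in-memory cache, keyed by data source ID.
_executor = ThreadPoolExecutor(max_workers=4)
_schema_cache: dict = {}


def introspect_blocking(data_source_id: int) -> dict:
    """Stand-in for slow, blocking database introspection."""
    time.sleep(0.05)  # simulates the slow part
    return {"data_source_id": data_source_id, "discovered_at": time.time()}


async def discover_schema(data_source_id: int, refresh: bool = False) -> dict:
    """Return the cached schema, or run discovery in the thread pool."""
    if not refresh and data_source_id in _schema_cache:
        return _schema_cache[data_source_id]
    loop = asyncio.get_running_loop()
    # The blocking call runs in a worker thread; the event loop keeps
    # serving other coroutines while this one awaits the result.
    result = await loop.run_in_executor(
        _executor, introspect_blocking, data_source_id
    )
    _schema_cache[data_source_id] = result
    return result
```

In this sketch, a second call with the same ID returns the cached object immediately, while `refresh=True` always repeats the slow step and replaces the cache entry.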

Example

curl -X POST "https://api.rivergen.com/api/v1/data-sources/1/discover-schema?schema_name=public&refresh=true" \
-H "Authorization: Bearer <access_token>"
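
For callers working in Python, a rough equivalent of the curl command can be written with the standard library. The helper name `build_request` is hypothetical, and the 120-second timeout in the commented usage is an arbitrary assumption chosen because discovery may be slow for large databases.

```python
import urllib.request


def build_request(access_token: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    url = (
        "https://api.rivergen.com/api/v1/data-sources/1/discover-schema"
        "?schema_name=public&refresh=true"
    )
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {access_token}"},
    )


# To send it (not executed here; timeout value is an assumption):
# with urllib.request.urlopen(build_request(token), timeout=120) as resp:
#     body = resp.read().decode("utf-8")
```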