Scaling & Reliability Patterns

Async Workers: A custom async worker pool processes simulation jobs
Celery Alternative: Celery can be used instead when task scheduling is more complex
Resource Isolation: Workers run in separate processes or containers
Priority Queue: High-priority simulations are processed first

DIA is designed for horizontal scaling and high availability using several proven patterns.

Partitioning Strategy: Each Kafka topic is partitioned to enable parallel processing:
- decision.simulation.request: Partitioned by tenant_id for tenant isolation
- train.results: Partitioned by model_id for model-specific processing
- model.metrics: Partitioned by tenant_id for tenant-scoped metrics

Consumer Groups: Multiple consumer instances in the same group each process a different subset of partitions:
- dia-simulator-group: 3 instances process 3 partitions in parallel
- dia-group: 2 instances handle general events

Adding consumers to a group increases throughput roughly linearly, up to the number of partitions in the topic; consumers beyond that count sit idle.
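
The partitioning and consumer-group behavior described above can be sketched in a few lines. This is an illustration only, not DIA's code: `partition_for` uses a CRC32 hash as a stand-in for Kafka's real partitioner, and `assign_partitions` is a simplified round-robin assignment. Both function names are hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key (e.g. a tenant_id) to a partition.

    Assumption: CRC32 stands in for Kafka's own key hashing. The point is
    only that the same key always lands on the same partition.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Spread partitions over the consumers of one group, round-robin.

    Each partition is owned by exactly one consumer, so consumers beyond
    the partition count receive nothing and sit idle.
    """
    assignment: dict[str, list[int]] = {name: [] for name in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    # The same tenant key maps to the same partition on every call.
    print(partition_for("tenant-42", 3), partition_for("tenant-42", 3))
    # Three partitions, four consumers: the fourth consumer gets no work.
    print(assign_partitions(3, ["sim-1", "sim-2", "sim-3", "sim-4"]))
```

Because every message for a given key goes to one partition, and each partition has one owner within a group, messages for a given tenant or model are handled in order by a single consumer at a time.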

Heavy simulations are processed off the FastAPI request thread:
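
As a rough sketch of what an async worker pool with a priority queue might look like, the example below uses only `asyncio` from the standard library. The names `worker` and `run_jobs`, the job tuple layout, and the placeholder "simulation" step are assumptions made for illustration; they are not taken from DIA.

```python
import asyncio
import itertools


async def worker(queue: asyncio.PriorityQueue, results: list) -> None:
    """Pull jobs forever; the lowest priority number is served first."""
    while True:
        _priority, _seq, job_id = await queue.get()
        try:
            await asyncio.sleep(0)  # placeholder for the real simulation work
            results.append(job_id)
        finally:
            queue.task_done()


async def run_jobs(jobs: list[tuple[int, str]], num_workers: int = 1) -> list[str]:
    """Enqueue (priority, job_id) pairs and process them with a worker pool."""
    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    seq = itertools.count()  # tie-breaker keeps FIFO order within a priority
    for priority, job_id in jobs:
        queue.put_nowait((priority, next(seq), job_id))

    results: list[str] = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(num_workers)]
    await queue.join()  # wait until every queued job has been processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    order = asyncio.run(run_jobs([(5, "batch-a"), (0, "urgent"), (5, "batch-b")]))
    print(order)
```

In a FastAPI service, the request handler would typically only enqueue the job and return an identifier, leaving the workers to run the simulation in the background. CPU-bound work would still need separate processes or containers, since coroutines share one thread.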

Redis is used for multiple caching layers:
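
The specific caching layers are not listed here, so the following is only a generic cache-aside sketch with a time-to-live. `InMemoryCache` is a hypothetical stand-in that mimics the `get` and `setex` methods of a Redis client so the example runs without a server. `get_or_compute` and the key name are likewise assumptions.

```python
import json
import time


class InMemoryCache:
    """Tiny stand-in for a Redis client (get/setex only), for illustration."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # entry has expired
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


def get_or_compute(cache, key, compute, ttl_seconds=300):
    """Cache-aside read: return the cached value or compute and store it."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    result = compute()
    cache.setex(key, ttl_seconds, json.dumps(result))
    return result


if __name__ == "__main__":
    cache = InMemoryCache()
    calls = []

    def expensive():
        calls.append(1)
        return {"score": 0.9}

    print(get_or_compute(cache, "sim:result:abc", expensive))
    print(get_or_compute(cache, "sim:result:abc", expensive))
    print(len(calls))  # the expensive function ran only once
```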

Cache invalidation:
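
One common invalidation approach, shown here as an assumption rather than as DIA's actual policy, is to delete the cached entry whenever the underlying record is written. The `InMemoryCache` class, the `metrics:` key prefix, and both helper functions are hypothetical. A plain dict plays the role of the database.

```python
class InMemoryCache:
    """Stand-in exposing get/set/delete like a cache client; illustration only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def metrics_key(tenant_id: str) -> str:
    return f"metrics:{tenant_id}"


def read_metrics(db: dict, cache: InMemoryCache, tenant_id: str):
    """Serve from the cache when possible, otherwise load and cache."""
    key = metrics_key(tenant_id)
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = db[tenant_id]
    cache.set(key, value)
    return value


def write_metrics(db: dict, cache: InMemoryCache, tenant_id: str, value) -> None:
    """Update the source of truth first, then drop the stale cache entry."""
    db[tenant_id] = value
    cache.delete(metrics_key(tenant_id))


if __name__ == "__main__":
    db, cache = {"t1": 1}, InMemoryCache()
    print(read_metrics(db, cache, "t1"))  # loads from db and caches
    write_metrics(db, cache, "t1", 2)     # invalidates the cached entry
    print(read_metrics(db, cache, "t1"))  # reloads the fresh value
```

Deleting rather than overwriting the cached value keeps the write path simple, because the next read repopulates the entry from the source of truth.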

All simulation requests support idempotency keys:
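
The idea behind idempotency keys can be illustrated with a minimal in-process sketch: a repeated request carrying the same key returns the stored result instead of running the simulation again. `IdempotentSubmitter` and its method names are hypothetical, and the dict-based store is an assumption. A multi-instance deployment would need a shared store with an atomic set-if-absent operation.

```python
class IdempotentSubmitter:
    """Runs each idempotency key at most once and replays the stored result."""

    def __init__(self, run_simulation):
        self._run = run_simulation
        self._results = {}  # idempotency_key -> result of the first run

    def submit(self, idempotency_key: str, payload: dict):
        if idempotency_key in self._results:
            # Duplicate request (e.g. a client retry): do not run again.
            return self._results[idempotency_key]
        result = self._run(payload)
        self._results[idempotency_key] = result
        return result


if __name__ == "__main__":
    runs = []

    def fake_simulation(payload):
        runs.append(payload)
        return {"status": "done", "input": payload["x"]}

    submitter = IdempotentSubmitter(fake_simulation)
    first = submitter.submit("key-123", {"x": 1})
    retry = submitter.submit("key-123", {"x": 1})
    print(first == retry, len(runs))  # same result, simulation ran once
```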

External service calls use circuit breakers:
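
A circuit breaker stops calling a failing dependency for a while, so failures return quickly instead of piling up. The sketch below is a generic, minimal version of the pattern, not the breaker DIA uses. The class name, the threshold and timeout defaults, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


if __name__ == "__main__":
    breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0)

    def flaky():
        raise ConnectionError("service unavailable")

    for _ in range(3):
        try:
            breaker.call(flaky)
        except Exception as exc:
            print(type(exc).__name__)
```

The injectable clock exists only to make the timeout testable without sleeping.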
