Deployment Workflows
Model Studio manages the transition from a "Trained" artifact to a "Deployed" prediction endpoint. This document outlines a model's lifecycle after training and how the Deployment Service manages deployment state and health.
The Deployment Lifecycle
Once a training job is marked as COMPLETED, the model artifact is stored in the Model Registry. To make the model available for live inference, it must be promoted through the deployment pipeline.
1. Promoting to Staging
Before a model reaches production, it is typically deployed to a Staging Environment for automated validation.
- Endpoint: Deploy Trained Model (Internal API)
- Payload:
{"environment": "staging"}
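As a rough illustration of the promotion call, the sketch below builds (but does not send) an HTTP request carrying the payload above. The base URL, the `/models/{model_id}/deploy` route, the `model_id` parameter, and the `"production"` environment value are assumptions made for the example; the actual Internal API route is not specified in this document.

```python
import json
import urllib.request

# Hypothetical base URL and route: the real Internal API path is not documented here.
BASE_URL = "https://model-studio.internal.example"

# Assumed set of environment names; only "staging" appears in the payload above.
ALLOWED_ENVIRONMENTS = {"staging", "production"}


def build_promotion_request(model_id: str, environment: str = "staging") -> urllib.request.Request:
    """Build a POST request that asks the Deployment Service to deploy a model."""
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    body = json.dumps({"environment": environment}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/models/{model_id}/deploy",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_promotion_request("model-123")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect and test before any call reaches the Deployment Service.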
The Deployment Service will:
- Provision the required Compute Resources.
- Load the model artifact from the Spatial Bucket.
- Initialize the Inference API wrapper.
- Run a series of "Smoke Tests" to verify endpoint availability.
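One way to picture the sequence above is as an ordered, fail-fast pipeline. The following is only a sketch: the step names and the `run_staging_steps` helper are hypothetical, and the lambdas stand in for real infrastructure calls that this document does not describe.

```python
from typing import Callable, List, Optional, Tuple

# A step is a (name, action) pair; the action returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_staging_steps(steps: List[Step]) -> Optional[str]:
    """Run steps in order and return the name of the first failing step.

    Returns None when every step succeeds. An exception raised by a step
    is treated as a failure of that step.
    """
    for name, action in steps:
        try:
            ok = action()
        except Exception:
            ok = False
        if not ok:
            return name
    return None


# Stand-in actions that always succeed; a real service would call infrastructure APIs.
steps = [
    ("provision_compute", lambda: True),
    ("load_artifact", lambda: True),
    ("init_inference_api", lambda: True),
    ("smoke_tests", lambda: True),
]

failed = run_staging_steps(steps)
print("staging ready" if failed is None else f"stopped at: {failed}")
```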
2. Health & Status Monitoring
A deployed model's state is continuously synchronized via background workers.
| Status | Meaning |
|---|---|
| provisioning | The infrastructure is being allocated. |
| starting | The model is loading into memory. |
| healthy | The endpoint is accepting prediction requests. |
| degraded | The model is responding but with high latency or errors. |
| failed | The endpoint is unreachable. |
Checking Health
Use the internal Deployment Registry to retrieve the current health state and replica count.
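A client waiting on a new deployment could poll the health state until it leaves the transitional statuses. The sketch below assumes a caller-supplied `fetch_status` function that returns a dictionary with `status` and `replicas` keys; that response shape, the function names, and the timeout defaults are illustrative assumptions, not the Deployment Registry's documented interface.

```python
import time
from typing import Callable

# Status names taken from the table above.
TRANSITIONAL = {"provisioning", "starting"}
SETTLED = {"healthy", "degraded", "failed"}


def wait_for_endpoint(
    fetch_status: Callable[[], dict],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the deployment reports a settled status or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = fetch_status()
        status = state["status"]
        if status in SETTLED:
            return state
        if status not in TRANSITIONAL:
            raise ValueError(f"unexpected status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"deployment still {status!r} after {timeout_s}s")
        sleep(interval_s)


# Simulated registry responses (assumed shape) so the example runs offline.
responses = iter([
    {"status": "provisioning", "replicas": 0},
    {"status": "starting", "replicas": 2},
    {"status": "healthy", "replicas": 2},
])
final = wait_for_endpoint(lambda: next(responses), sleep=lambda _: None)
print(final["status"], final["replicas"])
```

Passing `sleep` as a parameter lets tests skip the real delay; callers would normally leave it at the default.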
3. Scaling Configuration
Model Studio supports Autoscaling based on request volume and resource utilization.
{
  "scaling_config": {
    "min_replicas": 2,
    "max_replicas": 10,
    "target_cpu_utilization": 60,
    "cooldown_period": 300
  }
}
- Vertical Scaling: Managed via instance_type (e.g., moving from cpu.standard to gpu.xlarge).
- Horizontal Scaling: Managed via min_replicas and max_replicas.
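To make the horizontal scaling bounds concrete, the sketch below validates a `scaling_config` and computes a replica target from observed CPU utilization. The proportional rule shown is the one used by the Kubernetes Horizontal Pod Autoscaler; it is used here purely as an assumption, since this document does not state which formula Model Studio applies. The helper names are hypothetical, and `cooldown_period` is only range-checked because its unit is not specified here.

```python
import math


def validate_scaling_config(cfg: dict) -> None:
    """Raise ValueError if the scaling bounds are inconsistent."""
    if not 0 <= cfg["min_replicas"] <= cfg["max_replicas"]:
        raise ValueError("min_replicas must be between 0 and max_replicas")
    if not 0 < cfg["target_cpu_utilization"] <= 100:
        raise ValueError("target_cpu_utilization must be in (0, 100]")
    if cfg["cooldown_period"] < 0:
        raise ValueError("cooldown_period must be non-negative")


def desired_replicas(current_replicas: int, current_cpu_utilization: float, cfg: dict) -> int:
    """Proportional scaling (assumed rule), clamped to [min_replicas, max_replicas]."""
    raw = math.ceil(current_replicas * current_cpu_utilization / cfg["target_cpu_utilization"])
    return max(cfg["min_replicas"], min(cfg["max_replicas"], raw))


cfg = {
    "min_replicas": 2,
    "max_replicas": 10,
    "target_cpu_utilization": 60,
    "cooldown_period": 300,
}
validate_scaling_config(cfg)

print(desired_replicas(4, 90, cfg))   # 4 replicas at 90% CPU -> 6
print(desired_replicas(2, 10, cfg))   # low load, clamped up to min_replicas -> 2
print(desired_replicas(8, 120, cfg))  # heavy load, clamped down to max_replicas -> 10
```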
4. Rollback & Versioning
If a newly deployed model version performs poorly, Model Studio allows for immediate rollback to the previous stable version.
- Identify the stable version ID in the Registry.
- Submit a deployment request for that specific version.
- The Deployment Service will perform a Blue-Green Deployment to replace the unhealthy version with zero downtime.
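The first two rollback steps can be sketched as a lookup followed by a request body. Everything about the data shape here is assumed for illustration: the version records with `id` and `stable` fields, the oldest-to-newest ordering, and the `version_id` request field are hypothetical, as this document does not define the Registry schema.

```python
from typing import List, Optional


def previous_stable_version(versions: List[dict], current_id: str) -> Optional[str]:
    """Return the most recent stable version deployed before current_id.

    `versions` is assumed to be ordered oldest to newest. Returns None when
    no earlier stable version exists.
    """
    ids = [v["id"] for v in versions]
    if current_id not in ids:
        raise ValueError(f"unknown version: {current_id!r}")
    earlier = versions[: ids.index(current_id)]
    for version in reversed(earlier):
        if version["stable"]:
            return version["id"]
    return None


def build_rollback_request(version_id: str, environment: str = "production") -> dict:
    """Assemble a deployment request body for a specific version (assumed fields)."""
    return {"version_id": version_id, "environment": environment}


# Hypothetical registry contents.
registry = [
    {"id": "v1", "stable": True},
    {"id": "v2", "stable": True},
    {"id": "v3", "stable": False},
    {"id": "v4", "stable": False},  # currently deployed, performing poorly
]

target = previous_stable_version(registry, "v4")
if target is not None:
    print(build_rollback_request(target))
```

Returning `None` rather than raising when no stable predecessor exists leaves the caller to decide whether to alert or keep the current version running.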
Ensure that enable_monitoring is set to true for all production deployments to receive telemetry alerts in the Decision Intelligence dashboard.