Deployment Workflows

Model Studio moves a model from a "Trained" artifact to a "Deployed" prediction endpoint through a managed deployment pipeline. This document outlines the post-training lifecycle of a model and how the Deployment Service manages state and health.


The Deployment Lifecycle

Once a training job is marked as COMPLETED, the model artifact is stored in the Model Registry. To make the model available for live inference, it must be promoted through the deployment pipeline.


1. Promoting to Staging

Before a model is deployed to production, it is typically deployed to a Staging Environment for automated validation.

  • Endpoint: Deploy Trained Model (Internal API)
  • Payload: {"environment": "staging"}

The Deployment Service will:

  1. Provision the required Compute Resources.
  2. Load the model artifact from the Spatial Bucket.
  3. Initialize the Inference API wrapper.
  4. Run a series of "Smoke Tests" to verify endpoint availability.
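
The staging promotion above can be sketched as a request body plus an ordered run of the four steps. Only the {"environment": "staging"} payload comes from this document. The function names, the step labels, and the "production" environment value are assumptions made for illustration, not part of the Deployment Service API.

```python
import json

# Illustrative labels paraphrasing the four steps listed above;
# they are not identifiers from the Deployment Service.
STAGING_STEPS = (
    "provision_compute",
    "load_artifact",
    "init_inference_api",
    "run_smoke_tests",
)

# "staging" appears in this document; "production" is an assumed value.
ALLOWED_ENVIRONMENTS = {"staging", "production"}


def build_deploy_payload(environment: str) -> str:
    """Return the JSON body for a deploy request."""
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    return json.dumps({"environment": environment})


def run_staging(step_runner) -> list:
    """Run each step in order and stop at the first failure.

    step_runner is a caller-supplied callable (hypothetical) that takes
    a step name and returns True on success. The return value is the
    list of steps that completed.
    """
    completed = []
    for step in STAGING_STEPS:
        if not step_runner(step):
            break
        completed.append(step)
    return completed
```

Stopping at the first failed step keeps later steps, such as the smoke tests, from running against a deployment that never finished loading.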

2. Health & Status Monitoring

A deployed model's state is continuously synchronized via background workers.

Status        Meaning
provisioning  The infrastructure is being allocated.
starting      The model is loading into memory.
healthy       The endpoint is accepting prediction requests.
degraded      The model is responding but with high latency or errors.
failed        The endpoint is unreachable.
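
Client code that consumes these statuses may find it convenient to model them as an enumeration. The sketch below uses the five status strings from the table. The class name and the helper function are hypothetical, and the grouping of healthy and degraded as "serving" is an interpretation of the table, not documented behavior.

```python
from enum import Enum


class DeploymentStatus(str, Enum):
    """The five status values listed in the table above."""
    PROVISIONING = "provisioning"
    STARTING = "starting"
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    FAILED = "failed"


def is_serving(status: DeploymentStatus) -> bool:
    """True when the endpoint answers requests, even if slowly.

    Treating "degraded" as still serving is an assumption based on the
    table's wording ("responding but with high latency or errors").
    """
    return status in (DeploymentStatus.HEALTHY, DeploymentStatus.DEGRADED)
```

Parsing a raw string with DeploymentStatus("healthy") raises ValueError for any value outside the table, which surfaces unexpected statuses early.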

Checking Health

Use the internal Deployment Registry to retrieve the current health state and replica count.
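
A health check is often wrapped in a polling loop that waits for a deployment to settle. The sketch below assumes a caller-supplied fetch_state callable that returns a dict with "status" and "replicas" keys. Both the callable and the field names are hypothetical, since this document does not specify the Deployment Registry's interface.

```python
import time


def wait_until_healthy(fetch_state, timeout_s=300.0, interval_s=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_state() until the status is "healthy" or "failed".

    fetch_state: hypothetical callable returning a dict such as
        {"status": "starting", "replicas": 0}.
    sleep and clock are injectable so the loop can be tested without
    real waiting. Raises TimeoutError if neither status is reached
    before the deadline.
    """
    deadline = clock() + timeout_s
    while True:
        state = fetch_state()
        if state["status"] in ("healthy", "failed"):
            return state
        if clock() >= deadline:
            raise TimeoutError(f"deployment still {state['status']!r}")
        sleep(interval_s)
```

The loop keeps polling while the status is provisioning, starting, or degraded, so a deployment that never recovers from degraded ends in a timeout, not a silent success.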


3. Scaling Configuration

Model Studio supports Autoscaling based on request volume and resource utilization.

{
  "scaling_config": {
    "min_replicas": 2,
    "max_replicas": 10,
    "target_cpu_utilization": 60,
    "cooldown_period": 300
  }
}

  • Vertical Scaling: Managed via instance_type (e.g., moving from cpu.standard to gpu.xlarge).
  • Horizontal Scaling: Managed via min_replicas and max_replicas.
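
To illustrate how the horizontal scaling fields interact, the sketch below validates a scaling_config and computes a replica count with the proportional rule used by the Kubernetes Horizontal Pod Autoscaler. This document does not describe Model Studio's actual autoscaling algorithm, so the rule is a stand-in. The sketch also assumes that target_cpu_utilization is a percentage and that cooldown_period is in seconds.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingConfig:
    min_replicas: int
    max_replicas: int
    target_cpu_utilization: int  # assumed to be a percentage
    cooldown_period: int         # assumed to be seconds; not enforced here

    def __post_init__(self):
        if not 0 <= self.min_replicas <= self.max_replicas:
            raise ValueError("need 0 <= min_replicas <= max_replicas")
        if not 0 < self.target_cpu_utilization <= 100:
            raise ValueError("target_cpu_utilization must be in (0, 100]")


def desired_replicas(cfg: ScalingConfig, current_replicas: int,
                     observed_cpu: float) -> int:
    """Proportional scaling rule, clamped to the configured bounds.

    desired = ceil(current * observed / target). This is the HPA-style
    formula, used here as an assumption about how scaling could work.
    """
    raw = math.ceil(current_replicas * observed_cpu
                    / cfg.target_cpu_utilization)
    return max(cfg.min_replicas, min(cfg.max_replicas, raw))
```

With the configuration shown above, four replicas at 90% CPU would scale to six, while any result below 2 or above 10 is clamped to the bounds.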

4. Rollback & Versioning

If a newly deployed model version performs poorly, Model Studio allows for immediate rollback to the previous stable version.

  1. Identify the stable version ID in the Registry.
  2. Submit a deployment request for that specific version.
  3. The Deployment Service will perform a Blue-Green Deployment to replace the unhealthy version with zero downtime.
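
The rollback steps above can be sketched as two small functions: one that finds the most recent stable version before the current one, and one that models the blue-green switch. The history format, the "version_id" and "stable" fields, and the router dict are all hypothetical. The real Registry and Deployment Service interfaces are not specified in this document.

```python
def pick_rollback_version(history, current_id):
    """Return the newest stable version older than current_id.

    history: list of dicts ordered oldest to newest, each with
        "version_id" and "stable" keys (assumed shape).
    """
    ids = [v["version_id"] for v in history]
    idx = ids.index(current_id)  # raises ValueError if unknown
    for version in reversed(history[:idx]):
        if version["stable"]:
            return version["version_id"]
    raise LookupError("no earlier stable version found")


def blue_green_swap(router, new_version, health_check):
    """Switch live traffic only after the new (green) version is healthy.

    router: dict with a "live" key naming the version serving traffic.
    health_check: caller-supplied callable taking a version ID.
    If the check fails, the old (blue) version keeps serving.
    """
    if not health_check(new_version):
        return router
    return {"live": new_version, "previous": router["live"]}
```

Because traffic moves only after the green version passes its check, the blue version keeps serving throughout, which is what gives the zero-downtime property described in step 3.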

Production Readiness

Set enable_monitoring to true for every production deployment so that telemetry alerts appear in the Decision Intelligence dashboard.
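
A pre-deployment check can catch a missing monitoring flag before a request is submitted. The sketch below assumes the deployment request is a dict in which "environment" and "enable_monitoring" sit side by side. That layout and the function name are hypothetical.

```python
def production_readiness_issues(config: dict) -> list:
    """Return a list of problems that should block a production deploy.

    config is an assumed request shape, for example
    {"environment": "production", "enable_monitoring": True}.
    """
    issues = []
    if (config.get("environment") == "production"
            and config.get("enable_monitoring") is not True):
        issues.append(
            "enable_monitoring must be true for production deployments"
        )
    return issues
```

An empty list means the check found nothing to block the deployment. Staging requests pass regardless of the flag.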