Partner Services are available in CLI version 1.39.0 and later.
Benefits of Partner Services
Partner Services provide several advantages:
- Quick and easy deployment
- Independent scaling of each service
- Reduced costs by running models on Cerebrium’s optimized runtime
- Reduced latency by running models on the same network as the app
- Deployment to specific regions for data compliance and latency requirements
Getting Started
To use Partner Services, configure service-specific requirements through Cerebrium’s platform. Each partner service has its own configuration needs; refer to the individual service pages linked above for detailed requirements, which may include:
- API keys and authentication details
- Service-specific configuration parameters
- Resource requirements and limitations
Scaling and Concurrency
Partner Services support independent scaling configurations:
- Use the min_replicas and max_replicas parameters to control the number of instances
- The replica_concurrency parameter determines how many concurrent requests each instance can handle
- Adjust the cooldown parameter to control how long instances remain active after processing requests
- Adjust the hardware section to control the instance type, which affects performance and/or cost
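As a rough illustration of how the replica and concurrency parameters interact, the sketch below works through the arithmetic in Python. The ScalingConfig class and both helper functions are hypothetical names introduced here for illustration; they are not part of Cerebrium's CLI or SDK, and the clamping logic is an assumption about how the bounds combine, not a description of the platform's actual autoscaler.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingConfig:
    """Hypothetical container mirroring the scaling parameters named above."""
    min_replicas: int
    max_replicas: int
    replica_concurrency: int

    def __post_init__(self) -> None:
        # Reject combinations that cannot describe a valid scaling range.
        if self.min_replicas < 0 or self.max_replicas < self.min_replicas:
            raise ValueError("require 0 <= min_replicas <= max_replicas")
        if self.replica_concurrency < 1:
            raise ValueError("replica_concurrency must be at least 1")


def peak_capacity(cfg: ScalingConfig) -> int:
    """Upper bound on requests in flight when every replica is running."""
    return cfg.max_replicas * cfg.replica_concurrency


def replicas_needed(cfg: ScalingConfig, concurrent_requests: int) -> int:
    """Replicas required for a given load, clamped to the configured bounds."""
    wanted = math.ceil(concurrent_requests / cfg.replica_concurrency)
    return max(cfg.min_replicas, min(cfg.max_replicas, wanted))


# Example values chosen for illustration only.
cfg = ScalingConfig(min_replicas=1, max_replicas=5, replica_concurrency=4)
print(peak_capacity(cfg))        # 5 replicas x 4 requests each
print(replicas_needed(cfg, 10))  # ceil(10 / 4), within the 1..5 bounds
```

With these example values, load beyond 20 simultaneous requests exceeds what the configured replicas can serve at once, so raising max_replicas or replica_concurrency would be the levers to consider. The cooldown and hardware settings are not modeled here.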