Multi-region deployment is currently in beta. We will ship frequent updates
and improvements over the next few months to bring it to full functionality.
Please reach out on our Discord about features or functionality you would
like to see.
Why Use Multi-Region Deployment
- Reduced Latency: Deploy closer to users for faster response times and better experience with real-time applications like voice agents and LLMs
- Data Residency: Meet data protection requirements by keeping sensitive data within specific geographic regions to comply with regulations like GDPR and CCPA
- High Availability: Ensure fault tolerance and continuous service through geographic redundancy, disaster recovery, and load balancing across multiple regions
Available Regions
Cerebrium supports deployment across three continents, with the following regions:
United States
- us-east-1 (N. Virginia) - Default region
Europe
- eu-west-2 (United Kingdom)
Asia Pacific
- ap-south-1 (India) - Coming soon
Additional regions are being evaluated and will be added based on user demand
and infrastructure availability. Contact
support if you need deployment in a specific
region not currently listed.
CLI Configuration
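You can configure your CLI to work with different regions in two ways:
- Set a default region for your CLI sessions with `cerebrium region set`. This region is used for all subsequent commands.
- Pass the `--region` flag to an individual command to choose the region for that command only.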
The `--region` flag takes precedence over the default region set with `cerebrium region set`. This allows you to temporarily use a different region without changing your default configuration.
App Deployment
Configure your app's deployment region using the `region` parameter in the `[cerebrium.hardware]` section of your `cerebrium.toml` file.
Pricing
Pricing varies by region based on local infrastructure costs and availability.
GPU Availability by Region
GPU availability and pricing vary across regions due to infrastructure constraints and local demand:
GPU Model | US East | EU West | AP South |
---|---|---|---|
HOPPER_H100 | ✅ | ✅ | ✅ |
AMPERE_A100_40GB | ✅ | ✅ | ✅ |
ADA_L40S | ✅ | ❌ | ❌ |
ADA_L4 | ✅ | ✅ | ✅ |
AMPERE_A10 | ✅ | ✅ | ✅ |
TURING_T4 | ✅ | ✅ | ✅ |
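
If you want to check a GPU and region combination before deploying, the table above can be encoded as a small lookup. This is only a sketch: the data is copied by hand from the table and will go stale if availability changes. Mapping the column names to the region identifiers `us-east-1`, `eu-west-2`, and `ap-south-1` is an assumption based on the Available Regions list, and the helpers `is_available` and `regions_for` are hypothetical names.

```python
# Availability copied from the table above. Column-to-region mapping is assumed:
# US East -> us-east-1, EU West -> eu-west-2, AP South -> ap-south-1.
ALL_REGIONS = {"us-east-1", "eu-west-2", "ap-south-1"}

GPU_AVAILABILITY = {
    "HOPPER_H100": set(ALL_REGIONS),
    "AMPERE_A100_40GB": set(ALL_REGIONS),
    "ADA_L40S": {"us-east-1"},  # only marked available in US East
    "ADA_L4": set(ALL_REGIONS),
    "AMPERE_A10": set(ALL_REGIONS),
    "TURING_T4": set(ALL_REGIONS),
}


def is_available(gpu: str, region: str) -> bool:
    """Return True if the table marks this GPU as available in the region."""
    return region in GPU_AVAILABILITY.get(gpu, set())


def regions_for(gpu: str) -> list[str]:
    """Return the regions offering this GPU, sorted alphabetically."""
    return sorted(GPU_AVAILABILITY.get(gpu, set()))


if __name__ == "__main__":
    print(is_available("ADA_L40S", "eu-west-2"))
    print(regions_for("ADA_L40S"))
```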
Limitations
- Each region requires a separate deployment. Apps deployed in one region do not automatically replicate to other regions.
- Each region has its own isolated persistent storage volume. Data stored in `/persistent-storage` in one region is not accessible from other regions.
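
Because apps do not replicate between regions, running an app in several regions means deploying it once per region. The sketch below shows one way to script that. It assumes that `cerebrium deploy` accepts the `--region` flag described under CLI Configuration and that the script runs from the app directory. The function names are hypothetical, and how the flag interacts with a `region` value in `cerebrium.toml` is not covered here. By default it only prints the commands it would run.

```python
import subprocess


def build_deploy_commands(regions: list[str]) -> list[list[str]]:
    """Return one `cerebrium deploy` invocation per target region."""
    return [["cerebrium", "deploy", "--region", region] for region in regions]


def deploy_everywhere(regions: list[str], dry_run: bool = True) -> None:
    """Deploy to each region in turn, stopping at the first failure."""
    for command in build_deploy_commands(regions):
        if dry_run:
            # Show the command without invoking the CLI.
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if the CLI exits non-zero.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    deploy_everywhere(["us-east-1", "eu-west-2"])
```

Persistent storage is not shared either, so any files an app expects under `/persistent-storage` have to be uploaded to each region separately.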