- Launch your code in the cloud in seconds
- Define your own container environments or bring your own Dockerfile
- Run on CPUs or GPUs at scale—with support for thousands of concurrent containers
- Scale based on concurrency, requests per second, or CPU/memory utilization
- Serve WebSockets, REST APIs, or any ASGI-compatible app
- Store model weights, files, and more with distributed storage
- Pay only for the compute you use — billed by the second
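Scaling behavior of the kind described above is typically set in the app's TOML config. A purely illustrative sketch; the section and key names here are assumptions, so consult the Cerebrium configuration docs for the actual schema:

```toml
# Illustrative scaling settings (key names are assumptions, not the confirmed schema)
[cerebrium.scaling]
min_replicas = 0    # scale to zero when idle
max_replicas = 10   # cap on concurrent instances
cooldown = 30       # seconds to wait before scaling down idle instances
```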
## Getting Started
Setting up and deploying an app on Cerebrium takes just a few steps:

1. Install the CLI
2. Initialize a Project
Initializing a project creates `main.py` for app code and `cerebrium.toml` for configuration. This is what the `main.py` file contains:
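A minimal sketch of such a starter file, assuming the scaffold exposes a single callable function (the `run` name and `prompt` parameter are illustrative; the file Cerebrium actually generates may differ):

```python
# Illustrative starter main.py; the generated scaffold may differ.
def run(prompt: str) -> dict:
    # Top-level functions in main.py become remotely callable once deployed.
    # Replace this echo with real work, such as model inference.
    return {"result": f"Processed prompt: {prompt}"}
```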
3. Run code remotely
We can then run this function in the cloud and pass it a prompt.

4. Deploy your app
Run the deploy command from the CLI. Once deployed, each function in your app is exposed at an endpoint of the form:

https://api.aws.us-east-1.cerebrium.ai/v4/{project-id}/{app-name}/{function-name}

The endpoint accepts a JSON body with a `prompt` parameter.
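The deployed function can then be invoked over HTTP. A sketch using Python's standard library, where the project ID, app name, function name, and the Bearer-token auth header are placeholders to replace with the real values from your dashboard:

```python
import json
import urllib.request

# Placeholder values; substitute your real project ID, app name, and API key.
url = "https://api.aws.us-east-1.cerebrium.ai/v4/p-XXXX/my-app/run"
payload = json.dumps({"prompt": "Hello, Cerebrium!"}).encode()

req = urllib.request.Request(
    url,
    data=payload,  # supplying data makes this a POST request
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <YOUR_API_KEY>",  # from the Cerebrium dashboard
    },
)
# response = urllib.request.urlopen(req)  # uncomment once real credentials are in place
# print(json.load(response))
```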
Great! You made it! Join our Community Discord for support and updates.
## How It Works
Cerebrium uses containerization to ensure consistent environments and reliable scaling for apps. When code is deployed, Cerebrium packages it with all necessary dependencies into a container image. This image serves as a blueprint for creating instances that handle incoming requests. The system automatically manages scaling, creating new instances when traffic increases and removing them during quiet periods. For a detailed explanation of how Cerebrium builds and manages container images, see our Defining Container Images Guide.

Content-Aware Storage forms the foundation of Cerebrium's speed. This system intelligently manages container images by understanding their content structure. When launching new instances, it pulls only the specific files needed to start serving, rather than the entire image. This targeted approach significantly reduces cold start times and optimizes resource usage.