Cerebrium is a serverless infrastructure platform for building and scaling data and AI workloads.

Cerebrium abstracts the infrastructure complexity so you can focus on building AI products users love!

Getting Started

Setting up and deploying an app on Cerebrium takes just a few steps:

1. Install the CLI

pip install cerebrium
cerebrium login  # Redirects to your browser for authentication

2. Initialize a Project

cerebrium init my-first-app
cd my-first-app

This creates a basic project with main.py for app code and cerebrium.toml for configuration. This is what the main.py file contains:

def run(prompt: str):
    print(f"Running on Cerebrium: {prompt}")

    return {"my_result": prompt}
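Before running anything remotely, you can sanity-check the function locally. This sketch simply calls the run function from the generated main.py above:

```python
# The run() function generated in main.py.
def run(prompt: str):
    print(f"Running on Cerebrium: {prompt}")

    return {"my_result": prompt}

# Call it locally, the same way Cerebrium will invoke it remotely.
result = run("Hello World!")
print(result)  # {'my_result': 'Hello World!'}
```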

3. Run code remotely

We can then run this function in the cloud and pass it a prompt.

cerebrium run main.py::run --prompt "Hello World!"

You should see logs that output the prompt you sent in; this code is running in the cloud! Let's now turn this into a scalable REST endpoint.

4. Deploy your app

Run the following command:

cerebrium deploy

This turns the function into a callable endpoint that accepts JSON parameters (here, prompt) and can scale automatically to thousands of requests!

Once deployed, the app is callable through a POST endpoint at https://api.cortex.cerebrium.ai/v4/{project-id}/{app-name}/{function-name}, which takes the function's arguments (here, prompt) as a JSON body.
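As a sketch, calling the deployed endpoint from Python might look like the following. The project ID, app name, and the Bearer-token auth header are illustrative assumptions; substitute the values shown in your Cerebrium dashboard:

```python
import json
import urllib.request

# Hypothetical values -- replace with your own project ID and API key.
project_id = "p-xxxxxxxx"
app_name = "my-first-app"
function_name = "run"
api_key = "YOUR_API_KEY"

url = f"https://api.cortex.cerebrium.ai/v4/{project_id}/{app_name}/{function_name}"
payload = json.dumps({"prompt": "Hello World!"}).encode()

# Build the POST request (not sent here, since the IDs above are placeholders).
request = urllib.request.Request(
    url,
    data=payload,
    headers={
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    },
    method="POST",
)
print(request.full_url)
```

Sending it is then a single `urllib.request.urlopen(request)` call, or the equivalent with any HTTP client.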

Great! You made it! Join our Community Discord for support and updates.

How It Works

Cerebrium uses containerization to ensure consistent environments and reliable scaling for apps. When code is deployed, Cerebrium packages it with all necessary dependencies into a container image. This image serves as a blueprint for creating instances that handle incoming requests. The system automatically manages scaling, creating new instances when traffic increases and removing them during quiet periods.
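One simplified way to picture the scaling decision described above is as a load-based target clamped to configured bounds. This is an illustrative sketch only, not Cerebrium's actual algorithm, and the parameter names are hypothetical:

```python
import math

def desired_instances(in_flight_requests: int, capacity_per_instance: int,
                      min_instances: int = 0, max_instances: int = 100) -> int:
    """Illustrative autoscaling rule: run enough instances to cover the
    current load, clamped between configured minimum and maximum."""
    needed = math.ceil(in_flight_requests / capacity_per_instance)
    return max(min_instances, min(needed, max_instances))

print(desired_instances(0, 10))    # quiet period -> scale to zero
print(desired_instances(250, 10))  # traffic burst -> 25 instances
```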

For a detailed explanation of how Cerebrium builds and manages container images, see our Defining Container Images Guide.

Content-Aware Storage forms the foundation of Cerebrium’s speed. This system intelligently manages container images by understanding their content structure. When launching new instances, it pulls only the specific files. This targeted approach significantly reduces cold start times and optimizes resource usage.
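The idea of pulling only the files an instance actually needs can be sketched as content-addressed chunk lookup: each chunk is identified by a digest of its content, and only chunks missing from the node's local cache are fetched. The manifest layout and helper names below are hypothetical, for illustration only:

```python
import hashlib

def chunk_digest(data: bytes) -> str:
    """Identify a chunk by the hash of its content."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical image manifest: file path -> content chunks.
image_chunks = {
    "model.bin": [b"weights-part-1", b"weights-part-2"],
    "app/main.py": [b"def run(prompt): ..."],
}

# One chunk is already cached on the node from a previous deployment.
local_cache = {chunk_digest(b"weights-part-1")}

def chunks_to_pull(manifest, cache):
    """Return only the chunks whose digests are missing from the cache."""
    return [chunk
            for file_chunks in manifest.values()
            for chunk in file_chunks
            if chunk_digest(chunk) not in cache]

missing = chunks_to_pull(image_chunks, local_cache)
print(f"{len(missing)} of 3 chunks need to be pulled")
```

Because unchanged chunks are never re-downloaded, a new instance only transfers what is genuinely new, which is what keeps cold starts short.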