To run your model locally and check that it works as intended before deploying, use the load and run methods. load reads your pipeline into memory from a base path, using the file paths you specified in deploy, while run executes the loaded pipeline sequentially.

from cerebrium import deploy, model_type

# deploy returns a Conduit object for the given (model type, model file) pair
conduit = deploy(('<MODEL_TYPE>', '<MODEL_FILE>'), '<MODEL_NAME>', '<API_KEY>')
conduit.load('./')  # load the pipeline into memory from the base path
conduit.run(data)   # execute the loaded pipeline on your test data

Here, data is the input you would send to your model: typically a numerical 2D/3D array for most models, or a list of strings for a language model. You can feed an ndarray or Tensor directly into this function. However, if you are using a custom data pipeline that expects another type, you may need to convert your input into the appropriate format for your model.
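For example, a minimal sketch assuming a model that expects a 2D feature array (the feature values below are made up for illustration):

import numpy as np

# Hypothetical test input: one row of four feature values
data = np.array([[5.1, 3.5, 1.4, 0.2]])
result = conduit.run(data)
print(result)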

You can also define a Conduit object directly using the Conduit class, then call its run method to test the model locally, or its deploy method to deploy the Conduit’s model flow to Cerebrium. When you construct the Conduit directly, you can specify the hardware your model runs on through the hardware parameter in the constructor. The hardware parameter is an enum that can be one of the following (see the example after this list):

  • hardware.CPU: This will run your model on a CPU. This is the default option for SKLearn, XGBoost, and SpaCy models.
  • hardware.GPU: (Deprecated) This will run your model on a T4 GPU. This is the default option for Torch, ONNX, and HuggingFace models.
  • hardware.A10: (Deprecated) This will run your model on an A10 GPU, which provides 24GB of VRAM. You should use this option if you are using a model that is too large to fit on the 16GB of VRAM that a T4 GPU provides. This will include most large HuggingFace models.
  • hardware.TURING_4000: An 8GB GPU that is great for lightweight models with less than 3B parameters in FP16.
  • hardware.TURING_5000: A 16GB GPU that is great for small models with less than 7B parameters in FP16. Most small HuggingFace models can run on this.
  • hardware.AMPERE_A4000: A 16GB GPU that is great for small models with less than 7B parameters in FP16. Significantly faster than an RTX 4000. Most small HuggingFace models can run on this.
  • hardware.AMPERE_A5000: A 24GB GPU that is great for medium models with less than 10B parameters in FP16. A great option for almost all HuggingFace models.
  • hardware.AMPERE_A6000: A 48GB GPU offering a great cost-to-performance ratio. This is great for medium models with less than 21B parameters in FP16. A great option for almost all HuggingFace models.
  • hardware.A100: An 80GB GPU offering some of the highest performance available. This is great for large models with less than 18B parameters in FP16. A great option for almost all HuggingFace models, especially if inference speed is your priority.

from cerebrium import Conduit, model_type, hardware

# Build the Conduit directly, specifying the hardware tier it should run on
conduit = Conduit(
  '<MODEL_NAME>',
  '<API_KEY>',
  [('<MODEL_TYPE>', '<MODEL_FILE>')],
  hardware=hardware.<HARDWARE_TYPE>
)
conduit.load('./')  # load the pipeline locally
conduit.run(data)   # test the model locally
conduit.deploy()    # deploy the model flow to Cerebrium
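
As a concrete sketch, assuming a scikit-learn model saved as a pickle (the deployment name, file path, and test input below are placeholders of our own, and model_type.SKLEARN is assumed to be the matching enum value):

from cerebrium import Conduit, model_type, hardware
import numpy as np

# A sketch: run a hypothetical scikit-learn model on the 24GB Ampere A5000 tier
conduit = Conduit(
  'my-sklearn-model',                     # hypothetical deployment name
  '<API_KEY>',
  [(model_type.SKLEARN, './model.pkl')],  # hypothetical model file
  hardware=hardware.AMPERE_A5000
)
conduit.load('./')
print(conduit.run(np.array([[0.1, 0.2, 0.3]])))  # hypothetical 2D test input
conduit.deploy()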

Additionally, defining a Conduit object directly allows you to add more models to your flow dynamically using the add_model method.

conduit.add_model('<MODEL_TYPE>', '<MODEL_FILE>', {<PROCESSING_FUNCTIONS>})
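
For instance, a sketch of appending a second model to an existing flow; the enum value, file path, and the processing-functions dictionary schema below are assumptions for illustration, not a documented API:

from cerebrium import model_type

# Hypothetical post-processing step for the second model's output;
# the {'post_process': ...} key is an assumed schema, not confirmed by Cerebrium
def to_list(result):
    return result.tolist()

# Append a hypothetical second model to the existing conduit
conduit.add_model(model_type.SKLEARN, './second_model.pkl', {'post_process': to_list})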