Cerebrium’s default runtime covers most app needs. For more control, use ASGI or WSGI servers through the custom runtime feature, which enables custom authentication, dynamic batching, frontend dashboards, public endpoints, and WebSocket connections.

Setting Up Custom Servers

A basic FastAPI server running as a custom server on Cerebrium:
from fastapi import FastAPI
app = FastAPI()

@app.post("/hello")
def hello():
    return {"message": "Hello Cerebrium!"}

@app.get("/health")
def health():
    return "OK"

@app.get("/ready")
def ready():
    return "OK"
Configure this server in cerebrium.toml by adding a custom runtime section:
[cerebrium.runtime.custom]
port = 5000
entrypoint = ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "5000"]
healthcheck_endpoint = "/health"
readycheck_endpoint = "/ready"

[cerebrium.dependencies.pip]
pydantic = "latest"
numpy = "latest"
loguru = "latest"
fastapi = "latest"
uvicorn = "latest"
The configuration uses four key parameters:
  • entrypoint: The command that starts your server.
  • port: The port your server listens on.
  • healthcheck_endpoint: The endpoint used to confirm instance health. If unspecified, defaults to a TCP ping on the configured port. If the health check returns a non-200 response, the instance is considered unhealthy and is restarted if it does not recover in time.
  • readycheck_endpoint: The endpoint used to confirm that the instance is ready to receive requests. If unspecified, defaults to a TCP ping on the configured port. If the ready check returns a non-200 response, the instance is not a viable target for request routing.
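To illustrate the difference between the two checks, here is a minimal, dependency-free ASGI sketch in which /health always answers 200 while /ready answers 503 until start-up work has finished. The `state` flag, the `mark_ready` helper, and the 503 status are assumptions made for illustration; the same logic could equally live in the /ready handler of the FastAPI app above.

```python
# Hypothetical readiness flag; flip it once start-up work
# (for example, loading a model) has completed.
state = {"ready": False}


def mark_ready() -> None:
    """Call this after the app has finished its start-up work."""
    state["ready"] = True


async def respond(send, status: int, body: bytes) -> None:
    """Send a minimal plain-text HTTP response over the ASGI send channel."""
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": body})


async def app(scope, receive, send):
    """Bare ASGI application exposing /health and /ready."""
    if scope["type"] != "http":
        return  # this sketch ignores lifespan and websocket scopes

    if scope["path"] == "/health":
        # The process is up and serving, so report healthy.
        await respond(send, 200, b"OK")
    elif scope["path"] == "/ready":
        if state["ready"]:
            await respond(send, 200, b"OK")
        else:
            # A non-200 response keeps the instance out of request routing.
            await respond(send, 503, b"NOT READY")
    else:
        await respond(send, 404, b"Not Found")
```

Saved as main.py, this app should be servable with the same uvicorn entrypoint shown in the configuration above, since uvicorn accepts any ASGI callable.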
For ASGI applications like FastAPI, include the appropriate server package (like uvicorn) in your dependencies. After deployment, your endpoints become available at https://api.aws.us-east-1.cerebrium.ai/v4/[project-id]/[app-name]/your/endpoint.
The FastAPI Server Example provides a complete implementation.
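The URL pattern above can be assembled programmatically. The following is a small sketch using only the standard library; the `endpoint_url` and `build_request` helpers and the example project ID and app name are hypothetical, and authentication headers are left out because this page does not specify them.

```python
import json
import urllib.request

# Base URL taken from the endpoint pattern described above.
BASE_URL = "https://api.aws.us-east-1.cerebrium.ai/v4"


def endpoint_url(project_id: str, app_name: str, path: str) -> str:
    """Join the project ID, app name, and route into a full endpoint URL."""
    return f"{BASE_URL}/{project_id}/{app_name}/{path.lstrip('/')}"


def build_request(url: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request for the given URL."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical identifiers; substitute your own project ID and app name.
url = endpoint_url("my-project-id", "my-app", "/hello")
request = build_request(url, {})
# To actually call the deployed app, add whatever auth your app requires, then:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```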

Request Headers

Custom web servers receive the Cerebrium run ID in the X-Request-Id header on every request. This corresponds to the internal run_id and is useful for tracking and debugging.
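One way to use this header is to log it on every request. The sketch below is a plain ASGI middleware, so it has no dependencies; the `get_run_id` helper, the `RunIdLoggingMiddleware` class, and the logger name are hypothetical names chosen for illustration.

```python
import logging
from typing import Optional

logger = logging.getLogger("run_id")


def get_run_id(scope) -> Optional[str]:
    """Return the X-Request-Id header value from an ASGI scope, if present."""
    for name, value in scope.get("headers", []):
        # ASGI delivers header names as lower-cased bytes.
        if name == b"x-request-id":
            return value.decode("latin-1")
    return None


class RunIdLoggingMiddleware:
    """Log the Cerebrium run ID for each HTTP request, then pass it through."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            logger.info("run_id=%s path=%s", get_run_id(scope), scope.get("path"))
        # Always delegate to the wrapped application.
        await self.app(scope, receive, send)
```

With the FastAPI app above, this could likely be attached with `app.add_middleware(RunIdLoggingMiddleware)`, since Starlette accepts pure ASGI middleware classes; inside an individual FastAPI handler, reading `request.headers.get("x-request-id")` is an alternative.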