Every Cerebrium app is configured through a single `cerebrium.toml` file. The system automatically handles container lifecycle, networking, and scaling based on this configuration.
New apps can be created with the `cerebrium init` command. This command creates a `cerebrium.toml` file in the project root, which can then be edited to suit specific app requirements. Check out the Introductory Guide for more information on how to get started.
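As a rough sketch, a freshly generated `cerebrium.toml` looks something like the following. The section names follow the patterns referenced in this guide, but the specific fields and default values shown here are illustrative and vary by CLI version:

```toml
[cerebrium.deployment]
# App name and which files are shipped with the deployment
name = "my-app"
python_version = "3.11"
include = ["main.py", "cerebrium.toml"]
exclude = [".*"]

[cerebrium.hardware]
# Compute resources per instance (values illustrative)
cpu = 2
memory = 8.0
compute = "CPU"

[cerebrium.scaling]
# How many instances may run at once (values illustrative)
min_replicas = 0
max_replicas = 5
```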
To migrate an existing project, add a `cerebrium.toml` file to the root of your codebase, defining your entrypoint (`main.py` if using the default runtime, or an `entrypoint` entry in the .toml file if using a custom runtime) and including the necessary files in the `deployment` section of your `cerebrium.toml` file.

APT packages can be added to the `cerebrium.toml` file under the `[cerebrium.dependencies.apt]` section as follows:
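For example (the package names and the `"latest"` version-pin style are illustrative):

```toml
[cerebrium.dependencies.apt]
# System packages installed with apt at build time
ffmpeg = "latest"
libgl1 = "latest"
```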
Alternatively, dependencies can be read from existing files in your project by referencing them under the `[cerebrium.dependencies.paths]` section:
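For instance, to reuse dependency files already in the repository (the file names here are illustrative):

```toml
[cerebrium.dependencies.paths]
# Point at existing dependency files instead of listing packages inline
pip = "requirements.txt"
apt = "pkglist.txt"
```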
Only `nvidia/cuda` base images that include Ubuntu are supported. These include the CUDA libraries necessary to provide GPU acceleration:
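For example, tags such as `nvidia/cuda:12.2.0-runtime-ubuntu22.04` or `nvidia/cuda:12.2.0-devel-ubuntu22.04`. These tags are illustrative; check the `nvidia/cuda` registry for current Ubuntu-based variants.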
To use a custom runtime, add a `[cerebrium.runtime.custom]` section to the configuration:
- `entrypoint`: Command to start the app (string or string list).
- `port`: Port the app listens on.
- `healthcheck_endpoint`: The endpoint used to confirm instance health. If unspecified, this defaults to a TCP ping on the configured port. If the health check returns a non-200 response, the instance is considered unhealthy and is restarted if it does not recover in time.
- `readycheck_endpoint`: The endpoint used to confirm that the instance is ready to receive requests. If unspecified, this defaults to a TCP ping on the configured port. If the ready check returns a non-200 response, the instance is not a viable target for request routing.
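Putting these parameters together, a custom runtime section might look like this sketch; the server command, port, and endpoint paths are illustrative:

```toml
[cerebrium.runtime.custom]
# Command used to start the app (string or list of strings)
entrypoint = ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
# Port the app listens on; must match the server command above
port = 8000
# Optional HTTP checks; each defaults to a TCP ping on `port` if omitted
healthcheck_endpoint = "/health"
readycheck_endpoint = "/ready"
```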
When using a custom Dockerfile, application files are copied into `/cortex` inside the container, so adjust paths accordingly. Make sure the app listens on the port configured in the `port` parameter. Once deployed, the app is reachable at `https://api.cortex.cerebrium.ai/v4/{project-id}/{app-name}/your/endpoint`.
Deploy with `cerebrium deploy -y`; the system automatically detects and handles custom runtime configuration.