GPUs accelerate computational workloads through parallel processing. Originally designed for graphics rendering, modern GPUs are essential for AI models, large-scale data processing, and other compute-intensive applications. Cerebrium provides GPU access through configuration in the cerebrium.toml file, without requiring infrastructure management.

Specifying GPUs

Configure GPUs in the [cerebrium.hardware] section of cerebrium.toml, specifying the type (compute parameter) and quantity (gpu_count). Additional deployment and scaling considerations are covered in the sections below.
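For example, a minimal [cerebrium.hardware] section requesting a single A10 might look like the sketch below; the cpu and memory values are illustrative, not recommendations:

```toml
[cerebrium.hardware]
compute = "AMPERE_A10"   # GPU identifier from the table below
gpu_count = 1            # number of GPUs per instance
cpu = 4                  # vCPU count (illustrative)
memory = 16.0            # RAM in GB (illustrative)
```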

Available GPUs

The platform offers GPUs ranging from cost-effective development options to high-end enterprise hardware.
| GPU Model | Identifier | VRAM (GB) | Max GPUs | Plan required |
|---|---|---|---|---|
| NVIDIA B300 | BLACKWELL_B300 | 262 | 8 | Enterprise |
| NVIDIA B200 | BLACKWELL_B200 | 180 | 8 | Enterprise |
| NVIDIA H200 | HOPPER_H200 | 141 | 8 | Enterprise |
| NVIDIA H100 | HOPPER_H100 | 80 | 8 | Enterprise |
| NVIDIA A100 | AMPERE_A100_80GB | 80 | 8 | Standard |
| NVIDIA A100 | AMPERE_A100_40GB | 40 | 8 | Standard |
| NVIDIA L40s | ADA_L40 | 48 | 8 | Hobby+ |
| NVIDIA L4 | ADA_L4 | 24 | 8 | Hobby+ |
| NVIDIA A10 | AMPERE_A10 | 24 | 8 | Hobby+ |
| NVIDIA T4 | TURING_T4 | 16 | 8 | Hobby+ |
| AWS Trainium | TRN1 | 32 | 8 | Hobby+ |
The identifier is the value used for the compute parameter in cerebrium.toml. It combines the GPU architecture generation with the model name to avoid ambiguity. GPUs can also be selected with the --compute and --gpu-count flags during application initialization.
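As a sketch of that workflow, the flags can be passed when initializing a new app; the app name below is a placeholder, and the exact init command should be confirmed against the Cerebrium CLI reference:

```shell
# Initialize a new app with the GPU type and count set up front
# ("my-gpu-app" is a placeholder name)
cerebrium init my-gpu-app --compute AMPERE_A100_80GB --gpu-count 2
```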

Multi-GPU Configuration

Multiple GPUs are configured in the cerebrium.toml file:
```toml
[cerebrium.hardware]
compute = "AMPERE_A100_80GB"
gpu_count = 4        # Number of GPUs needed
cpu = 8
memory = 128.0
```
GPU availability varies by region and provider. Restricting deployments to a specific provider or region shrinks the pool of capacity that can serve them, so requests are more likely to queue. For guaranteed burst capacity, contact the team about an enterprise plan.
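Where placement constraints are supported, they can sit alongside the GPU settings. The sketch below assumes provider and region keys in [cerebrium.hardware]; the key names and values are assumptions and should be checked against the cerebrium.toml reference for your version:

```toml
[cerebrium.hardware]
compute = "AMPERE_A100_80GB"
gpu_count = 4
provider = "aws"        # assumed key: pin scheduling to one provider
region = "us-east-1"    # assumed key: pin scheduling to one region
```

Pinning both trades scheduling flexibility for placement control, which is the trade-off behind the queuing behavior described above.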