CUDA (Compute Unified Device Architecture) enables apps to use graphics cards (GPUs) to speed up calculations. Where a standard processor (CPU) works through a few tasks at a time on a handful of powerful cores, a GPU can run thousands of small tasks simultaneously across its many simpler cores.
CUDA connects apps to graphics cards, splitting large tasks into smaller pieces that can be processed at the same time. This makes operations like image processing and complex mathematical calculations much faster than using a standard processor alone.
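To make the idea concrete, here is a minimal, self-contained CUDA sketch (illustrative only, not specific to Cerebrium) in which each GPU thread handles one element of an array, so a million additions run as many small pieces in parallel:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Each GPU thread computes one element, so the whole array
// is processed in parallel rather than one element at a time.
__global__ void addArrays(const float *a, const float *b, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;          // ~1M elements
    size_t bytes = n * sizeof(float);

    float *a, *b, *out;
    cudaMallocManaged(&a, bytes);   // unified memory: visible to CPU and GPU
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&out, bytes);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // enough blocks to cover all elements
    addArrays<<<blocks, threads>>>(a, b, out, n);
    cudaDeviceSynchronize();        // wait for the GPU to finish

    printf("out[0] = %f\n", out[0]);  // prints 3.0
    cudaFree(a); cudaFree(b); cudaFree(out);
    return 0;
}
```

The `<<<blocks, threads>>>` launch syntax is where the large task gets split: CUDA maps each of the million elements onto its own lightweight thread.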
Some apps need direct access to CUDA system libraries and tools. The CUDA base image provides a complete CUDA toolkit environment, which is typically necessary when apps:
- Compile custom CUDA code
- Access low-level CUDA features
- Need specific CUDA driver versions
- Require CUDA development tools
The base image is set with the `docker_base_image_url` field in the cerebrium.toml file. For example, to build on the full NVIDIA CUDA development image:
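```toml
# Full CUDA toolkit environment (compilers, headers, dev tools)
docker_base_image_url = "nvidia/cuda:12.0.1-devel-ubuntu22.04"
```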
Image size and complexity significantly impact cold-start performance, the time needed to initialize an app from an inactive state. While Cerebrium uses a content-addressable file system that pulls only the files an app actually needs, larger images still lengthen startup times.
Optimizing cold starts is therefore a trade-off between image size and the tooling an image provides:
```toml
# Minimal runtime image - faster cold-starts
docker_base_image_url = "debian:bookworm-slim"

# Full development image - slower cold-starts, more tools
docker_base_image_url = "nvidia/cuda:12.0.1-devel-ubuntu22.04"
```
Some apps might benefit from keeping instances warm to avoid cold starts entirely, though this increases resource usage while instances sit idle; a hedged config sketch follows. For a practical implementation, see the Stable Diffusion Example.
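Warm instances are typically configured through the scaling settings in cerebrium.toml. The section and key names below are assumptions based on common Cerebrium configs, so check the configuration reference for the exact fields:

```toml
# Sketch only - assumed section and key names; verify against
# the cerebrium.toml configuration reference.
[cerebrium.scaling]
min_replicas = 1   # keep one replica warm; avoids cold starts but bills for idle time
cooldown = 60      # seconds to wait after the last request before scaling down
```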