Cerebrium offers file management through a 50GB persistent volume that’s available to all apps in a project. This storage mounts at /persistent-storage and helps store model weights and files efficiently across deployments.

The /persistent-storage directory is not directly accessible during build time. You’ll need to use the Cerebrium CLI commands or the mounted volume at runtime to manage these files.

Including Files in Deployments

The cerebrium.toml configuration file controls which files become part of the app:

[cerebrium.deployment]
include = [
    "src/*.py",         # Python files in src.
    "config/*.json",    # JSON files in config.
    "requirements.txt"  # Specific files.
]

exclude = [
    "tests/*",          # Skip test files.
    "*.log"            # Skip log files.
]

Files included in deployments must be under 2GB each, with deployments working best for files under 1GB. Larger files should use persistent storage instead.

Managing Persistent Storage

The CLI provides four commands for working with persistent storage.

At runtime, the volume is mounted at /persistent-storage. When using these commands, the Cerebrium CLI does not display the /persistent-storage/ portion of the path.

  1. Upload files with cerebrium cp:

    # Upload to root directory
    cerebrium cp src_file_name.txt
    
    # Upload to specific location
    cerebrium cp src_file_name.txt dest_file_name.txt
    
    # Upload to directory
    cerebrium cp dir_name sub_folder/
  2. List files with cerebrium ls:

    # List root contents
    cerebrium ls
    
    # List specific folder
    cerebrium ls sub_folder/
  3. Remove files with cerebrium rm:

    # Remove a file
    cerebrium rm file_name.txt
    
    # Remove a directory
    cerebrium rm sub_folder/
  4. Download files with cerebrium download:

    # Download to current directory with same filename
    cerebrium download file_name.txt
    
    # Download with a different local filename
    cerebrium download file_name.txt local_file_name.txt
    
    # Download from a subdirectory
    cerebrium download sub_folder/file_name.txt

Using Stored Files

Here’s how to work with files in persistent storage:

import os
import torch

# Load a model from persistent storage.
file_path = "/persistent-storage/segment-anything/sam_vit_h_4b8939.pth"
model = torch.jit.load(file_path)

During runtime on Cerebrium, your application should access files using the full /persistent-storage/ path. For example:

# Read a file from persistent storage
with open("/persistent-storage/data/config.json", "r") as f:
    config = json.load(f)

# Write to persistent storage
with open("/persistent-storage/data/results.json", "w") as f:
    json.dump(results, f)

# Check if file exists in persistent storage
if os.path.exists("/persistent-storage/models/mymodel.pt"):
    model = torch.load("/persistent-storage/models/mymodel.pt")

Remember that while the CLI commands don’t display the /persistent-storage/ prefix in their output, your code must use the full path to access these files at runtime.

Should you require additional storage capacity, please reach out to us through support.