Cerebrium caches your models in a dedicated directory on the server. The cache persists across your project: if you deploy a new model with the same name, or use the same HuggingFace model, it is loaded from the cache rather than downloaded from the cloud. This lets us scale your deployments quickly to handle more requests, and it reduces the size of your deployment container images.

The cache is currently accessible at /persistent-storage in your container instance, should you wish to access it directly or store other artefacts. You have full access to this drive, but we recommend storing your own files outside /persistent-storage/cache, because Cerebrium uses that directory and its subdirectories to store your models.

Persistent storage is free while we are in active development; in the future, we will charge per GB used. We also plan to add a way to clear the cache and to pre-upload models to the cache before deployment.
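To keep your own artefacts clear of the Cerebrium-managed /persistent-storage/cache directory, a small path helper can refuse any location inside it. The sketch below is illustrative only: `artefact_path` and the constant names are hypothetical and not part of any Cerebrium API, and the `root` parameter exists only so the helper can be tried outside a deployment.

```python
from pathlib import Path

# Paths taken from the text above; the names below are hypothetical helpers.
PERSISTENT_ROOT = Path("/persistent-storage")
RESERVED_NAME = "cache"  # subdirectory used by Cerebrium for your models


def artefact_path(*parts: str, root: Path = PERSISTENT_ROOT) -> Path:
    """Build a path under `root` for your own files.

    Raises ValueError if the result would escape `root` or land inside
    the reserved cache directory.
    """
    root_resolved = root.resolve()
    reserved = root_resolved / RESERVED_NAME
    candidate = root_resolved.joinpath(*parts).resolve()

    # Reject paths that climb out of persistent storage (e.g. via "..").
    if candidate != root_resolved and root_resolved not in candidate.parents:
        raise ValueError(f"{candidate} is outside {root_resolved}")

    # Reject the reserved directory itself and anything beneath it.
    if candidate == reserved or reserved in candidate.parents:
        raise ValueError(f"{candidate} is inside the reserved cache directory")

    return candidate
```

For example, `artefact_path("segment-anything", "model.pt")` returns a path under /persistent-storage/segment-anything, while `artefact_path("cache", "model.pt")` raises `ValueError`.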

As a simple example, suppose you have an external SAM model that you want to use in your custom deployment. You can download it to the cache as follows:

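import os

import requests
import torch

cache_dir = "/persistent-storage/segment-anything"
file_path = os.path.join(cache_dir, "model.pt")

# Check if the file already exists; if not, download it
if not os.path.exists(file_path):
    os.makedirs(cache_dir, exist_ok=True)
    response = requests.get("https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth")
    response.raise_for_status()  # Fail early rather than caching an error page
    with open(file_path, "wb") as f:
        f.write(response.content)

# Load the checkpoint (this file is a plain state dict, not a TorchScript archive)
state_dict = torch.load(file_path, map_location="cpu")
... # Continue with your initialization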

Now, in subsequent deployments, the model will be loaded from the cache rather than downloaded again.
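If you cache several files this way, the check-then-download pattern can be wrapped in a reusable function. The following is a minimal sketch rather than a Cerebrium API: `ensure_cached` is a hypothetical name, it uses only the Python standard library, and it writes to a temporary file first so that an interrupted download never leaves a partial file at the final path.

```python
import os
import shutil
import tempfile
import urllib.request


def ensure_cached(url: str, dest: str) -> str:
    """Download `url` to `dest` unless `dest` already exists; return `dest`.

    Hypothetical helper for illustration only.
    """
    if os.path.exists(dest):
        return dest  # Cache hit: nothing to download

    dest_dir = os.path.dirname(dest) or "."
    os.makedirs(dest_dir, exist_ok=True)

    # Download into a temporary file in the same directory, then rename it
    # into place so that `dest` only ever appears fully written.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp, urllib.request.urlopen(url) as response:
            shutil.copyfileobj(response, tmp)  # Streams in chunks
        os.replace(tmp_path, dest)
    except BaseException:
        # Clean up the partial download before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return dest
```

With this helper, the SAM example reduces to calling `ensure_cached` with the download URL and "/persistent-storage/segment-anything/model.pt" before loading the file.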

Increasing your Persistent Storage Size

Once increased, your persistent storage size cannot be decreased.

By default, your account is given 50GB of persistent storage. If you find you need more (for example, you get a "disk quota exceeded" error), you can increase your allocation using the following steps:

  1. Check your current persistent storage allocation by running:
cerebrium storage --get-capacity

This will return your current persistent storage allocation in GB.

  2. To increase your persistent storage allocation, run:
cerebrium storage --increase-in-gb <number of GB to increase by>

This will return a confirmation message and your new persistent storage allocation in GB if successful.
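Because an increase cannot be reversed, it may be worth checking what is occupying the drive before running the steps above. The sketch below sums file sizes per top-level entry from inside a container; `usage_by_directory` is a hypothetical helper, not part of the Cerebrium CLI, and the totals are apparent file sizes, which may differ slightly from how the quota itself is measured.

```python
import os


def usage_by_directory(root: str) -> dict:
    """Return the total size in bytes of each top-level entry under `root`.

    Hypothetical helper for illustration; symlinks are not followed.
    """
    totals = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            size = 0
            # Walk the whole subtree and add up regular file sizes.
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if not os.path.islink(path):
                        size += os.path.getsize(path)
        else:
            size = entry.stat(follow_symlinks=False).st_size
        totals[entry.name] = size
    return totals


if __name__ == "__main__":
    root = "/persistent-storage"  # Path taken from the text above
    if os.path.isdir(root):
        report = sorted(usage_by_directory(root).items(), key=lambda kv: kv[1], reverse=True)
        for name, size in report:
            print(f"{size / 1e9:8.2f} GB  {name}")
```

Entries other than cache are files you stored yourself, so they are the ones you can safely remove to free space.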