The fastest way to get started developing a Cerebrium deployment is to set up a template project using the cerebrium init command below. This will create a folder with all the necessary files to get you started. You can then add your code and deploy it to Cerebrium.

cerebrium init first-project

Currently, our implementation has two components:

  • main.py - This is where your Python code lives. This file is mandatory.
  • cerebrium.toml - This is where you define all the configuration for your model, such as the hardware you use, scaling parameters, deployment config, and build parameters. Check here for a full list.

Below is a minimal main.py. You can use Pydantic to validate the request parameters your users send in, but this is completely optional. Your main.py can follow a layout similar to the one below, but it's really up to you!

def predict(prompt: dict, run_id):
    # Read "value" from the prompt object sent in the request, with a fallback.
    value = prompt.get("value", "default")
    return value

If you use Pydantic, the Item class is where you define the parameters your model receives as well as their types. Item needs to inherit from Pydantic's BaseModel, which validates the request schema. Make sure to include pydantic as a pip requirement.
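As a minimal sketch of what such a class could look like, the following assumes a hypothetical request with a required prompt and two optional settings. The field names, defaults, and the shape of the error response are illustrative assumptions, not something Cerebrium requires.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    # Hypothetical request schema; these fields are examples only.
    prompt: str                        # required
    temperature: float = 0.7           # optional, with a default
    max_tokens: Optional[int] = None   # optional, may be omitted


def predict(prompt, temperature=0.7, max_tokens=None, run_id=None):
    # Validate the incoming values before doing any real work.
    try:
        item = Item(prompt=prompt, temperature=temperature, max_tokens=max_tokens)
    except ValidationError as exc:
        # Return a readable error instead of raising inside the endpoint.
        return {"run_id": run_id, "error": str(exc)}

    # Placeholder for a real model call: echo the validated parameters.
    return {
        "run_id": run_id,
        "prompt": item.prompt,
        "temperature": item.temperature,
        "max_tokens": item.max_tokens,
    }
```

Validating at the top of the function keeps the rest of the code free of type checks, and a value such as temperature="hot" is reported as an error rather than failing later.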

You need to define a function and name its parameters after the fields you are sending in. The names in the function signature and in your JSON request must match exactly. Note that your URL endpoint will end with the name of your function, in this case /predict.
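To make the name-matching rule concrete, here is a small local sketch that checks a JSON body against a function signature before calling it. The call_with_json helper is hypothetical and only mimics the rule described above; it is not Cerebrium's actual request-handling code.

```python
import inspect
import json


def predict(prompt, temperature=0.7, run_id=None):
    # Stand-in for your real function in main.py.
    return {"prompt": prompt, "temperature": temperature}


def call_with_json(func, body: str):
    """Hypothetical helper: call func with the keys of a JSON object as keyword arguments."""
    payload = json.loads(body)
    params = inspect.signature(func).parameters

    # Keys that do not appear in the signature would not match any parameter.
    unknown = sorted(set(payload) - set(params))
    # Parameters without a default must be present in the request.
    missing = sorted(
        name
        for name, p in params.items()
        if p.default is inspect.Parameter.empty and name not in payload
    )
    if unknown or missing:
        raise ValueError(f"unknown keys: {unknown}, missing keys: {missing}")
    return func(**payload)
```

With this sketch, call_with_json(predict, '{"prompt": "hi"}') succeeds, while a misspelled key such as "promt" raises a ValueError naming both the unknown and the missing key.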

You can also include an optional parameter called run_id in your function signature. This is the unique identifier for your request and matches the one you will see in your dashboard. If you send a JSON payload that contains a parameter called run_id, that value overrides the generated identifier.
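The override behaviour can be pictured with the following sketch. The resolve_run_id and handle helpers and the use of uuid4 are assumptions made for illustration; how the platform actually generates identifiers is not specified here.

```python
import uuid


def resolve_run_id(payload: dict) -> str:
    """Hypothetical helper: prefer a run_id sent by the caller, else generate one."""
    # A run_id in the JSON payload takes precedence over the generated identifier.
    return payload.get("run_id") or str(uuid.uuid4())


def predict(prompt, run_id=None):
    # Including run_id in the output makes it easy to match a response to a run.
    return {"run_id": run_id, "echo": prompt}


def handle(payload: dict):
    # Hypothetical dispatcher: pass run_id separately from the other fields.
    run_id = resolve_run_id(payload)
    args = {k: v for k, v in payload.items() if k != "run_id"}
    return predict(**args, run_id=run_id)
```

Sending your own run_id can be useful when you want to correlate a run with an identifier from your own system.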

As long as your main.py contains the above, you can write any other Python code: import classes, add other functions, and so on.
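For instance, a main.py might combine module-level setup, a helper function, and a class alongside the endpoint function. Everything in this sketch other than predict and run_id (normalize, WordCounter, counter) is a hypothetical name chosen for illustration.

```python
import re

# Module-level setup runs once when main.py is imported.
WHITESPACE = re.compile(r"\s+")


def normalize(text: str) -> str:
    """Hypothetical helper: collapse runs of whitespace and trim the ends."""
    return WHITESPACE.sub(" ", text).strip()


class WordCounter:
    """Hypothetical stand-in for a model class you might import from another file."""

    def count(self, text: str) -> int:
        return len(text.split()) if text else 0


counter = WordCounter()


def predict(prompt, run_id=None):
    # The endpoint function can freely call your own helpers and classes.
    cleaned = normalize(prompt)
    return {"prompt": cleaned, "words": counter.count(cleaned)}
```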

Deploy model

Then navigate to where your model code (specifically your main.py) is located and run the following command:

cerebrium deploy

Voila! Your app should start building and you should see logs of the deployment process. It shouldn’t take longer than a minute - easy peasy!

View model statistics and logs

Once you deploy a model, navigate back to the Cerebrium dashboard and click on the name of the model you just deployed. You will see the usual overview statistics of your model, but most importantly, you will see two tabs titled builds and runs.

  • Builds: This is where you can see the logs regarding the creation of your environment and the code specified in the Init function. New build logs appear only when you deploy.
  • Runs: This is where you will see logs concerning every API call to your model endpoint. You can therefore debug every run based on input parameters and the model output.

Now that we have covered the basics of deploying a model, the links below outline some of the more advanced functionality that Cortex provides:

  • Custom Images: How to create your custom environments to run your ML Models.
  • Secrets: Use secrets to authenticate with third-party platforms.
  • Persistent Storage: Store model weights and files locally for faster access.
  • Long Running Tasks: Execute long running tasks in the background.
  • Streaming: Stream output live back to your endpoint.