The fastest way to get started developing a Cerebrium deployment is to set up a template project using the cerebrium init command below. This will create a folder with all the necessary files to get you started. You can then add your code and deploy it to Cerebrium.

cerebrium init first-project

Currently, our implementation has five components:

  • main.py - This is where your Python code lives. This is mandatory to include.
  • cerebrium.toml - This is where you define all the configurations around your model such as the hardware you use, scaling parameters, deployment config, build parameters, etc. Check here for a full list

Below is an implementation of us using Pydantic in order to validate the request parameters our users send in. Please note, using Pydantic is completely optional. Every main.py you deploy needs the following mandatory layout:

from pydantic import BaseModel

class Item(BaseModel):
    prompt: str ##Pydantic validation


def predict(item, run_id, logger):
    ##with Pydantic
    item = Item(**item)
    const value = item.prompt

    ##without Pydantic
    const value = item["prompt"]

    # Do something with parameters from item

    return {"result": value}

The Item class is where you define the parameters your model receives as well as their type. Item needs to inherit from BaseModel which uses Pydantic to validate request schemas. Make sure to include pydantic as a pip requirement. You need to define a function with the name predict which receives 3 params: item, run_id and logger.

  • item: This is the expected request object containing the parameters you defined above.
  • run_id: This is a unique identifier for the user request if you want to use it to track predictions through another system
  • logger: Cerebrium supports logging via the logger (we also support “print()” statements) however, using the logger will format your logs nicer. It contains the 3 states across most loggers:
  • logger.info
  • logger.debug
  • logger.error

As long as your main.py contains the above you can write any other Python code. Import classes, add other functions etc.

Deploy model

Then navigate to where your model code (specifically your main.py) is located and run the following command:

cerebrium deploy my-first-model

Voila! Your app should start building and you should see logs of the deployment process. It shouldn’t take longer than a minute - easy peasy!

View model statistics and logs

Once you deploy a model, navigate back to the Cerebrium dashboard and click on the name of the model you just deployed. You will see the usual overview statistics of your model, but most importantly, you will see two tabs titled builds and runs.

  • Builds: This is where you can see the logs regarding the creation of your environment and the code specified in the Init function. You will see logs only on every deployment.
  • Runs: This is where you will see logs concerning every API call to your model endpoint. You can therefore debug every run based on input parameters and the model output.

Now that we have covered the basics of deploying a model, let’s dive into some of the more advanced functionality that Cortex provides.

Below are some links outlining some of the more advanced functionality that Cortex provides:

  • Custom Images: How to create your custom environments to run your ML Models.
  • Secrets: Use secrets to authenticate with third-party platforms.
  • Persistent Storage: Store model weights and files locally for faster access.
  • Long Running Tasks: Execute long running tasks in the background.
  • Streaming: Stream output live back to your endpoint