Quickstart
Get up and running with your first deployed model on Cerebrium
The fastest way to start developing a Cerebrium deployment is to set up a template project using the cerebrium init command below. This creates a folder with all the files you need to get started. You can then add your code and deploy it to Cerebrium.
cerebrium init first-project
Currently, the generated project has two components:
- main.py - This is where your Python code lives. This file is mandatory.
- cerebrium.toml - This is where you define the configuration for your model, such as the hardware you use, scaling parameters, deployment config, and build parameters. Check here for a full list.
Your main.py can follow a layout similar to the one below, but it's really up to you!
def run(param: str, run_id):
    return {"message": f"Running {param} remotely on Cerebrium!"}
You need to define a function and name the parameters you are sending in. The names in the function signature and in your JSON request must match exactly. Note that your URL endpoint will end with the name of your function, in this case /run.
You can also include the optional parameter run_id in your function signature. This is the unique identifier for your request and matches the one you see in your dashboard. If your JSON payload contains a parameter called run_id, that value overrides the generated identifier.
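The mapping between JSON keys and function parameters described above can be tried out locally. The sketch below is only an illustration: call_locally is a hypothetical helper written for this example, not part of Cerebrium, and the use of a UUID for the generated run_id is an assumption. The run function here also returns run_id so the override is visible.

```python
import inspect
import json
import uuid


def run(param: str, run_id):
    # Same shape as the quickstart function, but echoing run_id for illustration.
    return {"message": f"Running {param} remotely on Cerebrium!", "run_id": run_id}


def call_locally(func, json_body: str):
    """Hypothetical local stand-in for how a JSON request reaches your function."""
    payload = json.loads(json_body)
    accepted = inspect.signature(func).parameters

    # Keys in the JSON body must match the parameter names exactly.
    unknown = set(payload) - set(accepted)
    if unknown:
        raise TypeError(f"unexpected parameter(s): {sorted(unknown)}")

    # Assumption: an identifier is generated unless the payload supplies run_id.
    if "run_id" in accepted and "run_id" not in payload:
        payload["run_id"] = str(uuid.uuid4())

    return func(**payload)
```

Calling call_locally(run, '{"param": "my model"}') fills in a generated run_id, while a body that also contains "run_id": "abc" passes "abc" through unchanged. A body using a different key, such as "prompt", raises a TypeError because no parameter has that name.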
As long as your main.py contains the above, you can write any other Python code: import classes, add other functions, and so on.
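As a rough sketch of what that can look like, the main.py below adds a helper class and a helper function around the same run entry point. Greeter, greeter, and normalize are hypothetical names chosen for this example only.

```python
from dataclasses import dataclass


@dataclass
class Greeter:
    """Hypothetical helper class; any ordinary Python class can be used."""

    greeting: str = "Running"

    def format(self, param: str) -> str:
        return f"{self.greeting} {param} remotely on Cerebrium!"


# Module-level objects are created once, when Python imports this file.
greeter = Greeter()


def normalize(param: str) -> str:
    """Hypothetical helper function that trims surrounding whitespace."""
    return param.strip()


def run(param: str, run_id):
    # The entry point keeps the same signature as in the quickstart.
    return {"message": greeter.format(normalize(param))}
```

The entry point stays small, and the supporting code can grow in the same file or move into separate modules that main.py imports.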
Deploy model
Then navigate to where your model code (specifically your main.py) is located and run the following command:
cerebrium deploy
Voila! Your app should start building and you should see logs of the deployment process. It shouldn’t take longer than a minute - easy peasy!
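Once the build finishes, the function is reachable over HTTP at a URL ending in /run. The sketch below only constructs such a request with Python's standard library. The endpoint URL, the API key, and the Bearer authorization header are all assumptions made for illustration, so copy the real values from your dashboard.

```python
import json
import urllib.request


def build_run_request(endpoint_url: str, api_key: str, param: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the deployed run function."""
    # The JSON key must match the parameter name in the function signature.
    body = json.dumps({"param": param}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,  # placeholder: use the URL shown for your deployment
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; check your dashboard for the real scheme.
            "Authorization": f"Bearer {api_key}",
        },
    )
```

To send it, pass the returned object to urllib.request.urlopen and decode the JSON response body.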
View model statistics and logs
Once you deploy a model, navigate back to the Cerebrium dashboard and click on the name of the model you just deployed. You will see the usual overview statistics of your model, but most importantly, you will see two tabs titled builds and runs.
- Builds: This is where you can see the logs from the creation of your environment and from the code specified in the Init function. These logs are produced once per deployment.
- Runs: This is where you will see logs concerning every API call to your model endpoint. You can therefore debug every run based on input parameters and the model output.
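To make individual runs easier to debug, it can help to log the inputs and the result together with run_id. This is a minimal sketch, assuming that text printed by your function is what appears in the Runs logs; that behaviour is not confirmed here.

```python
def run(param: str, run_id):
    # Assumption: printed output is captured in the Runs logs for this call.
    print(f"[{run_id}] received param={param!r}")
    result = {"message": f"Running {param} remotely on Cerebrium!"}
    print(f"[{run_id}] returning {result}")
    return result
```

Prefixing each line with run_id makes it possible to match log lines to the request shown in the dashboard.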
Now that we have covered the basics of deploying a model, the links below outline some of the more advanced functionality that Cortex provides:
- Custom Images: How to create your custom environments to run your ML Models.
- Secrets: Use secrets to authenticate with third-party platforms.
- Persistent Storage: Store model weights and files locally for faster access.
- Streaming: Stream output live back to your endpoint.