Test and Deploy

Start by importing Cerebrium, our Python framework, which abstracts away the complexity of provisioning infrastructure, versioning and much more!

from cerebrium import deploy, model_type

# Some model training logic...

endpoint = deploy(('<MODEL_TYPE>', '<MODEL_FILE>'), '<MODEL_NAME>', '<API_KEY>', dry_run=False)

The deploy function takes the following parameters:

  • A tuple of the model type and the model file:
    • MODEL_TYPE: This parameter specifies the type of model you are supplying to Cerebrium and must be a model_type. This ensures that Cerebrium knows how to handle your model. The currently supported model types are:
      • model_type.SKLEARN: Expects a .pkl file (the model does not need to be a regressor or classifier).
      • model_type.SKLEARN_CLASSIFIER: Expects a .pkl file. The model must be a classifier. Returns a class probability distribution instead of a single class prediction.
      • model_type.SKLEARN_PREPROCESSOR: Expects a .pkl file. This is a special model type, such as a scaler or a one-hot encoder, that preprocesses data with the .transform method before it is sent to the model.
      • model_type.TORCH: Expects a .pkl file serialized with cloudpickle or a JIT-scripted TorchScript .pt file.
      • model_type.XGBOOST_REGRESSOR: Expects a serialized .pkl file or an XGBoost .json file.
      • model_type.XGBOOST_CLASSIFIER: Expects a serialized .pkl file or an XGBoost .json file.
    • MODEL_FILE: This is the string path to the model object you have obtained as a result of training locally or in the cloud. This is usually some model object or model file that you have exported.
  • MODEL_NAME: The name you would like to give your model (alphanumeric, with hyphens, and fewer than 20 characters). This is a unique identifier for your model and will be used to call your model in the future.
  • API_KEY: The API key, which can be found on your profile.
  • dry_run: Boolean indicating whether to run the model locally instead of deploying it. Optional; defaults to False.

Every unique model name will create a separate deployment with a separate endpoint. It is important to keep track of the model names you have used so that you can call the correct model in the future. If you deploy a model with the same name as a previous model, the previous model will be archived and the new model will be deployed automatically. This is useful for versioning your models.
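Since a model name doubles as the deployment identifier, it can help to check it locally before calling deploy. The sketch below is a hypothetical helper, not part of the Cerebrium package, and it assumes the naming rule means "letters, digits and hyphens only, 1 to 19 characters"; the service itself may apply a slightly different rule.

```python
import re

# Assumed reading of the naming rule: letters, digits and hyphens, 1-19 characters.
_MODEL_NAME_RE = re.compile(r"[A-Za-z0-9-]{1,19}")


def is_valid_model_name(name: str) -> bool:
    """Hypothetical local check; not a function provided by Cerebrium."""
    return _MODEL_NAME_RE.fullmatch(name) is not None


for candidate in ["churn-model-v2", "my_model", "a-very-long-model-name-123"]:
    print(candidate, is_valid_model_name(candidate))
```

Only the first candidate passes: the second contains an underscore and the third is longer than 19 characters.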

Once you’ve run the deploy function, give it a minute, and it should be deployed - easy-peasy! If your deployment is successful, you will see the following output:

✅ Authenticated with Cerebrium!
⬆️  Uploading conduit artefacts...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 179k/179k [00:04<00:00, 42.7kB/s]
✅ Conduit artefacts uploaded successfully.
✅ Conduit deployed!
🌍 Endpoint: https://run.cerebrium.ai/YOUR-PROJECT-ID/YOUR-MODEL-NAME/predict

Our deploy function will also return the endpoint of your model directly. This is the URL that you will use to call your model in the future.
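If you store returned endpoints and later need to know which model a URL belongs to, the project ID and model name can be read back out of it. This is a minimal sketch that assumes every endpoint has the same shape as the sample output above (project, model name, then "predict"); split_endpoint is a hypothetical helper, not part of the Cerebrium package.

```python
from urllib.parse import urlparse


def split_endpoint(endpoint: str) -> tuple[str, str]:
    """Return (project_id, model_name) from a .../<project>/<model>/predict URL."""
    parts = urlparse(endpoint).path.strip("/").split("/")
    # Assumption: the path always has exactly three segments ending in "predict".
    if len(parts) != 3 or parts[2] != "predict":
        raise ValueError(f"Unexpected endpoint shape: {endpoint}")
    return parts[0], parts[1]


project_id, model_name = split_endpoint(
    "https://run.cerebrium.ai/YOUR-PROJECT-ID/YOUR-MODEL-NAME/predict"
)
print(project_id, model_name)
```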

Run model locally

To run your model locally and ensure it is working as intended before deploying, set dry_run=True in the deploy function. This will return a Conduit object, which encompasses the logic and computational graph you can use to call your model flow locally.

from cerebrium import deploy, model_type

conduit = deploy(('<MODEL_TYPE>', '<MODEL_FILE>'), '<MODEL_NAME>', '<API_KEY>', dry_run=True)
conduit.run(data)

Here, data is the input you would send to your model, usually a 2D/3D array for vision models or a single list for XGBoost models. You may pass an ndarray or Tensor directly.

You can also define a Conduit object directly by using the Conduit class. Then call the run method on the Conduit object to test the model locally, or the deploy method to deploy the Conduit’s model flow to Cerebrium.

from cerebrium import Conduit, model_type
conduit = Conduit('<MODEL_NAME>', '<API_KEY>', [('<MODEL_TYPE>', '<MODEL_FILE>')])

Defining a Conduit object directly allows you to add more models to your flow dynamically using the add_model method, as well as additional functionality such as adding an external monitoring logger using add_logger.


API Specification and Helper methods

Below is an example of the request and response objects for calls made to your models. Calling the endpoint should feel similar to calling your model locally in your own Python environment.

Request Parameters

  curl --location --request POST '<ENDPOINT>' \
      --header 'Authorization: <API_KEY>' \
      --header 'Content-Type: application/json' \
      --data-raw '[<DATA_INPUT>]'

  • Authorization: The Cerebrium API key used to authenticate your request. You can get it from your Cerebrium dashboard.
  • Content-Type: The content type of your request. Must be application/json.
  • Request body: A list of data points you would like to send to your model, e.g. for 1 data point of 3 features: [[1,2,3]].
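The curl call above can also be expressed with Python's standard library. The following is a minimal sketch that only builds the request, so it runs without a deployed model; build_predict_request is a hypothetical helper, and the endpoint and key are the same placeholders used in the curl example.

```python
import json
import urllib.request


def build_predict_request(endpoint: str, api_key: str, data: list) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = json.dumps(data).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_predict_request(
    "https://run.cerebrium.ai/YOUR-PROJECT-ID/YOUR-MODEL-NAME/predict",
    "<API_KEY>",
    [[1, 2, 3]],  # one data point with three features
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# Sending it would be urllib.request.urlopen(req), which needs a live endpoint and a valid key.
```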

Response Parameters

  {
    "result": [<MODEL_PREDICTION>],
    "run_id": "<RUN_ID>",
    "prediction_ids": ["<PREDICTION_ID>"]
  }

  • result: The result of your model prediction.
  • run_id: The run ID associated with your model predictions.
  • prediction_ids: The prediction IDs associated with each of your model predictions. Used to track your model predictions with monitoring tools.
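For monitoring, it is often convenient to keep each prediction next to its ID. The sketch below parses a response body and pairs the two lists; it assumes there is one prediction ID per entry in result, and the sample values ("run-123", "pred-abc", 0.42) are made up for illustration. pair_predictions is a hypothetical helper.

```python
import json

# Made-up response body that follows the response fields described above.
sample_response = '{"result": [0.42], "run_id": "run-123", "prediction_ids": ["pred-abc"]}'


def pair_predictions(response_text: str) -> list:
    """Return a list of (prediction_id, prediction) tuples from a response body."""
    payload = json.loads(response_text)
    results, ids = payload["result"], payload["prediction_ids"]
    # Assumption: the two lists are aligned one-to-one.
    if len(results) != len(ids):
        raise ValueError("result and prediction_ids differ in length")
    return list(zip(ids, results))


print(pair_predictions(sample_response))
```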

You can test out your model endpoint quickly with our utility function supplied in Cerebrium, model_api_request.

from cerebrium import model_api_request
model_api_request(endpoint, data, '<API_KEY>')

The function takes in the following parameters:

  • endpoint: The endpoint of your model that was returned by the deploy function.
  • data: The data you would like to send to your model. You may feed an ndarray or Tensor directly into this function.
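If you call the HTTP endpoint yourself rather than through model_api_request, array-like inputs need to be turned into plain lists before they can be serialized as JSON. A minimal sketch, relying on the fact that both NumPy arrays and PyTorch tensors expose a tolist() method; to_json_body and FakeArray are hypothetical names used only for this illustration.

```python
import json


def to_json_body(data) -> str:
    """Serialize list-like input to JSON; objects with .tolist() are converted first."""
    if hasattr(data, "tolist"):  # true for numpy.ndarray and torch.Tensor
        data = data.tolist()
    return json.dumps(data)


class FakeArray:
    """Stand-in for an ndarray so the sketch runs without NumPy installed."""

    def __init__(self, rows):
        self._rows = rows

    def tolist(self):
        return self._rows


print(to_json_body(FakeArray([[1, 2, 3]])))
print(to_json_body([[4, 5, 6]]))
```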
  • api_key: This is the Cerebrium API key used to authenticate your request.

The result format changes for the classifier model types. Both model_type.SKLEARN_CLASSIFIER and model_type.XGBOOST_CLASSIFIER return a result object containing the probability distribution over the output classes, rather than the argmax of the distribution. This gives you flexibility in how you handle the output of your model for classification. For example, you may want to return the top 3 predictions, or the top 3 predictions above a minimum probability threshold.
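The top-3 and threshold handling described here can be sketched in a few lines. This example assumes the distribution for one data point arrives as a flat list of probabilities indexed by class; top_k is a hypothetical helper and the numbers are made up.

```python
def top_k(probabilities, k=3, min_probability=0.0):
    """Return up to k (class_index, probability) pairs, highest probability first."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return [(index, p) for index, p in ranked[:k] if p >= min_probability]


distribution = [0.05, 0.60, 0.25, 0.10]  # made-up probabilities for four classes
print(top_k(distribution))                       # [(1, 0.6), (2, 0.25), (3, 0.1)]
print(top_k(distribution, min_probability=0.2))  # [(1, 0.6), (2, 0.25)]
```

Taking the argmax is then just top_k(distribution, k=1).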

If you want to see concrete examples of various model deployments head on over to the Examples page.