Test and Deploy
Deploying your machine learning model to Cerebrium takes just two lines of code. Start by importing our Python framework, Cerebrium, which abstracts away the complexity of provisioning infrastructure, versioning, and much more!
```python
from cerebrium import deploy, model_type

# ... some model training logic ...

endpoint = deploy(('<MODEL_TYPE>', '<MODEL_FILE>'), '<MODEL_NAME>', '<API_KEY>', dry_run=False)
```
The `deploy` function takes the following parameters:
- A tuple of the model type and the model file:
  - MODEL_TYPE: This parameter specifies the type of model you are supplying to Cerebrium and must be a `model_type`. This ensures that Cerebrium knows how to handle your model. The currently supported model types are:
    - `model_type.SKLEARN`: Expects a `.pkl` file (the model is not required to be a regressor or classifier).
    - `model_type.SKLEARN_CLASSIFIER`: Expects a `.pkl` file (the model must be a classifier; returns a class probability distribution instead of a single class prediction).
    - `model_type.SKLEARN_PREPROCESSOR`: Expects a `.pkl` file. This is a special model type used to preprocess data with the `.transform` method before it is sent to the model, such as a scaler or a one-hot encoder.
    - `model_type.TORCH`: Expects a `.pkl` file serialized with `cloudpickle` or a JIT-scripted TorchScript `.pt` file.
    - `model_type.XGBOOST_REGRESSOR`: Expects a serialized `.pkl` file or an XGB `.json` file.
    - `model_type.XGBOOST_CLASSIFIER`: Expects a serialized `.pkl` file or an XGB `.json` file.
  - MODEL_FILE: This is the string path to the model object you have obtained as a result of training locally or in the cloud. This is usually some model object or model file that you have exported.
- MODEL_NAME: The name you would like to give your model (alphanumeric, with hyphens and less than 20 characters). This is a unique identifier for your model and will be used to call your model in the future.
- API_KEY: This is the API key used to authenticate your request, which can be found on your profile in the Cerebrium dashboard.
- dry_run: An optional boolean indicating whether to run the model locally instead of deploying it. Defaults to False.
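Putting these parameters together, a concrete deployment call might look like the following sketch (the model type, file name, and model name here are illustrative):

```python
from cerebrium import deploy, model_type

# Illustrative example: an XGBoost regressor exported as model.json
endpoint = deploy(
    (model_type.XGBOOST_REGRESSOR, 'model.json'),
    'my-xgb-model',  # alphanumeric with hyphens, under 20 characters
    '<API_KEY>',
)
```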
Every unique model name will create a separate deployment with a separate endpoint. It is important to keep track of the model names you have used so that you can call the correct model in the future. If you deploy a model with the same name as a previous model, the previous model will be archived and the new model will be deployed automatically. This is useful for versioning your models.
Once you’ve run the `deploy` function, give it a minute, and it should be deployed - easy-peasy! If your deployment is successful, you will see the following output:
```
✅ Authenticated with Cerebrium!
⬆️ Uploading conduit artefacts...
100%|██████████████████████████████| 179k/179k [00:04<00:00, 42.7kB/s]
✅ Conduit artefacts uploaded successfully.
✅ Conduit deployed!
🌍 Endpoint: https://run.cerebrium.ai/YOUR-PROJECT-ID/YOUR-MODEL-NAME/predict
```
Our `deploy` function will also return the endpoint of your model directly. This is the URL that you will use to call your model in the future.
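Since the endpoint is a plain HTTPS URL, you can call it with any HTTP client. As a minimal sketch (the request format follows the API specification below, and the feature values are illustrative):

```python
import requests

# endpoint is the URL returned by deploy()
response = requests.post(
    endpoint,
    headers={"Authorization": "<API_KEY>", "Content-Type": "application/json"},
    json=[[1, 2, 3]],  # one data point with three features
)
print(response.json()["result"])
```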
Run model locally
In order to run your model locally and ensure it is working as intended before deploying, you can use the `dry_run` parameter in the `deploy` function. This will return a Conduit object, which encompasses the logic and computational graph you can use to call your model flow locally.
```python
from cerebrium import deploy, model_type

# With dry_run=True, deploy returns a Conduit object rather than deploying
conduit = deploy(('<MODEL_TYPE>', '<MODEL_FILE>'), '<MODEL_NAME>', '<API_KEY>', dry_run=True)
conduit.run(data)
```
Where `data` is the data you would send to your model. This would usually be some 2D/3D array for vision models or single lists for XGBoost models. You may feed an `ndarray` or `Tensor` directly into this function.
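For example, a minimal local test for a tabular scikit-learn model might look like this (the file name, model name, and feature values are illustrative):

```python
import numpy as np
from cerebrium import deploy, model_type

# Illustrative example: a scikit-learn model saved locally as model.pkl
conduit = deploy((model_type.SKLEARN, 'model.pkl'), 'my-model', '<API_KEY>', dry_run=True)

# Two data points with three features each
data = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(conduit.run(data))
```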
You can also define a Conduit object directly by using the `Conduit` class. Then call the `run` method on the Conduit object to test the model locally, or the `deploy` method to deploy the Conduit’s model flow to Cerebrium.
```python
from cerebrium import Conduit, model_type

conduit = Conduit('<MODEL_NAME>', '<API_KEY>', [('<MODEL_TYPE>', '<MODEL_FILE>')])
conduit.run(data)   # test the model flow locally
conduit.deploy()    # deploy the model flow to Cerebrium
```
Defining a Conduit object directly allows you to add more models to your flow dynamically using the `add_model` method, as well as additional functionality such as adding an external monitoring logger using `add_logger`.
```python
conduit.add_model('<MODEL_TYPE>', '<MODEL_FILE>', '<POSTPROCESSING_FUNCTION>')
```
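For instance, a flow that scales input features before a classifier could be assembled as in the sketch below (the model files and the postprocessing function are illustrative, and the positional postprocessing argument follows the call above):

```python
from cerebrium import Conduit, model_type

# Illustrative postprocessing step applied to the classifier's output
def top_class(probabilities):
    return probabilities.argmax(axis=-1)

# Start the flow with a preprocessor, then chain a classifier after it
conduit = Conduit('my-flow', '<API_KEY>', [(model_type.SKLEARN_PREPROCESSOR, 'scaler.pkl')])
conduit.add_model(model_type.SKLEARN_CLASSIFIER, 'classifier.pkl', top_class)
conduit.deploy()
```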
API Specification and Helper methods
Below is an example of the request and response objects for calls made to your models. Calling your deployed model should resemble calling it locally in your own Python environment.
Request Parameters
```bash
curl --location --request POST '<ENDPOINT>' \
--header 'Authorization: <API_KEY>' \
--header 'Content-Type: application/json' \
--data-raw '[<DATA_INPUT>]'
```
- Authorization: This is the Cerebrium API key used to authenticate your request. You can get it from your Cerebrium dashboard.
- Content-Type: The content type of your request. Must be application/json.
- data: A list of data points you would like to send to your model, e.g. for 1 data point of 3 features: [[1,2,3]].
Response Parameters

```json
{
  "result": [<MODEL_PREDICTION>],
  "run_id": "<RUN_ID>",
  "prediction_ids": ["<PREDICTION_ID>"]
}
```
- result: The result of your model prediction.
- run_id: The run ID associated with your model predictions.
- prediction_ids: The prediction IDs associated with each of your model predictions. Used to track your model predictions with monitoring tools.
You can test out your model endpoint quickly with the utility function supplied in Cerebrium, `model_api_request`.
```python
from cerebrium import model_api_request

model_api_request(endpoint, data, '<API_KEY>')
```
The function takes the following parameters:

- endpoint: The endpoint of your model that was returned by the `deploy` function.
- data: The data you would like to send to your model. You may feed an `ndarray` or `Tensor` directly into this function.
- api_key: This is the Cerebrium API key used to authenticate your request.
Your result format will change for the `_classifier` model types. Both `sklearn_classifier` and `xgb_classifier` will return a `result` object containing the probability distribution for the predicted output classes, rather than the argmax of the distribution. This gives you flexibility in how you handle the output of your model for classification. For example, you may want to return the top 3 predictions for your model, or only those predictions above a minimum probability threshold. This is up to you.
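As a sketch, assuming the `result` field holds one probability list per input row, selecting the top 3 classes client-side might look like this:

```python
import numpy as np

# Illustrative response from a *_classifier endpoint: one probability
# distribution per input row
response = {"result": [[0.05, 0.10, 0.60, 0.25]]}

probabilities = np.array(response["result"])
top3 = np.argsort(probabilities, axis=1)[:, ::-1][:, :3]
print(top3)  # e.g. [[2 3 1]] -- class indices sorted by descending probability
```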
If you want to see concrete examples of various model deployments, head on over to the Examples page.