Scikit-learn
Intro
By the end of this guide, you’ll have an API endpoint that scales automatically with your traffic by running inference on serverless CPUs/GPUs.
Project setup
Before building, you need to set up a Cerebrium account. This is as simple as creating a new project in Cerebrium and copying the API key, which will be used to authenticate all calls for this project.
Create a project
- Go to dashboard.cerebrium.ai
- Sign up or log in
- Navigate to the API Keys page
- You will need your private API key for deployments. Click the copy button to copy it to your clipboard
Develop model
To start, install the Cerebrium framework by running the following command in your notebook or terminal:
pip install --upgrade cerebrium
Now navigate to where your model code is stored. This could be in a notebook or a .py file.
Copy and paste our code below. This creates a simple random forest classifier on the Iris dataset. This code could be replaced by any scikit-learn model. Make sure you have the required libraries installed.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
import pickle

# Train a random forest on the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target
rf = RandomForestClassifier()
rf.fit(X, y)

# Save the trained model to a pickle file
filename = 'iris.pkl'
with open(filename, 'wb') as f:
    pickle.dump(rf, f)
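Before deploying, you may want to reload the pickle and confirm it behaves as expected. The sketch below repeats the training step so it is self-contained, then loads the saved file back and checks the predicted probability distribution:

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train and pickle the model, as in the guide above
iris = load_iris()
rf = RandomForestClassifier()
rf.fit(iris.data, iris.target)
with open("iris.pkl", "wb") as f:
    pickle.dump(rf, f)

# Reload the pickle and run a quick local prediction
with open("iris.pkl", "rb") as f:
    model = pickle.load(f)

sample = [[5.1, 3.5, 1.4, 0.2]]  # one Iris-setosa measurement
probs = model.predict_proba(sample)
print(probs.shape)  # one row of probabilities over the 3 Iris classes
```

If this works locally, the same pickle file is ready to hand to Cerebrium.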
In the last line of code, you will see we pickle the model. This is all you need to deploy your model to Cerebrium! You can then import the deploy() function from the Cerebrium framework.
from cerebrium import deploy, model_type

name_for_your_deployment = "sk-test-model"
endpoint = deploy((model_type.SKLEARN_CLASSIFIER, "iris.pkl"), name_for_your_deployment, "<API_KEY>")
Your result format will change for the _classifier model types. The sklearn_classifier will return a result object containing the probability distribution over the predicted output classes, rather than the argmax of the distribution. This gives you flexibility in how you handle the output of your model for classification. For example, you may want to return the top 3 predictions, or only those predictions above a minimum probability threshold. This is up to you.
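As a sketch of what that post-processing might look like, the helper below takes a probability distribution and returns the top-k classes above a threshold. The function name, threshold, and class labels are illustrative, not part of the Cerebrium API:

```python
import numpy as np

def top_k_predictions(probs, class_names, k=3, min_prob=0.05):
    """Return up to k (class, probability) pairs above a threshold."""
    order = np.argsort(probs)[::-1][:k]           # indices of the k largest probabilities
    return [(class_names[i], float(probs[i]))     # drop classes below the threshold
            for i in order if probs[i] >= min_prob]

# Illustrative distribution over the three Iris classes
probs = np.array([0.91, 0.07, 0.02])
classes = ["setosa", "versicolor", "virginica"]
print(top_k_predictions(probs, classes))  # [('setosa', 0.91), ('versicolor', 0.07)]
```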
Your model is now deployed and ready for inference all in under 10 seconds! Navigate to the dashboard and on the Models page, you will see your model.
You can run inference using curl
curl --location --request POST '<ENDPOINT>' \
--header 'Authorization: <API_KEY>' \
--header 'Content-Type: application/json' \
--data-raw '[[5.1, 3.5, 1.4, 0.2]]'
and your response will contain the probability distribution over the three Iris classes, as described above.
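You can make the same call from Python. The sketch below only assembles the request; `<ENDPOINT>` and `<API_KEY>` are placeholders you must replace with your own values before sending it (the actual call is left commented out for that reason):

```python
import json

def build_request(endpoint, api_key, features):
    """Assemble the same inference call as the curl example."""
    return {
        "url": endpoint,
        "headers": {"Authorization": api_key, "Content-Type": "application/json"},
        "data": json.dumps(features),
    }

req = build_request("<ENDPOINT>", "<API_KEY>", [[5.1, 3.5, 1.4, 0.2]])
print(req["data"])  # [[5.1, 3.5, 1.4, 0.2]]

# With real values filled in, send it with the requests library:
# import requests
# response = requests.post(req["url"], headers=req["headers"], data=req["data"])
# print(response.json())
```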
Navigate back to the dashboard and click on the name of the model you just deployed. You will see an API call was made and the inference time. From your dashboard, you can monitor your model, roll back to previous versions and see traffic.
With one line of code, your model was deployed in seconds with automatic versioning, monitoring and the ability to scale based on traffic spikes. Try deploying your own model now or check out our other frameworks.
Potential Pitfalls
During your deployment, you may encounter an error along the lines of:
ValueError: Couldn't import 'worker': node array from the pickle has an incompatible dtype:
- expected: {'names': ['left_child', 'right_child', 'feature', ...
- got : [('left_child', '<i8'), ('right_child', '<i8'), ('feature', '<i8'), ...
This is due to a version mismatch between the version of scikit-learn used for training and the version used for inference. To fix this, you can either match the version of scikit-learn on Cerebrium, or define the version of scikit-learn in a requirements file which you pass into the Conduit.
To do this, create a file called requirements.txt and add the following line:
scikit-learn==<version>
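To pin the exact version you trained with, you can read it from the installed package and write the requirements file programmatically. This is a convenience sketch, not something Cerebrium requires:

```python
import sklearn

# The version string that must match between training and inference
version = sklearn.__version__
print(f"scikit-learn=={version}")

# Write it straight into the requirements file used at deploy time
with open("requirements.txt", "w") as f:
    f.write(f"scikit-learn=={version}\n")
```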
Then, when you deploy your model, modify your conduit deploy line to include the requirements file:
from cerebrium import Conduit, model_type, hardware
c = Conduit((model_type.SKLEARN_CLASSIFIER, "iris.pkl"), '<YOUR_MODEL_NAME>', '<API_KEY>', requirements_file='requirements.txt')
c.deploy()