Deploying Model Ensembles
With Cerebrium you are also able to deploy a sequence of models simply by specifying a list of tuples of model types and model files. Currently, this functionality is supported across all model types, with the caveat that we assume the models are evaluated in the order they are supplied. If you require that data be formatted in between models, please use pre/post processing functions.
For example, if you have a PyTorch model that takes in a 2D image and outputs a 1D vector, and
you have a XGB model that takes in a 1D vector and outputs a single class
prediction, you can deploy them as a sequence of models by supplying the
following to the deploy
function:
model_flow = [(model_type.TORCH, 'torch.pt'), (model_type.XGBOOST_CLASSIFIER, 'xgb.json')]
endpoint = deploy(model_flow, 'my-flow', "<API_KEY>")
Unfortunately, we don’t allow you to integrate endpoints or our prebuilt models into these flows - the benefit of our ensemble implementation is that everything is run on the same machine so execution is extremely quick, and you don’t have multiple network requests. If you would like more control over implementing model ensembles or to make network requests, we recommend you look at deploying your own custom python code