Date: 2 March 2023

  • Added the ability to support any Python runtime without causing dependency issues
  • Released the Pygmalion 2.7b and Img2Text models
  • You can now deploy any model that uses a Hugging Face pipeline
  • Model input is now passed to your post-processing function
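The last change means a post-processing function sees the original request data, not just the prediction. A minimal sketch of the idea, assuming a two-argument signature (the argument names and order are illustrative, not the documented Cerebrium signature):

```python
# Hypothetical sketch: a post-processing function that receives both the
# model's prediction and the original model input. The signature here is
# an assumption for illustration.
def postprocess(prediction, model_input):
    # Pair each input item with its rounded prediction score
    return [
        {"input": item, "score": round(score, 2)}
        for item, score in zip(model_input, prediction)
    ]

processed = postprocess([0.987, 0.123], ["cat.png", "dog.png"])
```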


Date: 20 February 2023

  • Added spaCy support! You can now use spaCy models as pipeline components.
  • Added transformers support! You can now use Hugging Face transformers models as pipeline components. At this time, only specific models are supported; consult the docs for more information.


Date: 16 February 2023

  • Fixed a critical bug causing ONNX models to not run on GPU. You should upgrade to this version if you are using ONNX models that require GPU computation.
  • onnxruntime is now an optional dependency. While you can still deploy ONNX models without the runtime, you will need to install the extra with pip install --upgrade cerebrium[onnxruntime] to test your Conduit locally.
  • Added support for onnxruntime-gpu as an optional dependency. You can install the extra with pip install --upgrade cerebrium[onnxruntime-gpu].


Date: 13 February 2023

  • Added Pre-Processing Functions! You can now specify a function to be run before the model is called.
  • Added support for the pandas library in processing functions (this is a beta feature for now, use with care!).
  • Added an Arize integration. You can now monitor your models using Arize.
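A pre-processing function runs on the raw request payload before the model is called. A minimal sketch of the pattern in plain Python (how the function is registered with the Conduit is omitted, and this example is illustrative rather than the actual API):

```python
# Hypothetical sketch: a pre-processing function applied to raw input
# before it reaches the model. Here it min-max scales values to [0, 1].
def preprocess(raw_values):
    lo, hi = min(raw_values), max(raw_values)
    if hi == lo:  # avoid division by zero on constant input
        return [0.0 for _ in raw_values]
    return [(v - lo) / (hi - lo) for v in raw_values]

scaled = preprocess([10, 15, 20])
```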


Date: 9 February 2023

  • Updated the CUDA runtime from 11.3 to 11.6
  • Our minimum Python version is now 3.8. We now support Python 3.8, 3.9 and 3.10.
  • Myriad of dependency updates.
  • ONNX Models can now be used in multi-model flows (this is an experimental feature).
  • Bug fixes:
  • PyTorch Models now detach data from the GPU correctly when running Conduits.
  • Fixed ONNX models input data being malformed.
  • Fixed the CUDA driver not being loaded correctly in the Conduit runtime.
  • LLM updates:
  • FLAN-T5 and Stable Diffusion no longer suffer from timeouts and severe cold starts.


Date: 13 December 2022

  • Client updates after 0.4.1 are now backwards compatible and will not break existing deployments. We will support every version from 0.4.1 onward for the foreseeable future, but will not support older versions of the client. Please upgrade to the latest version of the client to ensure you are using the latest features and fixes. If you want to use a new version of the client in your deployment, simply upgrade and redeploy!
  • Added the ability to define a Conduit object directly.
  • We have added external monitoring tool support! You can now monitor your deployments with Censius by adding a logger to the Conduit object. While it is available to all users, this functionality is in active development and may change in the coming weeks based on feedback. We are also working on support for other monitoring tools, with Arize on track for release in Q1 2023.
  • Fixed a bug where the client was unable to deploy SKLearn models, XGB Regressor models, and ONNX models.
  • Reworked the model response signature. A call now returns a JSON object with three fields:
  • result: The data returned by the deployed Conduit.
  • run_id: The ID of the run.
  • prediction_ids: The prediction IDs of each prediction made. Used to track/update the predictions logged to monitoring tools.
  • Added pre-built deployments for the following LLMs:
  • whisper-medium
  • dreambooth
  • Added webhook support for the following LLMs:
  • dreambooth
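The reworked response shape above can be handled like any JSON payload. The three field names are taken from this changelog; the values below are purely illustrative:

```python
import json

# Illustrative response body using the three documented fields;
# the actual values returned by a deployment will differ.
raw = '{"result": [0.91], "run_id": "run-123", "prediction_ids": ["pred-1"]}'
response = json.loads(raw)

result = response["result"]                  # data returned by the deployed Conduit
run_id = response["run_id"]                  # ID of this run
prediction_ids = response["prediction_ids"]  # per-prediction IDs for monitoring tools
```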


Date: 25 November 2022

  • We made large changes to our infrastructure to make it more reliable and faster. This means faster deployments, less downtime, and roughly half the inference time!


Date: 23 November 2022

  • Added more models to our Prebuilt library:
  • mt0-xl
  • galactica
  • flan-xl


Date: 11 November 2022

  • Added support for post-processing functions in flows!
  • Added Prebuilt models to the available deployments:
  • whisper-medium


Date: 28 October 2022

  • Fixed a bug where the internal flow would be malformed if the model was not ONNX
  • Forced single-model ONNX flows until post-processing is supported


Date: 26 October 2022

  • Support for ONNX models with Python 3.7-3.9 (can be used as a single model or as the initial model of a flow). We are working on supporting Python 3.10 and 3.11!


Date: 23 October 2022

  • Model ensembles are here! You can now deploy a sequence of models to an endpoint and have them run in order.
  • Changed the deploy call signature to accommodate model ensembles.
  • Added explicit model types to be used in place of the previous plain strings.
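These two changes fit together: explicit model types let an ensemble be declared as an ordered sequence of typed models. A hedged sketch of the idea (the enum members and the (type, artifact) tuple shape are assumptions for illustration, not the actual cerebrium API):

```python
from enum import Enum

# Hypothetical stand-in for the explicit model types that replaced
# plain strings; the real cerebrium names may differ.
class ModelType(Enum):
    SKLEARN = "sklearn"
    TORCH = "torch"
    ONNX = "onnx"

# An ensemble: a sequence of (model type, model artifact) pairs,
# run in order when the endpoint is called.
flow = [
    (ModelType.SKLEARN, "preprocessor.pkl"),
    (ModelType.TORCH, "classifier.pt"),
]
```

Typed members catch misspellings at definition time, where a plain string like "sklaern" would only fail at deploy time.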


Date: 18 October 2022

  • Reduced the minimum Python version to 3.7 (and reduced dependency minimum versions accordingly). Now compatible with Google Colab!


Date: 17 October 2022

  • Updated the license
  • Added handling for usage limits on different Cerebrium packages


Date: 16 October 2022

  • Released on the public PyPI!
  • Renamed the package to cerebrium


Date: 12 October 2022

  • Fixed an issue with data input shapes
  • Updated our documentation
  • Released Model/Flow Versioning
  • Improved Error Handling and Messaging on the Neuron client
  • Implemented a utility function for easy API testing
  • Provided a dry-run option for model deployment, which returns a callable function for testing
  • Fixed a deserialization issue with PyTorch models (requires cloudpickle)
  • Added support for TorchScript models
  • Added support for XGB JSON models
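The dry-run option can be illustrated without the client: instead of deploying, a callable is returned that runs the same flow locally so it can be tested first. A hypothetical sketch (the function names and flow shape are illustrative, not the real cerebrium API):

```python
# Hypothetical sketch of the dry-run idea: rather than deploying, hand
# back a local callable that applies each stage of the flow in order.
def build_dry_run(flow):
    def local_predict(data):
        for step in flow:  # apply each stage of the flow in sequence
            data = step(data)
        return data
    return local_predict

# Two toy stages standing in for real models
predict = build_dry_run([
    lambda xs: [x * 2 for x in xs],
    lambda xs: [x + 1 for x in xs],
])
out = predict([1, 2, 3])
```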


Date: 7 October 2022

  • Moved architecture to serverless CPU/GPUs
  • Added support for PyTorch, XGBoost, and SKLearn models
  • Updated monitoring metrics per model version
  • Bug fixes and performance improvements


Date: 26 September 2022

  • Released an alpha version of Neuron.
  • Users are able to deploy models from their notebooks or .py files with just 4 lines of code.
  • Created dashboard interface for users to see deployed models, API calls, and errors.
  • Created this documentation site.