spaCy is an open-source library for advanced natural language processing that excels at large-scale information extraction tasks. By the end of this guide you’ll have a spaCy API endpoint that can handle any scale of traffic by running inference on serverless CPUs/GPUs.

Project set up

Before building you need to set up a Cerebrium account. This is as simple as starting a new Project in Cerebrium and copying the API key. This will be used to authenticate all calls for this project.

Create a project

  1. Go to dashboard.cerebrium.ai
  2. Sign up or log in
  3. Navigate to the API Keys page
  4. You should see an API key with the source “Cerebrium”. Click the eye icon to display it. It will be in the format: c_api_key-xxx


Prepare model

We are going to create a custom named entity recognition (NER) model. This code could be replaced by any spaCy model. Make sure you have the required libraries installed; in this case we are just using spaCy 3.

First let us create our training and evaluation data:

    "Who is Shaka Khan?",
    "Who is Michael?",
spaCy 3 no longer accepts the JSON format directly, so we have to convert it to spaCy's binary .spacy format:

import srsly
import typer
import warnings
from pathlib import Path

import spacy
from spacy.tokens import DocBin

def convert(lang: str, input_path: Path, output_path: Path):
    nlp = spacy.blank(lang)
    db = DocBin()
    for text, annot in srsly.read_json(input_path):
        doc = nlp.make_doc(text)
        ents = []
        for start, end, label in annot["entities"]:
            span = doc.char_span(start, end, label=label)
            if span is None:
                # char_span returns None when the offsets do not fall on token boundaries
                msg = f"Skipping entity [{start}, {end}, {label}] in the following text because the character span '{doc.text[start:end]}' does not align with token boundaries:\n\n{repr(text)}\n"
                warnings.warn(msg)
            else:
                ents.append(span)
        doc.ents = ents
        db.add(doc)
    # Serialise all annotated docs into a single .spacy file
    db.to_disk(output_path)

if __name__ == "__main__":
    typer.run(convert)

Now run this script with the location of your training data, and again with your evaluation data to produce eval.spacy:

!python convert.py en train.json train.spacy

When creating a spaCy model we can start with the standard template and edit it. Run the following command to generate it:

!python -m spacy init config --lang en --pipeline ner config.cfg --force

Lastly, we can train our model:

!python -m spacy train config.cfg --output ./training/ --paths.train train.spacy --paths.dev eval.spacy --training.eval_frequency 10 --training.max_steps 100 --gpu-id -1
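Before deploying, it can be worth checking the trained pipeline locally. spacy train writes model-best and model-last directories under the output path, so with --output ./training/ the best checkpoint should be at training/model-best. The sketch below loads it and lists the recognised entities. The helper names are hypothetical, and with so few training steps the model may well return no entities.

```python
from pathlib import Path


def extract_entities(doc):
    """Return (text, label) pairs for every entity on a spaCy Doc-like object."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def run_local_check(model_dir, text):
    """Load a trained pipeline from disk and return the entities found in text."""
    # Imported lazily so extract_entities can be used without spaCy installed.
    import spacy

    nlp = spacy.load(Path(model_dir))
    return extract_entities(nlp(text))


# Example (requires spaCy and the trained pipeline on disk):
# print(run_local_check("training/model-best", "Who is Shaka Khan?"))
```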

Developing model

To start, install the Cerebrium framework by running the following command in your notebook or terminal:

pip install cerebrium

One of the main differences with a spaCy model deployed on Cerebrium is that you must include a post-processing function, which uses your trained model to manipulate text. This function has to return a list or string in order to be accepted; its return value is what the API returns to you. Below we show a very basic post-processing function and deploy our spaCy model.

When you train a spaCy model, it will create a folder with a tokenizer, meta.json, config.cfg etc. You will need to specify the path of this folder when deploying your model.
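A quick check that the folder looks like a serialised spaCy pipeline can catch a wrong path before you deploy. The sketch below only looks for the entries named above (meta.json, config.cfg and a tokenizer entry). The helper name is hypothetical, and this check is a convenience, not something Cerebrium requires.

```python
from pathlib import Path

# Entries mentioned in the text above; a real pipeline folder usually holds more.
EXPECTED_ENTRIES = ("meta.json", "config.cfg", "tokenizer")


def missing_model_entries(model_dir):
    """Return the expected entries that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # Assumption: the pipeline was trained with --output ./training/
    missing = missing_model_entries("training/model-best")
    print("missing:", missing or "nothing")
```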

At the moment we only support spaCy models that have been serialised on Python 3.10.

from cerebrium import deploy, model_type

# doc is the model returned from spacy.load()
def postSpacy(doc):
    test = []
    for token in doc:
        # Illustrative loop body (assumption): collect each token's text and entity label
        test.append([token.text, token.ent_type_])
    return test

endpoint = deploy((model_type.SPACY, "path/to/trained/model/", {"post": postSpacy}), "spacy-model", "<API_KEY>")

Deployed Model

Your model is now deployed and ready for inference all in under 10 seconds! Navigate to the dashboard and on the Models page you will see your model.

You can run inference using curl:

curl --location --request POST '<ENDPOINT>' \
--header 'Authorization: <API_KEY>' \
--header 'Content-Type: application/json' \
--data-raw '["This is us testing the Spacy model"]'
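The same request can be sent from Python. This is a minimal sketch using only the standard library; it mirrors the headers and JSON body of the curl call, and <ENDPOINT> and <API_KEY> remain placeholders you must replace. The function names are hypothetical, and the sketch assumes the endpoint replies with JSON.

```python
import json
import urllib.request


def build_request(endpoint, api_key, texts):
    """Build a POST request mirroring the curl call: JSON list body, API key header."""
    body = json.dumps(texts).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def run_inference(endpoint, api_key, texts):
    """Send the request and return the decoded JSON response (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(endpoint, api_key, texts)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (replace the placeholders with your endpoint URL and API key):
# print(run_inference("<ENDPOINT>", "<API_KEY>", ["This is us testing the Spacy model"]))
```

Building the request separately from sending it makes it easy to inspect the headers and body before any network call is made.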

The response will be:

[Image: Spacy Postman Response]

Navigate back to the dashboard and click on the name of the model you just deployed. You will see an API call was made and the inference time. From your dashboard you can monitor your model, roll back to previous versions and see traffic.

With one line of code, your model was deployed in seconds with automatic versioning, monitoring and the ability to scale based on traffic spikes. Try deploying your own model now or check out our other frameworks.