Language Models
GPT4All

GPT4All is a 7B-parameter language model fine-tuned on a curated set of 400k GPT-3.5-Turbo assistant-style generations. You can read more here. To deploy it, use the identifier below:

  • GPT4All: gpt4-all

Here’s an example of how to call the deployed endpoint:

Request Parameters

  curl --location --request POST 'https://run.cerebrium.ai/gpt4-all-webhook/predict' \
      --header 'Authorization: <API_KEY>' \
      --header 'Content-Type: application/json' \
      --data-raw '{
        "prompt": "Tell me a story of a tortoise",
        "num_beams": 2,
        "min_new_tokens": 10,
        "max_length": 100,
        "repetition_penalty": 2.0
      }'

Authorization (required)
string

This is the Cerebrium API key used to authenticate your request. You can get it from your Cerebrium dashboard.

prompt (required)
string

The prompt you would like GPT4All to process.

num_beams (optional)
integer

The number of beams used in beam search. A larger beam size raises the likelihood of finding a better output, but it also increases computational cost and inference time.

min_new_tokens (optional)
integer

The minimum number of tokens the model must generate.

max_length (optional)
integer

The maximum number of tokens the model may generate. The default is 100.

repetition_penalty (required)
number

A parameter used in text generation to penalize repeated words or tokens in the generated text. This is a number greater than or equal to 1.
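
The same request can be made from Python. The following is a minimal sketch using only the standard library; the endpoint URL and field names come from the curl example above, while the function names `build_request` and `predict` are hypothetical, and the keyword defaults simply mirror the example values rather than documented server defaults (only the default of 100 for `max_length` is stated on this page).

```python
import json
import urllib.request

# Endpoint URL copied from the curl example above.
ENDPOINT = "https://run.cerebrium.ai/gpt4-all-webhook/predict"


def build_request(api_key, prompt, num_beams=2, min_new_tokens=10,
                  max_length=100, repetition_penalty=2.0):
    """Build (but do not send) the POST request for the GPT4All endpoint.

    Defaults mirror the example request, not documented server defaults.
    """
    if not prompt:
        raise ValueError("prompt is required")
    if repetition_penalty < 1:
        # The parameter description requires a value >= 1.
        raise ValueError("repetition_penalty must be >= 1")

    payload = {
        "prompt": prompt,
        "num_beams": num_beams,
        "min_new_tokens": min_new_tokens,
        "max_length": max_length,
        "repetition_penalty": repetition_penalty,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def predict(api_key, prompt, timeout=60, **params):
    """Send the request and return the decoded JSON response as a dict."""
    req = build_request(api_key, prompt, **params)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `predict("<API_KEY>", "Tell me a story of a tortoise")` would send the same body as the curl example. Building the request separately from sending it makes the payload easy to inspect before any network call is made.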

{
    "run_id": "dc8f23ab-7237-42dc-b6cf-430abdbba8f7",
    "run_time_ms": 6077.8913497924805,
    "message": "Ran successfully",
    "result": ".\nThe tortoise was lying on the ground, its shell glistening in the sunlight. It looked up at me with its beady black eyes, and I felt a pang of sympathy for the poor creature. Its legs were tucked underneath its body, and it seemed to be struggling to move. I knelt down next to it and reached out a hand to touch its shell. The tortoise recoiled"
}

Response Parameters

run_id (required)
string

A unique identifier for the run that you can use to associate prompts with webhook endpoints.

run_time_ms (required)
number

The amount of time, in milliseconds, it took to run your function. This is what you will be billed for.

message (required)
string

Whether or not the run was successful.

result (required)
string

The text generated by GPT4All.
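
To work with the response in code, it can help to check that all four fields are present before using them. The sketch below assumes the response body has exactly the shape of the example above; the names `PredictResponse` and `parse_response` are hypothetical and not part of any Cerebrium client library.

```python
import json
from dataclasses import dataclass


@dataclass
class PredictResponse:
    run_id: str
    run_time_ms: float
    message: str
    result: str

    @property
    def run_time_seconds(self):
        """Billed run time converted from milliseconds to seconds."""
        return self.run_time_ms / 1000.0


def parse_response(raw):
    """Parse a JSON response body (str) into a PredictResponse.

    Raises KeyError if any of the four documented fields is missing.
    """
    data = json.loads(raw)
    missing = [k for k in ("run_id", "run_time_ms", "message", "result")
               if k not in data]
    if missing:
        raise KeyError("missing response fields: " + ", ".join(missing))
    return PredictResponse(
        run_id=str(data["run_id"]),
        run_time_ms=float(data["run_time_ms"]),
        message=str(data["message"]),
        result=str(data["result"]),
    )
```

For the example response above, `parse_response(body).run_time_seconds` gives roughly 6.08 seconds of billed run time.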