GPT4All
GPT4All is a 7B-parameter language model fine-tuned on a curated set of 400k GPT-3.5-Turbo assistant-style generations. You can read more here. To deploy it, use the identifier below:
- GPT4All: gpt4-all
Here’s an example of how to call the deployed endpoint:
Request Parameters
curl --location --request POST 'https://run.cerebrium.ai/gpt4-all-webhook/predict' \
--header 'Authorization: <API_KEY>' \
--header 'Content-Type: application/json' \
--data-raw '{
"prompt": "Tell me a story of a tortoise",
"num_beams": 2,
"min_new_tokens": 10,
"max_length": 100,
"repetition_penalty": 2.0
}'
- Authorization: The Cerebrium API key used to authenticate your request. You can get it from your Cerebrium dashboard.
- prompt: The prompt you would like GPT4All to process.
- num_beams: The number of beams used in beam search. A larger beam size raises the likelihood of finding a better output, but it also increases computational cost and inference time.
- min_new_tokens: The minimum number of tokens the model must generate.
- max_length: The maximum number of tokens the model may generate. The default is 100.
- repetition_penalty: A parameter that penalizes repeated words or tokens in the generated text. This must be a number greater than or equal to 1.
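For callers who prefer Python to curl, the request above can be sketched with the standard library alone. This is a minimal sketch, not an official client: the function names `build_request` and `predict` are hypothetical, and the keyword defaults simply mirror the values in the curl example rather than documented API defaults (only the `max_length` default of 100 is stated in this page).

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
ENDPOINT = "https://run.cerebrium.ai/gpt4-all-webhook/predict"


def build_request(api_key, prompt, num_beams=2, min_new_tokens=10,
                  max_length=100, repetition_penalty=2.0):
    """Build (but do not send) a POST request mirroring the curl example.

    Hypothetical helper: the defaults copy the example values and are
    assumptions, not documented API defaults.
    """
    # This page states the penalty must be greater than or equal to 1.
    if repetition_penalty < 1:
        raise ValueError("repetition_penalty must be >= 1")
    payload = {
        "prompt": prompt,
        "num_beams": num_beams,
        "min_new_tokens": min_new_tokens,
        "max_length": max_length,
        "repetition_penalty": repetition_penalty,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def predict(api_key, prompt, **params):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_request(api_key, prompt, **params)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `predict("<API_KEY>", "Tell me a story of a tortoise")` would send the same body as the curl command. Keeping `build_request` separate from the network call makes it possible to inspect the headers and payload without contacting the endpoint.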
{
"run_id": "dc8f23ab-7237-42dc-b6cf-430abdbba8f7",
"run_time_ms": 6077.8913497924805,
"message": "Ran successfully",
"result": ".\nThe tortoise was lying on the ground, its shell glistening in the sunlight. It looked up at me with its beady black eyes, and I felt a pang of sympathy for the poor creature. Its legs were tucked underneath its body, and it seemed to be struggling to move. I knelt down next to it and reached out a hand to touch its shell. The tortoise recoiled"
}
Response Parameters
- run_id: A unique identifier for the run that you can use to associate prompts with webhook endpoints.
- run_time_ms: The amount of time, in milliseconds, it took to run your function. This is what you will be billed for.
- message: Whether or not the response was successful.
- result: The text generated by GPT4All.
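The response fields described above can be read into a small typed structure. The following is a minimal sketch under stated assumptions: `PredictResponse` and `parse_response` are hypothetical names, and the sample body is the example response from this page with the `result` text shortened.

```python
import json
from dataclasses import dataclass


@dataclass
class PredictResponse:
    """Hypothetical container for the four documented response fields."""
    run_id: str
    run_time_ms: float
    message: str
    result: str

    @property
    def run_time_seconds(self):
        # run_time_ms is reported in milliseconds.
        return self.run_time_ms / 1000.0


def parse_response(raw):
    """Decode a JSON response body into a PredictResponse."""
    data = json.loads(raw)
    return PredictResponse(
        run_id=data["run_id"],
        run_time_ms=float(data["run_time_ms"]),
        message=data["message"],
        result=data["result"],
    )


# Example response from this page, with the result text shortened.
SAMPLE = r'''{
  "run_id": "dc8f23ab-7237-42dc-b6cf-430abdbba8f7",
  "run_time_ms": 6077.8913497924805,
  "message": "Ran successfully",
  "result": ".\nThe tortoise was lying on the ground, its shell glistening in the sunlight."
}'''

response = parse_response(SAMPLE)
print(response.run_id)
print(response.message)
```

Because billing is based on `run_time_ms`, logging that field alongside `run_id` for each call is one simple way to track usage per request.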