Using Distil-Whisper to transcribe an audio file
Add the packages you need to the [cerebrium.dependencies.pip] section of your cerebrium.toml file:
Create a util.py file for our utility functions, which download a file from a URL or convert a base64 string to a file:
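As a minimal sketch of what such utilities could look like, the following uses only the Python standard library. The function names, the default .mp3 suffix, and the 60-second timeout are assumptions made for illustration, not necessarily what the original util.py contains.

```python
import base64
import os
import tempfile
import urllib.request
from typing import Optional


def download_file_from_url(url: str, filename: str) -> str:
    """Download `url` to `filename` and return the local path (hypothetical helper)."""
    # urlopen raises on network failures and HTTP error statuses,
    # so a failed download surfaces as an exception here.
    with urllib.request.urlopen(url, timeout=60) as response, open(filename, "wb") as out:
        # Copy in 1 MiB chunks so large audio files are not held in memory at once.
        while chunk := response.read(1 << 20):
            out.write(chunk)
    return filename


def save_base64_string_to_file(audio: str, filename: Optional[str] = None) -> str:
    """Decode a base64 string, write the bytes to disk, and return the path (hypothetical helper)."""
    data = base64.b64decode(audio)
    if filename is None:
        # Assumed default: a temporary .mp3 file; adjust the suffix to your audio format.
        fd, filename = tempfile.mkstemp(suffix=".mp3")
        os.close(fd)
    with open(filename, "wb") as out:
        out.write(data)
    return filename
```

Both helpers return a local path, so the calling code can treat a URL and a base64 payload the same way once the file exists on disk.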
Then create a main.py file with our main Python code. Users can send either a base64-encoded string or a public URL of the audio file. We’ll pass this file to our model and return the output. First, let’s define our request object:
The audio and file_url parameters are both optional, but we ensure at least one is provided. The webhook_endpoint parameter, automatically included by Cerebrium in every request, is useful for long-running requests.
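One way to express a request object with these three fields is sketched below. It assumes a standard-library dataclass and a hypothetical class name, Item; the original code may use a different model class, and the error message is illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """Hypothetical request object; only the field names come from the text."""

    audio: Optional[str] = None             # base64-encoded audio
    file_url: Optional[str] = None          # public URL of the audio file
    webhook_endpoint: Optional[str] = None  # URL that receives the result via POST

    def __post_init__(self) -> None:
        # Both inputs are optional individually, but at least one must be present.
        if not self.audio and not self.file_url:
            raise ValueError("Provide either 'audio' (base64) or 'file_url'.")
```

Validating at construction time means a request with neither input fails before any download or model work starts.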
Note: Cerebrium has a 3-minute timeout for each inference request. For long audio files (2+ hours) that take several minutes to process, use a webhook_endpoint: a URL where we’ll send a POST request with your function’s results.
We set up the model outside the predict function since this code should only run on cold start (startup). For warm containers, only the predict function executes for inference.
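The cold-start versus warm-container distinction can be illustrated with a small sketch. Here load_model is a hypothetical stand-in for the expensive model load (in a real app this would be the speech-recognition model), and LOAD_COUNT exists only to make the behaviour visible.

```python
LOAD_COUNT = 0  # illustration only: counts how often the expensive load runs


def load_model():
    """Hypothetical stand-in for loading the transcription model."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    # A real loader would return the model or pipeline object;
    # this stand-in returns a function that fakes a transcript.
    return lambda path: f"transcript of {path}"


# Module level: executes once, when the container starts and imports this file.
model = load_model()


def predict(path: str) -> dict:
    # Runs on every request and reuses the already-loaded model.
    return {"text": model(path)}
```

Calling predict repeatedly leaves LOAD_COUNT at 1, which is the point of keeping the load at module level: only the first request after startup pays for it.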
The predict function, which runs only on inference requests, creates an audio file from either the download URL or base64 string, transcribes it, and returns the output.
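The flow described above can be sketched as follows. The transcribe argument is a hypothetical callable standing in for the model, the "result" key is an assumed output shape, and preferring audio over file_url when both are given is an arbitrary choice for this example.

```python
import base64
import os
import tempfile
import urllib.request
from typing import Callable, Optional


def predict(
    transcribe: Callable[[str], str],
    audio: Optional[str] = None,
    file_url: Optional[str] = None,
) -> dict:
    """Write the input to a temporary file, transcribe it, and return the output."""
    if not audio and not file_url:
        raise ValueError("Provide either 'audio' (base64) or 'file_url'.")

    # Assumed suffix; pick one that matches the audio you expect.
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)
    try:
        if audio:
            # Base64 payload: decode and write the raw bytes.
            with open(path, "wb") as out:
                out.write(base64.b64decode(audio))
        else:
            # Public URL: download straight into the temporary file.
            urllib.request.urlretrieve(file_url, path)
        return {"result": transcribe(path)}
    finally:
        # Remove the temporary file whether or not transcription succeeded.
        os.remove(path)
```

Passing the transcriber in as an argument keeps the sketch self-contained; in a deployed app it would more likely be the model loaded once at startup.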
cerebrium.toml:
run_id: a unique identifier to correlate the result with the initial workload.
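To show how a run_id can be used for correlation, here is a rough sketch of client-side bookkeeping. It assumes the result arrives as a JSON object that contains a run_id key; the function name, the pending dictionary, and every other payload field are hypothetical.

```python
import json
from typing import Any, Dict, Tuple


def match_result(raw_body: bytes, pending: Dict[str, Any]) -> Tuple[Any, dict]:
    """Pair an incoming result with the workload that was submitted under the same run_id."""
    payload = json.loads(raw_body)
    run_id = payload["run_id"]  # assumed to be present in the result payload
    # pop() removes the entry so each run is handled once;
    # an unknown run_id raises KeyError.
    workload = pending.pop(run_id)
    return workload, payload
```

The caller would record each run_id in pending when it submits a request, then call match_result when the corresponding result comes back.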
The endpoint returns results in this format: