Request
The transcribe() function transcribes audio. Pass either a url or a content argument to it.
Parameters
- language (required) - Language code: en, yo, ha, ig, am
- content (optional) - Audio file content (binary data)
- url (optional) - URL to the audio file
- model (optional) - STT model to use: mansa_v1, legacy, human. Defaults to legacy
- special_words (optional) - Custom words to help with recognition accuracy
- timestamp (optional) - Timestamp granularity: sentence, word, none. Defaults to none
You must provide either content or url, but not both.
STT Models
The model parameter allows you to choose different speech-to-text models:
- legacy (default) - Standard transcription model
- mansa_v1 - Enhanced model with better accuracy for African languages
- human - High-quality model optimized for human speech patterns
Timestamp Options
The timestamp parameter controls the level of timing information returned:
- none (default) - No timestamp information
- sentence - Timestamps for each sentence
- word - Timestamps for each individual word
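The rules from the Parameters, STT Models, and Timestamp Options sections can be checked before a request is sent. The following is a minimal sketch, not part of the API: build_transcribe_params is a hypothetical helper name, and the accepted values are copied from the lists above.

```python
# Values documented in the Parameters, STT Models, and Timestamp Options sections.
LANGUAGES = {"en", "yo", "ha", "ig", "am"}
MODELS = {"mansa_v1", "legacy", "human"}
TIMESTAMPS = {"sentence", "word", "none"}


def build_transcribe_params(language, content=None, url=None,
                            model="legacy", special_words=None,
                            timestamp="none"):
    """Hypothetical helper: validate arguments and return them as a dict."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if timestamp not in TIMESTAMPS:
        raise ValueError(f"unknown timestamp granularity: {timestamp!r}")
    # Exactly one of content / url must be supplied.
    if (content is None) == (url is None):
        raise ValueError("provide either content or url, but not both")

    params = {"language": language, "model": model, "timestamp": timestamp}
    if content is not None:
        params["content"] = content
    else:
        params["url"] = url
    # special_words is passed through unchanged; its exact type is not
    # specified in this section.
    if special_words is not None:
        params["special_words"] = special_words
    return params
```

The returned dict could then be unpacked into the call, for example transcribe(**params), assuming transcribe accepts these names as keyword arguments.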
Best Practices for Use
- You can provide either the content (file) or url (str), but do not provide both.
- The maximum file size is 25MB; we will support larger sizes in the future.
- We only support mp3, wav, m4a, and ogg file formats.
- If you provide url, ensure that access to the file is not blocked by authentication.
- When transcribing, you should use the language code (e.g. en, yo, ig) and not the full language name.
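The format and size limits above can be checked locally before uploading. This is a minimal sketch under stated assumptions: check_audio_file is a hypothetical helper, the format is inferred from the file extension only, and 25MB is read as 25 * 1024 * 1024 bytes because the exact unit is not specified.

```python
import os

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg"}
# Assumption: "25MB" is read as 25 * 1024 * 1024 bytes; the documentation
# does not say whether it means MB or MiB.
MAX_FILE_BYTES = 25 * 1024 * 1024


def check_audio_file(path, size_bytes=None):
    """Hypothetical pre-flight check; returns a list of problems (empty if none).

    If size_bytes is not given, the size is read from disk.
    """
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes is None:
        size_bytes = os.path.getsize(path)
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes; the limit is {MAX_FILE_BYTES}")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem with a file at once.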
Response
The transcription response is returned as JSON.
- The Content-Type is application/json
- A request_id is returned for issue resolution with our support team.
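As an illustration of handling such a response, the sketch below decodes a JSON body and keeps the request_id for support requests. It assumes request_id is a top-level key of the JSON body; the text field and the sample values are made-up placeholders, and parse_transcription_response is a hypothetical helper name.

```python
import json


def parse_transcription_response(body):
    """Decode a JSON response body (bytes or str); return (payload, request_id)."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    payload = json.loads(body)
    # Assumption: request_id sits at the top level of the JSON object.
    return payload, payload.get("request_id")


# Illustrative body only: the "text" key and both values are placeholders.
sample = b'{"request_id": "example-id", "text": "placeholder transcript"}'
payload, request_id = parse_transcription_response(sample)
print(request_id)
```

Logging the request_id alongside any error makes it easier to quote when contacting support.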
Language Support
Our speech-to-text model supports the following languages:
- Hausa: ha
- Igbo: ig
- Yoruba: yo
- Amharic: am
- English: en
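Because the language parameter expects a code rather than a name, a small lookup built from the list above can normalize user input. This is a sketch only; to_language_code is a hypothetical helper, not part of the API.

```python
# Name-to-code mapping taken from the Language Support list.
LANGUAGE_CODES = {
    "hausa": "ha",
    "igbo": "ig",
    "yoruba": "yo",
    "amharic": "am",
    "english": "en",
}


def to_language_code(value):
    """Hypothetical helper: accept a language name or code, return the code."""
    key = value.strip().lower()
    if key in LANGUAGE_CODES.values():
        return key  # already a supported code
    if key in LANGUAGE_CODES:
        return LANGUAGE_CODES[key]
    raise ValueError(f"unsupported language: {value!r}")
```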
Examples - file
Python
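A minimal sketch of the file case, assuming the transcribe function described in the Request section is passed in as a callable (how it is imported and authenticated is outside this sketch). transcribe_file is a hypothetical wrapper name, and the file path in the usage note is a placeholder.

```python
def transcribe_file(transcribe, path, language, **options):
    """Read a local audio file and pass its bytes as the content argument.

    transcribe is the function described in the Request section, supplied
    by the caller; options may carry model, special_words, or timestamp.
    """
    with open(path, "rb") as audio:
        content = audio.read()
    # Only content is sent; url must not be provided at the same time.
    return transcribe(language=language, content=content, **options)
```

A call might look like transcribe_file(transcribe, "recording.wav", "yo", model="mansa_v1").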
Examples - url
Python
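A minimal sketch of the URL case, under the same assumption that the transcribe function described in the Request section is passed in as a callable. transcribe_url is a hypothetical wrapper name, and the URL in the usage note is a placeholder.

```python
def transcribe_url(transcribe, url, language, **options):
    """Pass an audio URL instead of file content.

    The URL must be reachable without authentication; options may carry
    model, special_words, or timestamp.
    """
    # Only url is sent; content must not be provided at the same time.
    return transcribe(language=language, url=url, **options)
```

A call might look like transcribe_url(transcribe, "https://example.com/audio.mp3", "en", timestamp="sentence").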