Request
The generate() function can be used to generate speech. The following parameters are available:
Parameters
- text (required) - The text you want to convert to speech
- language (required) - Language code: en, yo, ha, ig, am
- voice (required) - Voice ID to use for generation
- model (optional) - Model to use, defaults to legacy
- format (optional) - Output audio format, defaults to wav
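The parameter rules above can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: `build_generate_params` is a hypothetical helper that only encodes the required fields, language codes, and defaults listed above.

```python
# Hypothetical helper, not part of the SDK: it only mirrors the parameter list above.
SUPPORTED_LANGUAGES = {"en", "yo", "ha", "ig", "am"}


def build_generate_params(text, language, voice, model="legacy", audio_format="wav"):
    """Return keyword arguments for generate(), applying the documented defaults."""
    if not text or not text.strip():
        raise ValueError("text is required")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    if not voice:
        raise ValueError("voice is required")
    return {
        "text": text,
        "language": language,
        "voice": voice,
        "model": model,
        "format": audio_format,
    }


# "example-voice" is a placeholder; real voice IDs are listed on the Voices page.
params = build_generate_params("Hello, world.", language="en", voice="example-voice")
```

Validating locally catches a missing field or an unsupported language code before any network call is made.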
Audio Formats
The format parameter supports the following audio formats:
- wav (default) - Standard wave format
- mp3 - MPEG Layer III audio
- ogg_opus - OGG container with Opus codec
- webm_opus - WebM container with Opus codec
- flac - Free Lossless Audio Codec
- pcm_s16le - Raw PCM 16-bit little-endian
- mulaw - μ-law encoded audio
- alaw - A-law encoded audio
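When saving output, it can help to derive a file name from the chosen format. The sketch below is illustrative only: the format names come from the list above, but the file extensions are assumed naming conventions rather than anything the API prescribes, and `output_filename` is a hypothetical helper.

```python
# Format names come from the list above; the extensions are assumed conventions.
FORMAT_EXTENSIONS = {
    "wav": ".wav",
    "mp3": ".mp3",
    "ogg_opus": ".ogg",
    "webm_opus": ".webm",
    "flac": ".flac",
    "pcm_s16le": ".pcm",
    "mulaw": ".ulaw",
    "alaw": ".alaw",
}


def output_filename(stem, audio_format="wav"):
    """Append the extension for audio_format, rejecting unknown formats."""
    try:
        return stem + FORMAT_EXTENSIONS[audio_format]
    except KeyError:
        raise ValueError(f"unsupported format: {audio_format!r}") from None
```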
Best Practices for Use
- We highly recommend tone-marking your text before TTS. This allows the model to pronounce the words properly during speech generation.
- Make sure your text is correctly punctuated before sending it for speech generation; this produces more natural and accurate output.
- Not all voices work for all languages. Select a voice that matches the language you have chosen. More information on voices can be found on the Voices page.
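The voice and language rule above can be enforced with a small pre-flight check. This is a sketch under assumptions: `check_voice` is a hypothetical helper, and the voice IDs in `VOICES_BY_LANGUAGE` are placeholders to be replaced with the real pairings from the Voices page.

```python
# Placeholder data: replace with the real voice/language pairings from the Voices page.
VOICES_BY_LANGUAGE = {
    "en": {"voice-a"},
    "yo": {"voice-b"},
}


def check_voice(language, voice, voices_by_language=VOICES_BY_LANGUAGE):
    """Return the voice if it is listed for the language, else raise ValueError."""
    allowed = voices_by_language.get(language, set())
    if voice not in allowed:
        raise ValueError(f"voice {voice!r} is not listed for language {language!r}")
    return voice
```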
Response
The response for speech generation is in bytes.
- The Content-Type is audio/wav.
- The content is streamed back to the caller.
- The file type of the generated audio is wav. If you use the streaming interface (Python SDK), you can act on the byte chunks as they arrive, e.g. stream them to a file.
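Acting on the chunks as they arrive might look like the following sketch. It assumes only that the streamed response can be treated as an iterable of `bytes` chunks; how the SDK exposes that iterable is not shown here, and `save_stream` is a hypothetical helper.

```python
from typing import Iterable


def save_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write byte chunks to path as they arrive; return the total bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if chunk:  # ignore empty chunks
                f.write(chunk)
                total += len(chunk)
    return total
```

Writing each chunk as it arrives keeps memory use flat regardless of the length of the generated audio.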
Choosing a Voice
We currently have 8 characters with unique voices for the supported languages. Each of these characters has unique attributes, and we think you will find them fun to use. Feel free to try them out and let us know which one you love the most. 😉
Language Support
The speech generation model supports the following languages:
- English: en
- Hausa: ha
- Igbo: ig
- Amharic: am
- Yoruba: yo
Examples
Python

