Mansa is a 1.4B-parameter ASR model optimized for African languages, offering high transcription accuracy and low-latency performance. Some of the key features are:
African named entity recognition in English contexts
Custom spelling guidance for names and specialized terms
Sentence- or word-level timestamps for audio up to 30 minutes (25 MB)
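Because uploads are capped at 25 MB, it can help to check a file's size locally before sending it. The snippet below is a minimal sketch using only the Python standard library. The helper names `fits_size_limit` and `check_audio_file` are hypothetical and not part of the SDK, and the byte value chosen for "25 MB" is an assumption (the stricter decimal reading), since the exact limit in bytes is not stated here. It does not check the 30-minute duration limit.

```python
import os

# Assumption: "25 MB" is read as 25 * 1000 * 1000 bytes, the stricter of the
# two common interpretations. Adjust if the service documents an exact value.
MAX_AUDIO_BYTES = 25 * 1000 * 1000


def fits_size_limit(size_bytes: int, limit: int = MAX_AUDIO_BYTES) -> bool:
    """Return True if a non-empty file of this size is within the limit."""
    return 0 < size_bytes <= limit


def check_audio_file(path: str) -> bool:
    """Return True if the file at `path` is non-empty and within the limit."""
    return fits_size_limit(os.path.getsize(path))
```

Calling `check_audio_file("YOUR-AUDIO-FILE")` before a transcription request avoids uploading a file that would likely be rejected for its size.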
Before you can try out our new model, you need to make sure that you’re using the latest version of our SDK.
pip install spitch==1.34.0
Ready to give it a spin? Use the code sample below to get started.
from spitch import Spitch

client = Spitch(api_key="YOUR-API-KEY")

response = client.speech.transcribe(
    content=open("YOUR-AUDIO-FILE", "rb"),
    language="en",
    model="mansa_v1",
    timestamps="sentence",
    special_words=["Spitch"]  # Add your special words here
)

"""
RESPONSE FORMAT
---------------
SpeechTranscribeResponse(
    request_id='35580dcf-xxxx-4666-8d75-xxxxxxxxxx',
    text="I'm having some power issues, I won't be available for a bit. All right. Please let me know when you are back.",
    timestamps=[
        Timestamp(end=4.64, start=1.2, text="I'm having some power issues, I won't be available for a bit."),
        Timestamp(end=6.72, start=5.8, text=' All right.'),
        Timestamp(end=9.92, start=8.08, text='Please let me know when you are back.')
    ]
)

# TO ACCESS TRANSCRIPT
print(response.text)

# TO ACCESS TIMESTAMPS
print(response.timestamps)
"""
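The timestamps in the response format above carry `start`, `end`, and `text` fields, so they can be turned into readable caption lines with a few lines of code. The following is a minimal sketch, not part of the SDK: `Segment` is a hypothetical stand-in for the returned timestamp objects so the example runs without an API key, and `format_clock` and `format_segments` are illustrative helper names.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for a timestamp entry (start, end, text)."""
    start: float
    end: float
    text: str


def format_clock(seconds: float) -> str:
    """Format seconds as MM:SS.cc, working in whole centiseconds."""
    total_cs = round(seconds * 100)
    minutes, cs = divmod(total_cs, 6000)
    return f"{minutes:02d}:{cs // 100:02d}.{cs % 100:02d}"


def format_segments(timestamps):
    """Return one '[start - end] text' line per timestamp entry."""
    return [
        f"[{format_clock(t.start)} - {format_clock(t.end)}] {t.text.strip()}"
        for t in timestamps
    ]


# Sample values copied from the response format shown above.
sample = [
    Segment(start=1.2, end=4.64,
            text="I'm having some power issues, I won't be available for a bit."),
    Segment(start=5.8, end=6.72, text=" All right."),
]

for line in format_segments(sample):
    print(line)
```

Assuming the real response exposes the same attribute names as the printed format suggests, `format_segments(response.timestamps)` should work the same way on a live result.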
Mansa.v1 currently supports only English. Support for additional languages will be available in upcoming releases.
You can use special_words to guide Mansa’s named entity recognition. Pass a list of entity strings, and Mansa will use them to recognize and transcribe those terms more accurately in your audio.
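When the terms come from user input or a database, it may be worth tidying the list before passing it as `special_words`. The sketch below is illustrative only: `build_special_words` and `transcribe_with_terms` are hypothetical helpers, and the case-insensitive de-duplication is an assumption about what is useful, not documented API behavior. The `transcribe` call uses only the parameters shown in the sample earlier on this page, and the client is passed in so the helper can be exercised without an API key.

```python
def build_special_words(terms):
    """Strip whitespace, drop empty entries, and remove case-insensitive
    duplicates while keeping the first spelling and the original order."""
    seen = set()
    cleaned = []
    for term in terms:
        word = term.strip()
        key = word.casefold()
        if word and key not in seen:
            seen.add(key)
            cleaned.append(word)
    return cleaned


def transcribe_with_terms(client, audio_path, terms):
    """Send one file for transcription with a cleaned special_words list.

    `client` is expected to be a Spitch client created as in the sample
    above; only parameters shown in that sample are used here.
    """
    with open(audio_path, "rb") as audio:
        return client.speech.transcribe(
            content=audio,
            language="en",
            model="mansa_v1",
            timestamps="sentence",
            special_words=build_special_words(terms),
        )
```

For example, `build_special_words([" Spitch ", "spitch", "", "Mansa"])` yields `["Spitch", "Mansa"]`, which can then be sent with `transcribe_with_terms(client, "YOUR-AUDIO-FILE", terms)`.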