Mansa.v1

Mansa is a 1.4B-parameter ASR model optimized for African languages, offering high transcription accuracy and low-latency performance. Key features include:

African named entity recognition in English contexts
Custom spelling guidance for names and specialized terms
Sentence- or word-level timestamps for audio up to 30 minutes (25MB)
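
The 30-minute and 25MB limits listed above can be checked locally before uploading. The following is a minimal sketch, not part of the Spitch SDK: the helper names `wav_duration_seconds` and `check_audio_limits` are hypothetical, 25MB is read conservatively as 25,000,000 bytes (an assumption), and the duration check only covers WAV files because it relies on Python's standard `wave` module.

```python
import os
import wave

# Assumption: "25MB" is read conservatively as 25 * 10^6 bytes.
MAX_UPLOAD_BYTES = 25 * 1000 * 1000
# 30 minutes, expressed in seconds.
MAX_DURATION_SECONDS = 30 * 60


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (frames / frame rate)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_audio_limits(
    path: str,
    max_bytes: int = MAX_UPLOAD_BYTES,
    max_seconds: float = MAX_DURATION_SECONDS,
) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    size = os.path.getsize(path)
    if size > max_bytes:
        problems.append(f"file is {size} bytes, limit is {max_bytes}")

    # Duration is only checked for WAV files; other formats need an audio library.
    if path.lower().endswith(".wav"):
        duration = wav_duration_seconds(path)
        if duration > max_seconds:
            problems.append(f"audio is {duration:.1f}s, limit is {max_seconds}s")

    return problems
```

Calling `check_audio_limits("YOUR-AUDIO-FILE")` before transcribing lets you catch an oversized file without spending a request on it.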

Before trying out the new model, make sure you are using the latest version of our SDK.

pip install spitch==1.34.0

Ready to give it a spin? Use the code sample below to get started.

from spitch import Spitch

client = Spitch(api_key="YOUR-API-KEY")

# Open the audio in binary mode; the context manager closes the file afterwards.
with open("YOUR-AUDIO-FILE", "rb") as audio_file:
    response = client.speech.transcribe(
        content=audio_file,
        language="en",
        model="mansa_v1",
        timestamps="sentence",
        special_words=["Spitch"],  # Add your special words here
    )

# To access the transcript
print(response.text)

# To access the timestamps
print(response.timestamps)

"""
RESPONSE FORMAT
---------------
SpeechTranscribeResponse(
    request_id='35580dcf-xxxx-4666-8d75-xxxxxxxxxx',
    text="I'm having some power issues, I won't be available for a bit.  All right. Please let me know when you are back.",
    timestamps=[
        Timestamp(end=4.64, start=1.2, text="I'm having some power issues, I won't be available for a bit."),
        Timestamp(end=6.72, start=5.8, text=' All right.'),
        Timestamp(end=9.92, start=8.08, text='Please let me know when you are back.'),
    ],
)
"""

Mansa.v1 currently supports English only. Support for additional languages will be available in upcoming releases.
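
Sentence-level timestamps like those in the response format above map naturally onto subtitle files. The sketch below converts a list of segments into SRT text. It is an illustration, not SDK functionality: `Segment`, `srt_time`, and `to_srt` are hypothetical names, and `Segment` is a local stand-in that assumes each timestamp entry exposes `start`, `end`, and `text` attributes as shown in the sample response.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Local stand-in for the Timestamp entries shown in the sample response."""
    start: float  # seconds
    end: float    # seconds
    text: str


def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timecode (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_time(seg.start)} --> {srt_time(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

If the response objects do expose those three attributes, `to_srt(response.timestamps)` should work without converting them to `Segment` first, since the function only reads `start`, `end`, and `text`.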

model (string): "mansa_v1"

timestamp (string): "sentence", "word", or "none"

special_words (string): Use special_words to guide Mansa’s Named Entity Recognition. Pass a list of entity strings to help Mansa recognize and transcribe those terms accurately in your audio.
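
The accepted values above can be validated before a request is sent. The sketch below builds the keyword arguments for a transcription call and cleans up the special-words list. `build_transcribe_options` is a hypothetical helper, not part of the SDK; it uses the `timestamps` keyword as spelled in the code sample, and the case-insensitive de-duplication is an illustrative choice, not documented behavior.

```python
# Values listed in the parameter reference for the timestamp option.
ALLOWED_TIMESTAMPS = {"sentence", "word", "none"}


def build_transcribe_options(timestamps="sentence", special_words=None):
    """Validate options and return keyword arguments for a transcription call."""
    if timestamps not in ALLOWED_TIMESTAMPS:
        raise ValueError(
            f"timestamps must be one of {sorted(ALLOWED_TIMESTAMPS)}, got {timestamps!r}"
        )

    # Strip whitespace, drop empty entries, and remove case-insensitive duplicates
    # while preserving the original order and spelling of the first occurrence.
    cleaned = []
    seen = set()
    for word in special_words or []:
        term = word.strip()
        if term and term.lower() not in seen:
            seen.add(term.lower())
            cleaned.append(term)

    return {
        "model": "mansa_v1",
        "language": "en",
        "timestamps": timestamps,
        "special_words": cleaned,
    }
```

The resulting dictionary can be unpacked into the call from the code sample, for example `client.speech.transcribe(content=audio_file, **build_transcribe_options("word", ["Spitch"]))`.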
