Well-crafted prompts are the most direct way to get natural-sounding speech from Spitch. The guidance below covers what to include and what to avoid so the model delivers the voice, tone, and pronunciation you intend.

1. Start with intent + audience

Open your prompt with a short description that gives the listener context, then supply the exact script you want spoken. Spitch voices read the text verbatim, so avoid meta-instructions like “speak slowly” or “sound excited”; instead, encode those cues directly in the words and punctuation.
Example:
Bedtime story for kids in Lagos.
Once upon a time in Ikorodu... it was the quietest night the twins had ever heard.

2. Select the right voice for the use case

All voices are production-ready, but each carries its own timbre, pacing, and energy. Match the voice to the job rather than using a single default everywhere.

Tip: audition two or three voices for each new product surface, then lock in the one that resonates most with your users. You can preview every voice in the Voices catalog and play sample audio directly in your browser.
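
An audition like this can be scripted so every candidate voice reads the same text. The sketch below is only an illustration: `synthesize` is a hypothetical callable that you would wrap around your real speech-generation call, the voice names are placeholders, and the `.wav` extension is an assumption about the audio format.

```python
from pathlib import Path
from typing import Callable

# Hypothetical signature: (text, voice) -> audio bytes.
# Wrap your actual SDK or HTTP call in a function of this shape.
Synthesize = Callable[[str, str], bytes]


def audition(
    script: str,
    voices: list[str],
    synthesize: Synthesize,
    out_dir: Path,
) -> dict[str, Path]:
    """Render the same script once per candidate voice and save each take."""
    out_dir.mkdir(parents=True, exist_ok=True)
    takes: dict[str, Path] = {}
    for voice in voices:
        target = out_dir / f"{voice}.wav"  # extension is an assumption
        target.write_bytes(synthesize(script, voice))
        takes[voice] = target
    return takes
```

Keeping the script fixed while only the voice changes makes the takes directly comparable when you listen back.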

3. Sprinkle in human disfluencies (sparingly)

Short hesitations such as “uh”, “um”, “hmm”, or “ah” can make generated speech feel conversational. Use them intentionally:
  • Add them only where a real speaker would pause (e.g., thinking, switching topics).
  • Combine them with an ellipsis (...) to signal a brief pause.
  • Avoid stacking multiple disfluencies together; clarity drops quickly.
Example:
Um… I can get that report over to you by 4 PM, no problem.
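
If you generate scripts programmatically, a small check can catch stacked disfluencies before they reach the model. This is a minimal sketch, assuming the four hesitation words listed above are the only ones you use; the function name is illustrative and not part of any Spitch tooling.

```python
import re

# Hesitation words named in this guide; extend the tuple for your own scripts.
DISFLUENCIES = ("uh", "um", "hmm", "ah")

_WORD = "|".join(DISFLUENCIES)
# Two or more hesitation words separated only by whitespace or pause punctuation.
_STACKED = re.compile(
    rf"\b(?:{_WORD})\b(?:[\s,.…]*\b(?:{_WORD})\b)+",
    re.IGNORECASE,
)


def find_stacked_disfluencies(script: str) -> list[str]:
    """Return each run of back-to-back hesitation words found in the script."""
    return [match.group(0) for match in _STACKED.finditer(script)]
```

A single hesitation followed by real words returns an empty list, while a run such as "Um, uh... hmm" is reported as one stacked span to rewrite.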

4. Clarify acronyms and abbreviations

Write initialisms in uppercase with a space between the letters so the model pronounces each character instead of guessing a word. Acronyms that are spoken as words, such as NASA in the example below, can stay unspaced.
Example:
We’ll begin the briefing with the G R E overview, followed by NASA’s update.
For long names you can mix the spaced and standard forms: keep the acronym spaced out on first mention, then revert to the standard form later in the script.
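
The first-mention pattern can be applied automatically when scripts come from templates. The helper below is a sketch under the assumption that you maintain your own list of terms to read letter by letter; its name and parameters are illustrative.

```python
import re


def space_first_mention(script: str, initialisms: list[str]) -> str:
    """Space out the first whole-word mention of each listed initialism."""
    for term in initialisms:
        pattern = re.compile(rf"\b{re.escape(term)}\b")
        spaced = " ".join(term.upper())
        # count=1 rewrites only the first mention; later ones keep the standard form.
        script = pattern.sub(lambda _match: spaced, script, count=1)
    return script
```

Terms you leave out of the list, such as acronyms pronounced as words, pass through unchanged.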

5. Keep scripts clean and well-punctuated

  • Use complete sentences and commas where a human would naturally pause.
  • Provide numbers in the format you want spoken (for example, 10,000 vs. ten thousand).
  • Remove stage directions the listener should not hear, or wrap them in brackets and label them as stage guidance.
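
If your drafts keep stage directions in brackets, you may want to strip them just before synthesis so they are never read aloud. The following is a rough sketch, assuming directions are always written in square brackets and are not nested; the function name is hypothetical.

```python
import re

# Matches one [bracketed] direction; nested brackets are not handled.
_BRACKETED = re.compile(r"\[[^\[\]]*\]")


def strip_stage_directions(script: str) -> str:
    """Drop [bracketed] directions and tidy the whitespace they leave behind."""
    cleaned = _BRACKETED.sub(" ", script)
    # Collapse runs of spaces or tabs into a single space.
    cleaned = re.sub(r"[ \t]+", " ", cleaned)
    # Remove a space left stranded before punctuation.
    cleaned = re.sub(r" ([,.!?;:])", r"\1", cleaned)
    return cleaned.strip()
```

For instance, "[soft music] Welcome back, everyone. [pause] Let's begin." becomes "Welcome back, everyone. Let's begin."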

6. Iterate quickly

Small prompt edits yield noticeable differences. When prototyping:
  1. Draft a prompt with intent, script, and any delivery cues.
  2. Generate audio in Studio or via the SDK.
  3. Log what you changed and the result (tone, expressiveness, clarity).
  4. Reuse high-performing patterns across similar experiences.
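
Steps 3 and 4 are easier to follow with a lightweight log. The sketch below appends each trial to a CSV file and ranks trials by summed ratings; the field names, the 1 to 5 rating scale, and the scoring rule are all assumptions chosen for illustration.

```python
import csv
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass
class PromptTrial:
    prompt: str
    voice: str
    change: str          # what was edited relative to the previous trial
    tone: int            # subjective rating, 1 (poor) to 5 (great)
    expressiveness: int
    clarity: int


def log_trial(path: Path, trial: PromptTrial) -> None:
    """Append one trial to a CSV file, writing the header on first use."""
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as handle:
        names = [field.name for field in fields(PromptTrial)]
        writer = csv.DictWriter(handle, fieldnames=names)
        if is_new:
            writer.writeheader()
        writer.writerow(asdict(trial))


def best_trials(path: Path, top: int = 3) -> list[dict]:
    """Return the highest-scoring rows, ranked by the sum of the three ratings."""
    with path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))

    def score(row: dict) -> int:
        return int(row["tone"]) + int(row["expressiveness"]) + int(row["clarity"])

    return sorted(rows, key=score, reverse=True)[:top]
```

Reviewing the top rows from `best_trials` gives you a shortlist of patterns worth reusing in similar experiences.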

Checklist before shipping

Pair this guide with the Speech Generation docs when you are ready to automate prompt creation or run bulk TTS jobs.