Poe API
Gemini-3.1-Flash-TTS
Gemini 3.1 Flash TTS is Google’s most controllable text-to-speech model yet, designed to turn text into natural-sounding audio with precise control over style, tone, pace, and delivery. It uses new Audio Tags to make voices feel more expressive and customizable for narration, assistants, and other voice experiences.
Notes:
- Text and style prompt limited to 4,000 bytes each (8,000 bytes combined)
- Max output duration: approximately 10 minutes
- Multi-speaker requires SpeakerName: text format (example: Alice: Hi! Bob: Hello, must be on new lines)
- The model auto-detects the input language. The Language setting is a hint to help choose the right voice/accent, the model may override it if the text is in a different language.
Expressive Audio Tags:
- Use inline in your text to control delivery
- Emotion/tone: [whispers], [shouts], [laughs], [cries], [sighs], [gasps], [groans], [scoffs], [sarcasm], [deadpan], [cheerful], [sad], [angry], [fearful], [surprised], [disgusted], [confused], [nervous], [bored], [excited], [relieved], [hopeful], [proud], [shy], [sincere], [playful], [serious], [tender], [dramatic], [monotone], [warm], [cold]
- Pace/speed: [slow], [fast], [extremely fast], [extremely slow], [normal pace]
- Pauses: [short pause], [long pause], [pause], [breath]
- Emphasis/delivery: [emphasis], [softly], [loudly], [high pitch], [low pitch], [rising tone], [falling tone]
- Example: "[whispers] I have a secret. [normal pace] But first, let me explain."
This bot supports optional parameters for additional customization.
Powered by a server managed by @empiriolabsai. Learn more
- OFFICIAL
Build with Gemini-3.1-Flash-TTS using the Poe API
Start by creating an API key, for use with any bot on Poe:
See the full documentation for comprehensive guidance on getting started.
More from EmpirioLabs AI
New
MiMo-V2.5
New
MiMo-V2.5-Pro
New
Seed-2.0-Code
New
Seedream-5.0-Lite-EL
New
Qwen3.6-Max-Preview
Gemini-3.1-Flash-TTS
Seedance-2.0-Fast-EL
Seedance-2.0-Pro-EL
DeepSeek-V3.2-EL
Wan-2.7