Poe API
Gemini-2.5-Flash-TTS
Gemini‑2.5‑Flash‑TTS is Google’s low‐latency text‑to‑speech model that converts text input into audio output, supporting both single‑ and multi‑speaker voices with controllable style, accent, and expressive tone — ideal for applications like podcasts, audiobooks, and conversational voice systems.
Notes:
- Text and style prompt limited to 4,000 bytes each (8,000 bytes combined)
- Max output duration: approximately 10 minutes
- Multi-speaker requires SpeakerName: text format (example: Alice: Hi! Bob: Hello, must be on new lines)
- The model auto-detects the input language. The Language setting is a hint to help choose the right voice/accent, the model may override it if the text is in a different language.
This bot supports optional parameters for additional customization.
Powered by a server managed by @empiriolabsai. Learn more
Build with Gemini-2.5-Flash-TTS using the Poe API
Start by creating an API key, for use with any bot on Poe:
See the full documentation for comprehensive guidance on getting started.
More from EmpirioLabs AI
New
Qwen3.7-Plus
New
MiniMax-M3-EL
New
Grok-Imgn-Video-1.5
Qwen3.7-Max
Qwen3.5-4B-EL
Qwen3.5-9B-EL
Gemma-4-26B-A4B-EL
DeepSeek-V4-Pro-E
DeepSeek-V4-Flash-E
DeepSeek-V4-Pro-EL