Poe API

MiMo-V2-Omni

OFFICIAL

This model will be depreciated on May 31, 2026. Please switch to: https://poe.com/MiMo-V2.5 MiMo V2 Omni is Xiaomi’s omni-modal foundation model that natively sees, hears, and understands text, images, audio, and video within a single unified architecture. It combines deep reasoning and web search with agent-like capabilities—such as multi-step planning and tool use—to seamlessly power complex, real-world workflows. Notes: - Context Window: 256K - Max Output: 128K - Supported input modalities: Text, Image, Audio, Video Input Limits - Images: JPEG, PNG, GIF, WebP, BMP (max 10 MB each). Multiple images supported. - Audio: MP3, WAV, FLAC, M4A, OGG (max 100 MB each). Multiple files supported. - Video: MP4, MOV, AVI, WMV (max 300 MB each). Multiple videos supported. This bot supports optional parameters for additional customization.

Build with MiMo-V2-Omni using the Poe API

Start by creating an API key, for use with any bot on Poe:

Generate API key

See the full documentation for comprehensive guidance on getting started.