MiMo-V2-TTS Text-to-Speech: Xiaomi Voice Model Features & API

This guide covers MiMo-V2-TTS text-to-speech: its expressive voice features and its API audio output support.

Core Text-to-Speech Features

Expressive Voice Generation

MiMo-V2-TTS is positioned for expressive audio generation rather than plain utility TTS, with support for emotional delivery and branded voice output.

Dialect and Singing Support

The source material highlights dialect control and singing support, which broadens the range of text-to-speech use cases the model can address beyond standard read-aloud speech.

API Positioning

Direct Audio Output

The MiMo platform documentation in the supplied report states that mimo-v2-tts supports direct audio stream generation through the API.
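
As a rough illustration of what consuming a direct audio stream might look like on the client side, the sketch below writes audio chunks to a sink as they arrive. It does not call the MiMo API: the request format, endpoint, and response encoding are not described in the source, so the stream here is a stand-in list of byte chunks, and the helper name write_audio_stream is hypothetical.

```python
import io
from typing import BinaryIO, Iterable


def write_audio_stream(chunks: Iterable[bytes], sink: BinaryIO) -> int:
    """Write streamed audio chunks to `sink` as they arrive.

    Returns the total number of bytes written. Empty chunks (for example
    keep-alive frames) are skipped.
    """
    total = 0
    for chunk in chunks:
        if not chunk:
            continue
        sink.write(chunk)
        total += len(chunk)
    return total


# Stand-in for a streamed response body. In a real client these chunks would
# come from the HTTP response; that part is an assumption and is not taken
# from the MiMo platform documentation.
fake_stream = [b"RIFF", b"", b"fake-audio-bytes"]

buffer = io.BytesIO()
written = write_audio_stream(fake_stream, buffer)
print(f"wrote {written} bytes")
```

Writing chunks incrementally, rather than buffering the whole response first, lets playback or file output begin before generation finishes, which is the usual reason to prefer a streamed audio response.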

Current Pricing Signal

The report lists MiMo-V2-TTS as free for a limited time, so anyone evaluating MiMo-V2-TTS API cost should expect pricing to change once that period ends.