MiMo-V2-TTS Text-to-Speech: Xiaomi Voice Model Features & API

Core Text-to-Speech Features

Expressive Voice Generation

MiMo-V2-TTS is positioned for expressive audio generation rather than plain utility TTS, with support for emotional delivery and branded voice output.

Dialect and Singing Support

The source material highlights dialect control and singing support, which makes the model relevant to a wider range of text-to-speech searches than standard read-aloud use cases alone.

API Positioning

Direct Audio Output

The MiMo platform documentation in the supplied report states that mimo-v2-tts supports direct audio stream generation through the API.
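As a rough illustration of what consuming a direct audio stream could look like on the client side, the sketch below writes incoming byte chunks to a file-like sink as they arrive. The report does not give the endpoint, the request schema, or the audio format, so the payload fields in `build_tts_request` (`input`, `stream`) and both function names are hypothetical placeholders, not the documented MiMo API. Only the model identifier `mimo-v2-tts` comes from the source.

```python
import io
from typing import BinaryIO, Iterable


def build_tts_request(text: str, model: str = "mimo-v2-tts") -> dict:
    """Build a request body for a streaming TTS call.

    ASSUMPTION: the field names "input" and "stream" are illustrative only;
    check the MiMo platform documentation for the real schema.
    """
    return {"model": model, "input": text, "stream": True}


def save_audio_stream(chunks: Iterable[bytes], sink: BinaryIO) -> int:
    """Write audio chunks to `sink` as they arrive; return total bytes written.

    `chunks` can be any iterable of bytes, for example the chunk iterator
    of a streaming HTTP response. Empty keep-alive chunks are skipped.
    """
    total = 0
    for chunk in chunks:
        if not chunk:
            continue
        sink.write(chunk)
        total += len(chunk)
    return total


# Usage with stand-in data instead of a real network response.
fake_chunks = [b"\x00\x01", b"", b"\x02\x03\x04"]
buffer = io.BytesIO()
written = save_audio_stream(fake_chunks, buffer)
```

Keeping the chunk writer separate from the request builder means the same writer works with whatever HTTP client and response format the real API turns out to use.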

Current Pricing Signal

The report lists MiMo-V2-TTS as free for a limited time, which matters for users searching for MiMo-V2-TTS API cost.

Related Pages

MiMo-V2-TTS Model Page

MiMo-V2 Homepage