
Mistral AI has released Voxtral TTS, its first text-to-speech model, designed for multilingual, emotionally expressive voice generation. The 4-billion-parameter model produces realistic speech in nine languages: English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic. It supports diverse dialects, low latency, and quick adaptation to new voices, making it suitable for enterprise-grade voice agent workflows. Voxtral TTS is available for testing in Mistral Studio and through an API priced at $0.016 per 1,000 characters.
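
To put the quoted price in perspective, here is a minimal sketch of the arithmetic. It assumes billing scales linearly with character count, with no minimum charge, rounding, or volume discount; the article does not confirm those details, and the function name is hypothetical.

```python
from decimal import Decimal

# Price quoted in the article: $0.016 per 1,000 characters.
PRICE_PER_1K_CHARS_USD = Decimal("0.016")


def estimate_tts_cost(num_chars: int) -> Decimal:
    """Estimate the USD cost of synthesizing num_chars characters.

    Assumption: strictly linear pricing (no minimums or rounding rules).
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return PRICE_PER_1K_CHARS_USD * num_chars / 1000


if __name__ == "__main__":
    # Hypothetical script text, used only to get a character count.
    script = "Hello, and thank you for calling. " * 100
    print(f"{len(script)} characters -> ${estimate_tts_cost(len(script))}")
```

Under that assumption, one million characters of input would come to about $16.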

The model emphasizes contextual understanding and speaker modeling, capturing natural speech patterns such as rhythm, pauses, and emotional tone. In comparative human evaluations, Voxtral TTS was rated more natural than ElevenLabs Flash v2.5 at similar latency. It can adapt to a custom voice from as little as three seconds of reference audio and demonstrates zero-shot cross-lingual voice adaptation, producing natural-sounding speech with accent transfer.

Voxtral TTS integrates with Mistral’s broader audio intelligence ecosystem, including Voxtral Transcribe, and is also available as open weights on Hugging Face under a CC BY-NC 4.0 license.

