
Mistral AI has released Voxtral TTS, its first text-to-speech model designed for multilingual, emotionally expressive voice generation. The 4-billion-parameter model delivers realistic speech in nine languages, including English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic. It supports diverse dialects, low latency, and quick adaptation to new voices, making it suitable for enterprise-grade voice agent workflows. Voxtral TTS is available for testing in Mistral Studio and through an API priced at $0.016 per 1,000 characters.
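
To put the quoted API price in perspective, the following is a rough cost sketch in Python. It assumes billing scales linearly with character count at $0.016 per 1,000 characters, with no minimum charge, rounding, or volume tiers; the article does not confirm any of those details, and the function name `estimate_tts_cost` is purely illustrative, not part of any Mistral SDK.

```python
from decimal import Decimal

# Price quoted in the article: $0.016 per 1,000 characters.
PRICE_PER_1K_CHARS_USD = Decimal("0.016")


def estimate_tts_cost(text: str) -> Decimal:
    """Estimate the synthesis cost in USD for a piece of text.

    Assumption: every character (including whitespace) is billed and the
    price scales linearly. Actual billing rules may differ.
    """
    return Decimal(len(text)) / Decimal(1000) * PRICE_PER_1K_CHARS_USD


if __name__ == "__main__":
    script = "Hello, and welcome to our support line. " * 100
    print(f"{len(script)} characters -> ${estimate_tts_cost(script)}")
```

Under these assumptions, 1,000 characters cost $0.016 and 250,000 characters cost $4.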

The model emphasizes contextual understanding and speaker modeling, capturing natural speech patterns such as rhythm, pauses, and emotional tone. In comparative human evaluations, Voxtral TTS was rated more natural than ElevenLabs Flash v2.5 at similar latency. It can adapt to a custom voice from as little as three seconds of reference audio and demonstrates zero-shot cross-lingual voice adaptation, producing natural-sounding speech with accent transfer.

Voxtral TTS integrates with Mistral’s broader audio intelligence ecosystem, including Voxtral Transcribe, and is also available as open weights on Hugging Face under a CC BY-NC 4.0 license.

News Source

mistral.ai 29 Mar 26

Speaking of Voxtral | Mistral AI

Voxtral Text-to-Speech: Today we’re releasing Voxtral TTS, our first text-to-speech model with state-of-the-art performance in multilingual voice generation. The model is lightweight at 4B parameters, making Voxtral-powered agen
