The French artificial intelligence startup Mistral AI has introduced an open speech recognition model named Voxtral, entering the competitive audio market.
"Introducing the world's best (and open) speech recognition models!" Mistral AI (@MistralAI) wrote on July 15, 2025.
The tool is designed for business use and aims to integrate into production processes. It is positioned as a solution for creating truly practical speech intelligence.
In other words, developers are no longer expected to choose between:
- a cheap and open system that transcribes poorly and fails to understand speech;
- a well-functioning but closed and more expensive model.
The company claims that Voxtral is an affordable alternative, costing "less than half" the price of comparable solutions.
The model can transcribe up to 30 minutes of audio and understand up to 40 minutes. Users can ask questions about the content, generate summaries, or turn voice commands into actions in real time, such as calling an API or launching functions.
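For recordings longer than the 30-minute transcription limit, one option is to split the audio into segments before sending them off. The Python sketch below only plans the segment boundaries. The function name `plan_chunks` and its default limit are illustrative assumptions based on the figure above, not part of any Mistral API, and the code makes no calls to Voxtral itself.

```python
def plan_chunks(total_seconds: float,
                max_chunk_seconds: float = 30 * 60) -> list[tuple[float, float]]:
    """Return (start, end) offsets in seconds covering the whole recording.

    Hypothetical helper: the 30-minute default mirrors the transcription
    limit quoted in the article and is an assumption, not an API constant.
    """
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    if max_chunk_seconds <= 0:
        raise ValueError("max_chunk_seconds must be positive")

    total = float(total_seconds)
    chunks: list[tuple[float, float]] = []
    start = 0.0
    while start < total:
        # Each segment is at most max_chunk_seconds long; the last may be shorter.
        end = min(start + float(max_chunk_seconds), total)
        chunks.append((start, end))
        start = end
    return chunks


if __name__ == "__main__":
    # A 75-minute recording split under the assumed 30-minute limit.
    for start, end in plan_chunks(75 * 60):
        print(f"{start / 60:.0f} min to {end / 60:.0f} min")
```

A 75-minute recording would be planned as two 30-minute segments plus one 15-minute segment. Cutting at fixed offsets can split a word in half, so a production pipeline would probably cut at pauses instead.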
Voxtral supports multiple languages, including English, Spanish, French, Portuguese, Hindi, German, Dutch, and Italian.
The company offers two versions of the neural network:
- Voxtral Small — contains 24 billion parameters and is intended for production-scale deployment;
- Voxtral Mini — has 3 billion parameters and is suitable for local deployments.
Additionally, there is an ultra-cheap, simplified, and fast version called Voxtral Mini Transcribe, optimized solely for transcription.
Voxtral can be tested for free on Hugging Face or in the Le Chat chatbot. API access starts at $0.001 per minute of audio.
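To put the starting price in perspective, the short Python sketch below estimates the cost of a given amount of audio. It assumes the quoted floor of $0.001 per minute applies uniformly and that billing is proportional to duration. Actual rates may differ by model, and the billing granularity is not stated in the announcement. The function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Starting price quoted in the article; real per-model rates may be higher.
PRICE_PER_MINUTE_USD = Decimal("0.001")


def estimate_cost_usd(audio_minutes: float,
                      price_per_minute: Decimal = PRICE_PER_MINUTE_USD) -> Decimal:
    """Estimate cost as minutes times price (assumes proportional billing)."""
    minutes = Decimal(str(audio_minutes))
    if minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return minutes * price_per_minute


if __name__ == "__main__":
    for minutes in (30, 1_000, 60_000):
        print(f"{minutes} min: ${estimate_cost_usd(minutes)}")
```

Under these assumptions, 1,000 minutes of audio (roughly 16.7 hours) would cost about $1.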
Mistral AI is considered a flagship AI startup in Europe, capable of competing with American and Chinese firms. In February, it released a mobile application for iOS and Android.
Earlier, the company announced plans for an IPO.
