
OpenAI Unveils Alpha Version of Advanced Voice Assistant
OpenAI has launched an alpha version of Advanced Voice Mode, powered by GPT-4o, for a select group of ChatGPT Plus users. The feature is expected to reach all Plus subscribers in the autumn.
We’re starting to roll out advanced Voice Mode to a small group of ChatGPT Plus users. Advanced Voice Mode offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions. pic.twitter.com/64O94EhhXK
— OpenAI (@OpenAI) July 30, 2024
Alpha participants will receive an in-app notification and an email with instructions for using the new mode. They can talk to ChatGPT by voice, receive responses in real time, and interrupt the AI while it is speaking.
In May, OpenAI introduced GPT-4o, its latest chatbot model, and announced a dedicated Voice Mode for spoken conversation. The alpha launch was initially planned for the end of June but was postponed by a month.
Some capabilities demonstrated in May, such as screen sharing and video support, are not included in the alpha version. These features will be introduced later.
Advanced Voice Mode
The standard voice mode of ChatGPT uses three separate models:
- one for converting voice to text;
- another for processing the request;
- a third for converting text to voice.
The new mode differs in that GPT-4o is multimodal and handles voice input and output itself, without auxiliary models, which reduces delay in conversation. According to OpenAI, the chatbot can also detect emotional tones in a speaker's voice, such as sadness or excitement.
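The difference between the two arrangements can be illustrated with a minimal sketch. Everything in it is hypothetical: the stage functions are string-returning placeholders rather than real OpenAI models or API calls, and `run_pipeline` simply counts how many separate models a request passes through. It is meant only to show the structure described above, not how ChatGPT is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    """One model in the voice pipeline (hypothetical placeholder)."""
    name: str
    run: Callable[[str], str]


def run_pipeline(stages: List[Stage], payload: str) -> Tuple[str, int]:
    """Pass the payload through each stage in order and count the hops."""
    hops = 0
    for stage in stages:
        payload = stage.run(payload)
        hops += 1
    return payload, hops


# Placeholder stages: each just wraps its input so the path stays visible.
def speech_to_text(audio: str) -> str:
    return f"text({audio})"


def answer_request(text: str) -> str:
    return f"reply({text})"


def text_to_speech(text: str) -> str:
    return f"audio({text})"


def end_to_end(audio: str) -> str:
    # A single multimodal model takes audio in and produces audio out.
    return f"audio(reply({audio}))"


# Standard voice mode: three separate models chained together.
CASCADED = [
    Stage("voice to text", speech_to_text),
    Stage("process request", answer_request),
    Stage("text to voice", text_to_speech),
]

# Advanced voice mode: one multimodal model, no auxiliary models.
END_TO_END = [Stage("multimodal model", end_to_end)]

if __name__ == "__main__":
    for label, stages in (("cascaded", CASCADED), ("end-to-end", END_TO_END)):
        output, hops = run_pipeline(stages, "hello")
        print(f"{label}: {hops} model hop(s) -> {output}")
```

In the cascaded version every hop adds its own processing time, and the intermediate text step discards vocal cues such as tone, which is consistent with the lower delay and emotion detection attributed to the single-model approach.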
ChatGPT can communicate using four voices recorded in collaboration with voice actors. It will not mimic other people’s speech. Additionally, filters have been added to reject certain requests for creating music or other forms of copyrighted content.
In July, OpenAI announced that it was testing SearchGPT, a set of new AI-based search features.
In the same month, media reported on Sam Altman’s company working on a new technology called Strawberry, which “will significantly enhance the reasoning process of AI models and enable them to plan actions ahead.”