© Primakov/Shutterstock.com
Six months ago, OpenAI made a big splash by introducing its new GPT-4o model along with a new advanced voice mode. This feature leverages GPT-4o’s ability to understand audio and images directly, without going through intermediary models, to offer an interface that lets you chat with the AI in a fluid, natural way, as if you were talking with a human. Today, this advanced voice mode is already available in ChatGPT, including for French users. But OpenAI is preparing an even more impressive feature: live video.
An AI that listens to you, and can see
As the demo shows, it will be possible to chat with ChatGPT while the camera is on, and GPT-4o is able to understand what you show it in the video. It’s like making a video call with a real person. In other words, the AI will be able not only to hear you, but also to see you.
And according to our colleagues at Android Authority, some Internet users have already had the chance to test this new feature on an alpha version.
The beta version is coming soon
OpenAI could also soon offer this feature to more users. By digging into a beta of the ChatGPT application, Android Authority reportedly discovered elements suggesting that a beta launch of this feature is in preparation. As a reminder, moving from alpha to beta means that the feature is getting closer to its final version and that the developer is ready to test it with a larger number of people. In the beta test, OpenAI would reportedly call this feature “Live camera”. The beta also reportedly includes a warning asking users not to rely on ChatGPT's vision for navigation or for any other decision that affects their health or safety.
Competition is increasingly tough in the field of AI
While OpenAI popularized generative AI, it now faces competition from other AI labs, such as Anthropic and the French startup Mistral, as well as Google. In its Gemini application, Google already offers a mode called Gemini Live that competes with the advanced voice mode. Google also recently launched a Gemini application for iPhone that provides access to Gemini Live.
- In May, OpenAI introduced ChatGPT’s advanced voice mode, which lets you chat with the AI as if you were talking to a human
- This feature is already available, but the company is working on another one that will add live video to these interactions
- According to Android Authority, this vision feature in ChatGPT is already available in alpha and could soon be offered in beta