Although we cannot yet speak of artificial general intelligence (AGI), we now have access to very powerful language models that can solve complex problems or generate computer code. The next major development in artificial intelligence will be agents that operate a computer to carry out recurring tasks on the user's behalf.
In 2024, the startup Anthropic presented its Computer Use technology, while Google unveiled Project Mariner. This week, it is OpenAI’s turn to present its agent, called “Operator.” Thanks to this agent, ChatGPT no longer just responds to prompts, but can also perform actions, like a real assistant. According to OpenAI’s explanations, Operator “processes raw pixel data to understand what’s happening on the screen and uses a virtual mouse and keyboard to perform actions.”
In OpenAI’s demonstration, an employee asks ChatGPT to find a recipe, then add the ingredients to a cart on Instacart (an online grocery shopping service). After finding the recipe, the AI gets to work, performing the assigned task in an embedded browser. With Operator, ChatGPT can navigate the Instacart site as a human would (moving the cursor and typing) to place the order.
A preview is already available, but not in Europe
This new feature is based on a model that combines GPT-4o’s vision with new reasoning capabilities. OpenAI also had to train its AI to interact with the elements of graphical web interfaces, such as buttons, menus, or text fields, in the same way that humans do. As a result, the feature does not depend on the APIs of operating systems or websites.
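The observe-then-act cycle described above can be pictured with a small toy sketch. Nothing here reflects OpenAI's actual implementation: `ToyBrowser`, `Action`, `toy_policy`, and `run_agent` are hypothetical names, a text string stands in for the raw pixels, and hand-written rules stand in for the vision and reasoning model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One GUI action (hypothetical structure, for illustration only)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class ToyBrowser:
    """Stand-in for an embedded browser with one search box and one button."""
    focused: bool = False
    query: str = ""
    submitted: bool = False
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would receive raw pixels; a string keeps the toy small.
        return f"focused={self.focused};query={self.query};submitted={self.submitted}"

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x},{y})")
        if (x, y) == (10, 10):      # assumed position of the search box
            self.focused = True
        elif (x, y) == (50, 10):    # assumed position of the search button
            self.submitted = True

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")
        if self.focused:
            self.query += text


def toy_policy(observation: str, goal: str) -> Action:
    """Hand-written rules standing in for the vision + reasoning model."""
    if "submitted=True" in observation:
        return Action("done")
    if "focused=False" in observation:
        return Action("click", 10, 10)
    if f"query={goal};" not in observation:
        return Action("type", text=goal)
    return Action("click", 50, 10)


def run_agent(browser: ToyBrowser,
              policy: Callable[[str, str], Action],
              goal: str,
              max_steps: int = 10) -> bool:
    """Loop: look at the screen, pick an action, apply it with mouse/keyboard."""
    for _ in range(max_steps):
        action = policy(browser.screenshot(), goal)
        if action.kind == "done":
            return True
        if action.kind == "click":
            browser.click(action.x, action.y)
        elif action.kind == "type":
            browser.type_text(action.text)
    return False  # gave up after max_steps


browser = ToyBrowser()
finished = run_agent(browser, toy_policy, "lasagna recipe")
```

The point of the sketch is that the loop only ever sees the screen and only ever emits clicks and keystrokes, which is why such an agent needs no site-specific API.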
For now, however, this is still just a preview of Operator, which should gradually improve. The preview is reserved for ChatGPT Pro users (OpenAI's most expensive subscription) and is only available in the United States. In a live presentation, Sam Altman said that OpenAI will make this new technology more efficient and less expensive. He also said that the feature should soon arrive in other countries. The rollout in Europe, however, will take more time.
Other agents are coming
OpenAI's goal is to offer a tool that saves the user time by performing certain repetitive tasks in the user's place. However, the AI will refuse to perform certain risky actions, such as financial transactions. It will also ask the user for confirmation before finalizing a task, such as placing an order or sending an email. And it will require the user's continuous supervision when performing actions on sensitive sites, such as email services.
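As a rough illustration of these three safeguards, here is a minimal sketch of a check that could run before each step. It is purely hypothetical: the action names, the site name, the `Decision` values, and the order in which the rules are applied are all assumptions, not OpenAI's actual rules.

```python
from enum import Enum


class Decision(Enum):
    REFUSE = "refuse"        # the agent will not do it at all
    CONFIRM = "confirm"      # the user must approve before the final step
    SUPERVISE = "supervise"  # the user must watch while the agent works
    PROCEED = "proceed"      # no special safeguard applies


# Hypothetical categories, chosen only to mirror the examples in the article.
RISKY_ACTIONS = {"bank_transfer"}
FINALIZING_ACTIONS = {"place_order", "send_email"}
SENSITIVE_SITES = {"mail.example.com"}


def gate(action: str, site: str) -> Decision:
    """Return the safeguard that applies to one action on one site."""
    if action in RISKY_ACTIONS:
        return Decision.REFUSE
    if action in FINALIZING_ACTIONS:
        return Decision.CONFIRM
    if site in SENSITIVE_SITES:
        return Decision.SUPERVISE
    return Decision.PROCEED


checks = [
    gate("bank_transfer", "bank.example.com"),
    gate("place_order", "shop.example.com"),
    gate("read_inbox", "mail.example.com"),
    gate("search", "shop.example.com"),
]
```

In this sketch a refusal takes priority over a confirmation, and a confirmation over supervision; a real system might well combine them differently.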
In any case, this is only the beginning, since OpenAI already plans to launch other agents in the coming weeks or months. And Operator will later be offered to users of the ChatGPT Plus subscription.
- OpenAI, the creator of ChatGPT, unveils its first AI agent called Operator
- It can take control of the pointer and keyboard in an embedded browser to perform tasks on the web on the user's behalf
- This feature is currently available in preview for ChatGPT Pro subscribers in the United States