While companies like Google and Microsoft make announcement after announcement related to AI, Apple has remained notably quiet on the subject. The Cupertino company has simply stated that it is working on generative AI and will make an announcement this year. Behind the scenes, however, Apple is working hard on AI: a group of the company's researchers recently published a scientific article on arXiv describing a new approach that could change the way we interact with generative artificial intelligence.
In recent years, large language models such as GPT-4 and Google's Gemini have demonstrated impressive capabilities. Yet according to the Apple article, this technology is still under-exploited for processing non-conversational information, such as the elements displayed on a device's screen or tasks running in the background. For interactions with an AI to feel natural, it must be able to respond to the user while taking context into account and resolving ambiguous references.
To address this problem, Apple developed a model called ReALM, for Reference Resolution As Language Modeling, with a new approach. In essence, it converts non-conversational elements, such as the items on the user's screen, into textual data that a language model can process. The result: a user can interact with an assistant that understands what is on their screen. In an example provided by Apple, the user asks for a list of nearby pharmacies. Once the list is displayed, they can then ask the assistant to call a specific entry, or simply to call "the one at the bottom". Thanks to Apple's approach, the AI knows which elements are present on the screen, as well as their positions.
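To make the idea concrete, here is a minimal sketch of what converting on-screen elements into model-readable text could look like. This is an illustrative assumption, not Apple's actual implementation: the `ScreenElement` type, the normalized coordinates, and the numbered-list rendering are all hypothetical choices, meant only to show how positional ordering lets a language model resolve a reference like "the one at the bottom".

```python
from dataclasses import dataclass

@dataclass
class ScreenElement:
    text: str
    x: float  # left edge of the element, normalized to 0..1
    y: float  # top edge of the element, normalized to 0..1

def screen_to_text(elements):
    """Render on-screen elements as a numbered plain-text list,
    ordered top-to-bottom then left-to-right, so a language model
    can resolve spatial references such as 'the one at the bottom'."""
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    return "\n".join(
        f"[{i}] {e.text}" for i, e in enumerate(ordered, start=1)
    )

# Hypothetical pharmacy list, mimicking the example in the article
pharmacies = [
    ScreenElement("Central Pharmacy - 0.3 mi", x=0.1, y=0.2),
    ScreenElement("Main St Pharmacy - 0.7 mi", x=0.1, y=0.5),
    ScreenElement("Night Pharmacy - 1.1 mi", x=0.1, y=0.8),
]
print(screen_to_text(pharmacies))
```

Because the rendering preserves vertical order, "the one at the bottom" unambiguously maps to the last numbered entry, which a text-only model can match without ever seeing pixels.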
Thanks to this new approach, Apple claims better performance than GPT-4, even though OpenAI's model can accept screenshots in its prompts. "We demonstrate large improvements over an existing system with similar functionality across different types of references, with our smallest model obtaining absolute gains of over 5% for on-screen references. We also benchmark against GPT-3.5 and GPT-4, with our smallest model achieving performance comparable to that of GPT-4, and our larger models substantially outperforming it," reads the Apple publication.
In addition, ReALM has significantly fewer parameters than OpenAI's most recent model. As a result, Apple believes that, without "compromising performance", its model is the ideal choice for "a practical reference resolution system that can exist on device". This work naturally raises hopes of a new version of Siri that would be smarter, capable of understanding ambiguous references to elements on the screen or to applications running in the background. To find out what is actually coming to Siri and iOS 18, including the features related to artificial intelligence, you will have to wait for the WWDC conference in June.