While companies like Google and Microsoft are making a stream of AI-related announcements, Apple remains very discreet on the subject. The Cupertino company has simply stated that it is working on generative AI and that it will make an announcement this year. Behind the scenes, however, Apple is working hard on AI. Recently, a group of the company's researchers published a scientific paper on arXiv describing a new approach that could revolutionize the way we interact with generative artificial intelligence.
In recent years, large language models such as GPT-4 or Google's Gemini have demonstrated impressive capabilities. According to the Apple paper, however, this technology remains under-exploited for processing non-conversational information, such as elements displayed on a device's screen or tasks running in the background. Yet for interactions with an AI to feel natural, it must be able to respond to the user while taking context into account and resolving ambiguous references.
To solve this problem, Apple developed a model called ReALM (Reference Resolution As Language Modeling), which takes a completely new approach. In essence, it converts non-conversational elements, such as the items on the user's screen, into textual data that a language model can process. The result: a user can interact with an assistant that understands what is displayed on their screen. In an example provided by Apple, the user asks for a list of nearby pharmacies. Once the list is displayed, the user can then ask the assistant to call a specific entry, or simply to call "the one at the bottom". Thanks to Apple's approach, the AI knows what the different elements on the screen are, as well as their positions.
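To give a rough idea of the principle, here is a minimal sketch of how on-screen elements could be serialized into plain text so that a text-only model can resolve a phrase like "the one at the bottom". The data structure, field names, and prompt format below are hypothetical illustrations, not Apple's actual encoding.

```python
# Sketch of ReALM-style reference resolution: serialize on-screen UI
# elements into plain text, ordered by position, so a text-only language
# model can resolve spatial references. All names and the prompt format
# are hypothetical, not Apple's actual implementation.

from dataclasses import dataclass

@dataclass
class ScreenElement:
    label: str   # visible text, e.g. a pharmacy name and phone number
    x: int       # horizontal position on screen
    y: int       # vertical position (larger y = lower on screen)

def serialize_screen(elements: list[ScreenElement]) -> str:
    """Turn on-screen elements into a textual context block,
    sorted top-to-bottom so spatial phrases map onto list order."""
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    lines = [f"[{i}] {e.label}" for i, e in enumerate(ordered)]
    return "On-screen entities:\n" + "\n".join(lines)

elements = [
    ScreenElement("City Pharmacy - 555-0101", x=10, y=100),
    ScreenElement("Main St Pharmacy - 555-0102", x=10, y=200),
    ScreenElement("Night Pharmacy - 555-0103", x=10, y=300),
]

prompt = (
    serialize_screen(elements)
    + "\nUser: call the one at the bottom"
    + "\nWhich entity index does the user mean?"
)
print(prompt)
# A text-only model can now answer "[2]" from list order alone,
# without ever seeing pixels.
```

Sorting by vertical position means that spatial references map directly onto list order, which is, in spirit, the kind of textual reconstruction of the screen the paper describes.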
More suitable than ChatGPT
In any case, thanks to this new approach, Apple claims better performance than GPT-4, OpenAI's model, even though the latter can accept screenshots in its prompts. "We demonstrate large improvements over an existing system with similar functionality across different types of references, with our smallest model obtaining absolute gains of over 5% for on-screen references. We also benchmark against GPT-3.5 and GPT-4, with our smallest model achieving performance comparable to that of GPT-4, and our larger models substantially outperforming it," reads the Apple publication.
In addition, ReALM has significantly fewer parameters than OpenAI's most recent model. As a result, Apple argues that, without "compromising performance", its model is the ideal choice for "a practical reference resolution system that can exist on the device". Naturally, this work fuels hopes of a new version of Siri that would be smarter and capable of understanding ambiguous references to elements on the screen or to applications running in the background. To find out what's new in Siri and iOS 18, however, including the features related to artificial intelligence, we will have to wait for the WWDC conference in June.
- Apple has not yet introduced products based on generative AI, but it is working on this technology behind the scenes
- In a scientific paper, Apple researchers present a model called ReALM, which they claim holds a significant advantage over models such as GPT-4
- According to the paper, the model better understands non-conversational elements, such as objects on the screen or activities running in the background on a device