Google’s Gemini AI models can already answer our questions, help us get organized, write documents, and even code applications. But in the not-so-distant future, Gemini could also… drive vehicles. At least, that is the new avenue being explored by Waymo, the subsidiary of Alphabet (Google’s parent company) specializing in autonomous vehicles and robotaxis.
Today, Waymo is the leader in its field. The Alphabet subsidiary already runs a robotaxi service that competes with Uber in a few American cities and carries out more than 150,000 trips per week. And while Waymo is happy with the technologies it currently uses, it is now exploring whether Gemini’s intelligence could improve its autonomous vehicles.
In a recently published scientific paper, Waymo describes a new technology called End-to-End Multimodal Model for Autonomous Driving (EMMA). “Powered by Gemini, a large multimodal language model developed by Google, EMMA uses a unified end-to-end trained model to generate future autonomous vehicle trajectories directly from sensor data. Trained and optimized specifically for autonomous driving, EMMA leverages Gemini’s vast global knowledge to better understand complex scenarios on the road,” Waymo’s statement reads.
Waymo’s current approach relies on multiple independent modules to perform the various tasks of autonomous driving. The advantage of this system is that each module can be debugged and optimized separately. However, it scales poorly, and because it is optimized for targeted scenarios, it would have difficulty adapting to new environments.
The use of multimodal large language models (MLLMs, which understand both text and images) could solve this scalability problem. “Indeed, MLLMs, as general-purpose foundation models, excel in two key areas: (1) they are trained on large, internet-scale datasets that provide rich ‘world knowledge’ beyond what is contained in common driving logs, and (2) they demonstrate superior reasoning capabilities through techniques such as chain-of-thought reasoning,” Waymo’s paper reads.
But for now, while the potential for using generative AI in self-driving cars is huge, Waymo believes there are still significant challenges ahead. For example, EMMA is still limited in its ability to process video. Additionally, it only understands images, not data from more complex sensors such as LiDAR.
“While EMMA is showing promising results, it is still in its early stages with challenges and limitations in onboard deployment, spatial reasoning capability, interpretability, and closed-loop simulation. Despite this, we believe our EMMA findings will inspire further research and advancements in this field,” the Waymo paper says.