© mayam_studio/Shutterstock.com
Ircam Amplify, a French startup specializing in audio technologies, has just launched the AI Speech Detector. The tool can identify voice content generated by artificial intelligence (AI) with a 98% accuracy rate. It arrives at an opportune moment, as experts grow alarmed by the upsurge in audio deepfakes.
Ircam Amplify is a jewel of French industry. The company is a direct descendant of the Institute for Research and Coordination in Acoustics/Music (Ircam), founded in 1977 by Pierre Boulez. It draws on years of research and development in the field of sound to offer unique solutions on the market.
For example, large companies have called on its expertise to carry out unique marketing operations, such as offering tastings of the sound of champagne or reproducing the scent of a perfume in sound.
Given Ircam Amplify’s expertise, it was obvious that the firm would be ready to meet the challenges of generative AI. A few months ago, it launched a tool called AI Music Detector, to allow music industry players to identify AI-generated tracks by analyzing and detecting information within the files.
“You send us a track and we are able to tell if it was generated by AI and, as a bonus, with a confidence score. For example, 99% or 65%,” explains Romain Simiand, product director at Ircam Amplify, in an interview with Presse-citron. The company is “the first in the world” to offer such a solution, he says, and it relies on technology developed in-house. The AI Music Detector is a necessity for distributors and labels who want to protect themselves from fake artists, not least because it saves them considerable time.
Building on the effectiveness of this product, Ircam Amplify has adapted and refined it so that it can also detect voices generated by artificial intelligence. This version, called AI Speech Detector, responds to “the need to detect deepfakes or voice clones, anything that corresponds to the use of voice for fraudulent reasons,” explains the product director.
Put simply, the technology can recognize a voice that has been created with generation software. Ircam Amplify selected the most popular models on the market and trained its detector on them. “There are hundreds of voice clone and speech synthesis models. However, most of them are based on the same open source models. Developers adjust them slightly or retrain them a bit, but overall, the basic principles remain the same,” Romain Simiand explains.
The company selected three proprietary models, including the controversial ElevenLabs, which was used to create an audio deepfake of Joe Biden, as well as three open source solutions. These are the most “accessible tools for malicious people or those who just want to have fun with them,” continues the Fine Arts graduate. If a voice file is generated with one of these solutions, the AI Speech Detector can identify it with an accuracy rate of 98.5%.
© Ircam Amplify
Tools of this kind are in growing demand as generative AI becomes more capable and is exploited by cybercriminals to pull off all sorts of scams: imitating the voice of a company executive to authorize fraudulent transactions, fabricating compromising recordings for blackmail, simulating emergency calls or testimonies to support false claims, and so on.
Romain Simiand illustrates the practical utility of the AI Speech Detector by highlighting its ability to automate the processing of audio files: “In a call center or an editorial office, the system can be programmed to automatically transfer a file deemed authentic to a journalist or the editorial office. Conversely, if the tool is 99% sure that it is AI-generated content, the file can be directly discarded, which avoids any loss of time.”
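The routing Simiand describes can be pictured as a simple threshold rule on the detector's confidence score. The following is a minimal sketch and not Ircam Amplify's implementation: the `DetectionResult` type, the `triage` function and the "review" route are hypothetical, and only the 99% discard threshold comes from the interview. The 10% threshold for forwarding a file as authentic is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    FORWARD = "forward"   # deemed authentic, passed on to a journalist or agent
    REVIEW = "review"     # inconclusive score, left for a human to decide
    DISCARD = "discard"   # detector is almost certain the voice is AI-generated


@dataclass(frozen=True)
class DetectionResult:
    """Hypothetical detector output; not the real Ircam Amplify response format."""
    filename: str
    ai_confidence: float  # 0.0 = certainly human, 1.0 = certainly AI-generated


def triage(result: DetectionResult,
           discard_at: float = 0.99,      # "99% sure" figure quoted in the article
           forward_below: float = 0.10    # assumed cut-off for "deemed authentic"
           ) -> Route:
    """Map a confidence score to a routing decision."""
    if not 0.0 <= result.ai_confidence <= 1.0:
        raise ValueError("ai_confidence must be between 0.0 and 1.0")
    if result.ai_confidence >= discard_at:
        return Route.DISCARD
    if result.ai_confidence < forward_below:
        return Route.FORWARD
    return Route.REVIEW


if __name__ == "__main__":
    for name, score in [("call_001.wav", 0.995), ("call_002.wav", 0.03), ("call_003.wav", 0.65)]:
        print(name, triage(DetectionResult(name, score)).value)
```

Keeping a middle "review" route is a design choice of this sketch: a score such as the 65% mentioned earlier is neither clearly authentic nor clearly synthetic, so it is left to a person instead of being discarded automatically.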
Its launch on the market also addresses public safety issues and the fight against disinformation and digital identity theft.
Since October 15, the tool has been available via the Ircam Amplify API. Described as an Audio-as-a-service solution, the API brings together audio products developed by researchers, ranging from analysis to generation through generative AI. “Any manufacturer or client who is serious about their audio needs can connect to us once and consume what we offer,” says the product director.
One small drawback for now: the AI Speech Detector cannot detect audio deepfakes in real time, although it claims a commendable detection time of around fifteen seconds. And the startup does not intend to stop there. It is now working on refining its product.
“Our technology will make it possible to break down a piece of music into different components: voice, guitar, drums, brass, etc. Thanks to this functionality, we will be able to analyze each element separately, thus determining whether the voice was generated artificially, while verifying the authenticity of the other instruments,” says Romain Simiand.
In the future, the AI Speech Detector may well evolve to detect new voice generation models, reinforcing its versatility. Designed with international ambition, the tool has the potential to become a global reference in the fight against audio deepfakes.