
Artificial intelligence, enemy number 1 of your privacy?

© Image generated by DALL-E AI for Presse-Citron

AI and machine learning have revolutionized extremely varied fields: IT, finance, medical research, machine translation, and more. The list grows from month to month, and this is just the beginning. However, this progress comes with a question that arises quite frequently: what is the impact of these technologies on our private lives and the confidentiality of our data? Indeed, whatever the AI model in question, it is developed by feeding on a gargantuan quantity of data, some of which may be quite sensitive.

When the AI memorizes your secrets

One of the biggest challenges facing companies that train AI is the ability of these technologies to learn and remember complex patterns from their training data. This characteristic, although beneficial for improving the accuracy of models (preventing them from hallucinating, for example), nevertheless represents a real risk for privacy.

Indeed, machine learning models (algorithms or systems that allow artificial intelligence to learn from data), which can have billions of parameters, like GPT-3 and its 175 billion, use these vast datasets in order to minimize prediction errors. This is where the problem lies: by adjusting their parameters, they may unintentionally retain certain specific pieces of information, including sensitive data.

For example, if models are trained on medical or genomic data, they can memorize private information that can later be extracted through targeted queries, thus endangering the confidentiality of the people concerned. If a hack or accidental data leak were to occur at the organization that holds these models, this information could end up in the hands of malicious actors.

AI and the prediction of sensitive information

AI models can also use seemingly innocuous data to infer sensitive information. A striking example is that of Target (an American supermarket chain), which succeeded in predicting pregnancies by analyzing its customers' purchasing habits. By cross-referencing data such as purchases of food supplements or unscented lotions, the model was able to identify potentially pregnant customers and send them targeted advertisements. This case demonstrates that even seemingly mundane data can reveal extremely personal aspects of private life.

Despite efforts to limit data memorization, most current methods have proven ineffective. There is, however, one technique currently considered the most promising for guaranteeing a degree of confidentiality when training models: differential privacy. But it is no miracle cure, as you will see.

Differential privacy: an imperfect solution?

To explain simply what differential privacy is, let's take a simple example. Say you participate in a survey, but you do not want anyone to know about your participation or your responses. Differential privacy adds some "noise", or randomness, to the survey data, so that even someone with access to the results cannot know for sure what your answers were. It thus protects individual responses while still allowing the data to be analyzed.
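
As a rough illustration (a minimal sketch of the textbook Laplace mechanism, not the method used by any particular company), here is how a survey count could be released with noise calibrated to a privacy budget epsilon, so that any single respondent's answer has only a limited effect on the published figure:

```python
import numpy as np

def noisy_count(true_count: int, epsilon: float) -> float:
    """Release a count with Laplace noise (sensitivity 1: adding or removing
    one person changes the count by at most 1). Smaller epsilon means
    stronger privacy but more noise."""
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical survey: 412 respondents answered "yes" to a sensitive question.
true_yes = 412
print(noisy_count(true_yes, epsilon=0.5))  # e.g. ~405-419, varies per run
print(noisy_count(true_yes, epsilon=5.0))  # much closer to 412 on average
```

The published number stays useful for aggregate analysis, but no one can tell from it whether your individual "yes" is in the count or not.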

This method has been adopted by giants of the sector, such as Apple and Google. However, even with this protection, artificial intelligence models can still draw conclusions or make predictions about personal or private information. To prevent such breaches, the only solution is to add the noise to each person's data before it is even transmitted to the organization, an approach known as local differential privacy.
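
To make "local" concrete, here is a minimal sketch using the classic randomized-response trick (an illustration of the general idea, not Apple's or Google's actual implementation): each user flips coins on their own device and sometimes reports a random answer instead of the true one, so the collector never sees a trustworthy individual record, yet the overall proportion can still be estimated.

```python
import random

def randomized_response(true_answer: bool, p_truth: float = 0.75) -> bool:
    """Runs on the user's device: report the true answer with probability
    p_truth, otherwise report a uniformly random answer."""
    if random.random() < p_truth:
        return true_answer
    return random.random() < 0.5

def estimate_true_rate(reports: list[bool], p_truth: float = 0.75) -> float:
    """Invert the noise: observed_rate = p_truth * true_rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth

# Simulate 10,000 users, 30% of whom truly answer "yes".
answers = [random.random() < 0.30 for _ in range(10_000)]
reports = [randomized_response(a) for a in answers]
print(estimate_true_rate(reports))  # close to 0.30, without trusting any single report
```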

Despite its advantages, differential privacy nevertheless has certain limits. Obviously, it was too good to be true. Its main disadvantage is that it can cause a fairly significant drop in the performance of machine-learning methods. The consequences: models can be less accurate and produce erroneous information, and they are much more time-consuming and expensive to train.

There is therefore a trade-off to be found between obtaining satisfactory results on the one hand and sufficiently protecting individuals' privacy on the other. It is a very delicate balance that will be essential to strike, and above all to maintain, as AI continues to expand. While AI can help you in your daily life, whether for professional, personal or academic use, do not consider it an ally of your privacy, far from it.

  • AI models, during their training, can retain sensitive information.
  • From innocuous data, they are even capable of deducing conclusions that compromise privacy.
  • One method, differential privacy, is used to limit this phenomenon, but it is far from perfect.




By Teilor Stone
