Do you use chatbots and generative AI tools? If so, your data may be used to tweak, refine and otherwise train the technology. And even if you don’t, you might find your data becoming a training ground for a technology that people consider as world-changing as the internet has been.
Artificial intelligence needs data, and lots of it. Luckily for all the companies that are working on this technology, all we seem to do is generate data, providing them with a rich source of material to make their systems even smarter.
Unluckily for them, not everyone is keen on having posts, photos and other data sucked into the AI machine and used as fodder. And for EU citizens and users, there are stringent privacy protections available that force companies to give us the chance to opt out of having our data used in this way.
So how do we do it? It depends on the service.
Meta AI
Facebook and Instagram owner Meta last year tried to implement training for its AI models on European user data. The logic? If the AI assistants and features are going to be relevant to EU users, then they need to have access to local data to train them.
So it began notifying users that it would use publicly available data – posts, comments and so on – to train its AI. The backlash was swift, and after some scrutiny from data protection regulators and a few complaints from privacy campaigners, Meta put the plans on hold while it thrashed out the details with the powers that be.
That has all been finished off. A few weeks ago, Meta AI began rolling out to Facebook, Instagram, WhatsApp and Messenger for EU users – whether they wanted it or not. For now, it is limited to an AI chatbot, but there are plans to introduce more features in the future, similar to those available in the US.
The catch, of course, is that Meta again wants to use your publicly available data to train its AI. You may have seen the notifications pop up on your Meta apps about the plan, and the good news is that you can object. It is best to do it by May 26th to make your wishes clear, but you can still object at any point after that – you just run the risk that some of your data will have been used already.
So how do you object? The good news is, if you objected last time around, Meta will honour that request. If you haven’t, there are a couple of ways to do it.
On Facebook, open the app. Tap your profile picture at the bottom right corner. Scroll down to settings & privacy, tap settings and scroll down to find privacy policy. When you open that page, it should have a paragraph stating Meta is updating its privacy policy, and a link to object to the plan. Tap the link, fill out the form and add any additional details you feel are relevant. You should get a confirmation email almost straight away.
That request will also be honoured for any other accounts that you have in the same Meta accounts centre such as Instagram.
If you want to object specifically to Instagram data being used, you can do so through the app by following the notification, or by logging into your Instagram account from a browser and navigating to this page: https://help.instagram.com/contact/767264225370182.
Google
Google has gone all-in on AI. From updated chatbots (Google Gemini) to AI-powered overviews in search, the company is integrating it into many of its products, including Android.
We have covered Google’s apps activity before, and how to limit the data the company stores on you. But if you want to limit the data shared with Google to train its AI, you can also turn off Gemini apps activity, which will stop Google saving your conversation history with the chatbot.
In the Gemini app, tap on your account in the top corner, select Gemini apps activity and tap “turn off” from the drop-down box. You can also use this menu to delete previous activity.
ChatGPT
OpenAI has found itself on the wrong side of a few lawsuits over its use of content to train the models for ChatGPT. In the US, the New York Times and several other news publishers are suing the company, alleging it has scraped their articles without permission.
So where does that leave our personal data? The main way ChatGPT can use your data for training is if you interact with it. However, there is a way to opt out. While logged on, go to settings, tap data controls and set improve model for everyone to off.
Anthropic
Anthropic’s privacy policy is clear: its AI Claude will only use your conversations to train its models if you have explicitly given permission by joining a training programme or if you have reported the material to the company. The exception is material that is flagged for violating Anthropic’s usage policies, which may be used to train systems to help detect harmful activity.
Tumblr
Does anyone still use Tumblr? The platform has had something of a revival in recent months, with Gen Z driving the growth. But it could also be providing data for AI models. Last year, it emerged that owner Automattic was doing a deal with AI companies that would allow them to use public data from Tumblr to train AI models.
As with other platforms, the deal does not include any private messages or content from private or password-protected blogs. It also excludes data from deleted blogs or ones classed as explicit.
The good news is that even publicly accessible blogs can opt out. On the app, go to your account settings and select the blog you want to exclude from sharing with licensed partners of Tumblr. Tap on the gear icon, and select visibility. Make sure prevent third-party sharing is turned on. On the web, you will find the same options under blog settings.
WordPress
Automattic also owns blogging tool WordPress, and the same licensing deal covers public content there. To opt out, go to settings, select general and go to the privacy section. Tick the box next to prevent third-party data sharing.
X
X, the platform formerly known as Twitter, has its own AI called Grok. When the company first unveiled the feature, it was pitched as an AI with a “rebellious” streak.
But is it training off our data? Last year, Elon Musk’s platform fell foul of the Data Protection Commission (DPC) here for opting users into data sharing for training its AI without informing them first. The DPC started a court action, and X agreed to suspend the processing of the personal data of EU and EEA users contained in public posts processed between May 7th and August 1st, 2024, for the purpose of training its AI.
In September, the DPC ended the court proceedings after X agreed to permanently limit its use of EU personal data, but last month the regulator said it had opened an investigation into the matter.
Everyone else can opt out of the data sharing. Go to your profile, select settings, and go to privacy and safety. Scroll down to data sharing and personalisation, and select Grok and third-party collaborators. For users outside the EU, this shows what Grok can use as source material for its models, such as your public posts and conversation history with the chatbot. You can also delete your previous conversation history with Grok if you have interacted with the chatbot.
Apple
Last year, Apple announced it would build generative AI into the operating systems for its iPhones, iPads and Macs. Apple Intelligence is pitched as a smarter assistant at your fingertips, one that can dig around your phone, find the information you know is there but just can’t locate, and serve it up to you in an easily accessible format.
But does Apple use your private data to train Apple Intelligence? The short answer is no. Apple has trained its AI models on high-quality data from the public web, from which publishers could opt out. Among the information it has licensed to train the models are news, archives, textbooks, stock photography and in-house work.
Above all, data you store on its devices stays in your control. A lot of the simpler requests are processed on your device. If it needs extra power, Apple uses something called Private Cloud Compute, an extension of your iPhone that is not accessible by Apple or any third party. The service only has access to the information that is pertinent to your request. And to make sure that everything is above board, Apple has an independent observer to carry out an audit.
So there is no general cloud processing of your data and no third-party access to it. The one exception to this is where you use ChatGPT for queries that Apple Intelligence cannot answer from your own personal data.
Apple’s deal with OpenAI is opt-in only and you are asked to agree every time a query is identified as something that ChatGPT could handle.
You don’t need a ChatGPT account to access it, user IP addresses are obscured, and OpenAI is not allowed to store any requests that come through Apple’s system.
However, if you already have a ChatGPT account and want to access your premium features, you can connect it to your phone; OpenAI’s policies on using data would then apply.