Pictures, text prompts, documents and health metrics are just a few examples of data we’re giving away to different AI applications and thus, to different companies/organizations. While it’s always good to know what happens to your data, it is especially relevant in healthcare settings and regarding health data.
The rise of social media platforms marked the beginning of a new era, one where individuals became valuable resources, providing tons of personal data that can be utilised for commercial purposes. But as the plethora of AI applications begins to enter our lives, we have to up our game and become much more conscious about what we are willing to give away.
So we decided to take a look at what can be learned about what happens to data uploaded to and generated by the most popular AI platforms.
There are more things in heaven and Earth, Horatio
This whole issue is far more complicated than it seems at first glance, as there are multiple layers of privacy/legal issues. In general, these include:
- Personally identifiable information (including sensitive data like social security numbers, financial identifiers, health data, etc.) collected by companies/organizations
- Information we provide to these algorithms including personal (user) info, financial (billing) info and our prompts (sample pictures, business or personal questions, health symptoms, etc.)
- Copyright issues regarding any material or content we create with these services and the legal implications of us using these for personal or business purposes
- And copyright issues of our publicly available content that might have been used to train an algorithm without our knowledge
Let’s start with the obvious: right now we are very far from seeing clearly in these matters, and there’s a lot going on with legal experts trying to weigh in and provide guidance. The situation is especially confusing in the latter two facets.
Midjourney: a scary list of collected personal info
Text-to-image generator Midjourney provides you with the following info:
At this moment I can’t tell whether the extensive list of info they declare to collect AND disclose in California, including
- social security, driving license and passport numbers,
- postal addresses,
- insurance policy number,
- education, employment, employment history,
- bank account number, credit card number, debit card number, or any other financial information,
- medical information, or health insurance information
is also collected and disclosed in Europe, or whether GDPR protects such personal data of EU users.
Also quite inscrutable to me:
- How they collect such personal information
- For what purposes they plan to collect and disclose all the deeply personal and sensitive info listed above
- What happens to which kinds of data for users not residing in the EU or in California
Applications from OpenAI (DALL-E, ChatGPT): let’s not talk about it
“As between the parties and to the extent permitted by applicable law, you own all Input, and subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to Output.”
Ada Health: confusing in the best European fashion
Germany-based Ada offers AI-backed mobile health app services for users. As with any decent European enterprise, you will find a book’s worth of privacy information under the relevant section of the website, starting with GDPR (in bold), and an assurance that it is vitally important for them that customers should feel secure when using their services.
Trying to make sense of it, this is what I found:
- The interesting part comes under section 6, when we get to “disclosure of personal data” to third parties. Based on my understanding, they work with a number of US service providers, whom they asked nicely to behave in a proper European way, please.
- Apart from that, they assure users they will not transfer personal data to third parties unless the case falls under the listed exceptions, which include basically anything: they buy or sell assets, their company gets acquired, and so on. Users’ rights to opt out are not listed here.
- Another interesting snippet is the list of their “third-party processors to provide infrastructure services” which is a long list that includes Amazon, Google and Facebook.
- The whole document is extremely confusing: statements like “We will never share your personal health information with advertisers or third parties” are followed by “A full list of our third-party processors processing your personal data on our behalf and strictly according to section 3 above can be found here.” and “we do not transfer your personal data to third parties – with the exception, when applicable, of the purposes listed below”.
All in all, this is a prime example of how you basically have no idea about what happens to your personal information, even after reading the relevant sections of the document multiple times.
Open questions: well, GDPR is supposed to ensure my privacy rights, but despite that, I still don’t understand what happens to my data and what control I have over it.
AI-based voice-over Revoicer: we have your data, thanks for all the fish
The Revoicer homepage also doesn’t say a word about the usage rights of generated content. This can be problematic in certain cases. Theoretically, I could use the platform to create the audio parts of my next mega-hit YouTube video generating 7 billion views. I have no idea what would happen if the company requested compensation for the assumed financial results of this.
Open questions: Re personal data: nothing. They collect your data and use it in every possible way they can. Re content usage rights: everything.
AI-based video generator Synthesia: you have the content, we have your data
Regarding processing personal information, they seem to mix the Ada and the Revoicer approach. There is a looong text frequently mentioning GDPR and the information that they collect data, track users and allow Google, Facebook, Hubspot and Stripe to collect analytics data about users.
Meeting transcription AI Fireflies has rights to all of your content
Another extremely confusing case is Fireflies.ai, a tool that is “used across 62,000 organizations”. The software is supposed to provide you assistance in taking automated notes and summaries of meetings.
Their “terms of service” is an exceptionally miserable read. In a long and highly user-unfriendly PDF, they list all your responsibilities and disclaim any of their own.
For my layman’s brain, this means that whatever I use the service for can be published – which makes me seriously wonder what those 62,000 organizations were thinking when opting in.
Open questions: why would anyone use such a service???
Give me the takeaways, man!
If you got this far, I’m sure you have already surmised a key takeaway: don’t take anything for granted. These were just a handful of random examples from a wide range of AI applications.
We covered some that are currently used mostly by the masses for fun (Midjourney, DALL-E, ChatGPT), some that are used by individuals for personal purposes (Ada Health) and some that are most likely utilized in the corporate world (Revoicer, Synthesia, Fireflies). As you can see, the privacy and potential legal issues are far from settled and can differ hugely between apps.
Whenever you give away personal or corporate data and/or create personal/corporate content, you need to take the time (and misery) to dive into these typically verbose documents and learn for yourself what kind of trade you are expected to make.
The post Here Is What Even Healthcare AI Companies Do With Your Data appeared first on The Medical Futurist.