By Shiv Malik


How data unions will reshape the way we control our information

This article was previously published at the end of April by Nasdaq and has been republished with their permission.

Talk to most anyone working in analytics, adtech, or consumer-facing data science and you’ll soon realize the data economy is facing an existential crisis.

The privacy movement’s fightback against big tech’s hoovering up of people’s data is starting to cause significant pain. Multimillion-dollar companies are swiftly shuttered by a single article in the tech press when their data collection practices are exposed. GDPR-style legislation continues its global spread to new jurisdictions, making the collection of personal information a serious headache for the majority of companies. Lawsuits leveraging those legal protections abound. And, taking on the mantle of privacy champions, Apple and Google are now closing down major avenues for mobile tracking and third-party cookies in browsers.

As if privacy weren’t having enough of an effect on how data is collected and who can profit from it, policy makers have recently turned to competition legislation as a new tool to force more change. The EU’s new Digital Markets Act contains powerful but largely unnoticed provisions on how the biggest tech companies, or “gatekeepers”, must now handle consumer data. Come early 2024, business and ordinary users of Google, Meta, Apple, Amazon and other platforms in the EU will be able to port their data in real time from those platforms, just as people can currently port data from their banks: at the touch of a few buttons. Having invested so much in creating data silos to give themselves era-defining dominance, the tech giants might see those silos brought down with a few swift clicks.

And yet demand for personal data and the need for far better ad tech solutions continue to grow. The rise of AI is heating this demand up further. As it does, it’s clear that the status quo, what Shoshana Zuboff neatly termed “surveillance capitalism”, has reached its zenith. The question on everyone’s lips is what comes next: what replaces the system that has dominated the data economy, and therefore the digital economy, for well over a decade?

An existential crisis often provokes a reversion to fundamentals. Old ideas, lost to the cementing of a consensus, are rediscovered and looked at anew. The basic problem is still as old as that infamous 1993 New Yorker cartoon: on the internet, nobody knows you’re a dog. When a user turns up at your website, whether from a laptop or mobile, and you want to serve them an advert or service, how do you know how old they are, which country they’re from, or how affluent they might be, in order to best tailor that service? How, in fact, do you know they’re even human?

Cookies and mobile tracking have been an inefficient and highly unethical way of getting around that problem: inject tracking bugs into people’s computers, or simply stream live geolocation information from ‘free’ apps such as QR code readers or call-to-prayer alert apps. Users say yes because they don’t know any better; the fine-print permissions are buried in dozens of pages of legalese. When they find out, they’re horrified.

Early internet pioneers instead dreamed that people would keep data about themselves - addresses, age, income - in their own software on their own devices. They would seal it within a personal data vault (PDV) and share it only with those they trusted and where there was a clear benefit to them.

It’s a simple, ethical approach, and one that has certainly attracted the inventor of the web, Tim Berners-Lee. The problem with this solution has always been this: how do you get people to fill up their data vaults with information that can be read by others at scale? If Alice and Bob decide to fill their vaults in different ways using different data schemas (assuming, of course, they even care to do so), then how is any third party going to read those vaults in a way that scales? Global identity protocols like passports are hard enough to negotiate when customs and languages vary massively. But if you want more than a name, date of birth and a photo to add to your digital story, then this sort of coordination is not going to come from individuals acting spontaneously under their own steam, even if you hand out vaults for free.

Enter data unions. Again the idea is simple, and it has an equally long technological lineage. People download an app, or click on a few buttons allowing them to share their data with an organization that offers to represent them and their fellow members, rather like a cooperative. The more people join, the more powerful the data set becomes. When it sells, the proceeds are redistributed amongst the members. And the organizers get their cut too. Everyone wins.
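For readers who think in code, the redistribution step can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `split_proceeds`, the operator's percentage cut and the idea of weighting payouts by each member's contribution are all hypothetical, and real unions may well share proceeds differently.

```python
from decimal import Decimal, ROUND_DOWN

CENT = Decimal("0.01")


def split_proceeds(sale_amount, operator_cut, contributions):
    """Split one data sale between the operator and the members.

    sale_amount   -- total price paid by the data buyer
    operator_cut  -- operator's fraction of the sale, e.g. 0.10 (assumed model)
    contributions -- {member: units of data contributed} (assumed weighting)
    """
    total_units = sum(contributions.values())
    if total_units <= 0:
        raise ValueError("at least one member must have contributed data")

    sale = Decimal(str(sale_amount))
    operator_share = (sale * Decimal(str(operator_cut))).quantize(
        CENT, rounding=ROUND_DOWN
    )
    member_pool = sale - operator_share

    # Each member is paid pro rata, rounded down to the nearest cent.
    payouts = {
        member: (member_pool * units / total_units).quantize(
            CENT, rounding=ROUND_DOWN
        )
        for member, units in contributions.items()
    }

    # Any rounding remainder goes to the operator so every cent is accounted for.
    operator_share += member_pool - sum(payouts.values())
    return operator_share, payouts


if __name__ == "__main__":
    operator, members = split_proceeds(1000, 0.10, {"alice": 3, "bob": 1})
    print(operator, members)
```

Using `Decimal` rather than floats keeps the payouts exact to the cent, which matters once the same calculation is repeated across hundreds of thousands of members.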

You even get the benefit of scalable personal data vaults. People simply have to copy what they are already sharing with their data union back to themselves and store it. And because the data has been pre-structured by the union operators, it’s easily readable at scale by third parties like an online insurer, credit-checking company or ad server. The more data unions a person joins, the richer their own digital history becomes, the more money they make and the better served they will be. This is the technological and product leap the data economy needs to escape its existential crisis.
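To make the point about pre-structured data concrete, here is a minimal sketch of what a union-defined schema might look like. Everything in it is an assumption made for illustration: the `TransactionRecord` fields, the JSON format and the function names are hypothetical and do not describe any particular union's format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TransactionRecord:
    """Hypothetical schema a union might fix once for all of its members."""
    member_id: str
    date: str               # ISO 8601 date, e.g. "2023-04-30"
    merchant_category: str
    amount_cents: int
    currency: str


def export_to_vault(records):
    """Serialize a member's records so they can keep their own copy."""
    return json.dumps([asdict(record) for record in records], sort_keys=True)


def read_vault(payload):
    """Parse a vault; the same code works for every member of the union."""
    return [TransactionRecord(**row) for row in json.loads(payload)]


if __name__ == "__main__":
    alice = [TransactionRecord("alice", "2023-04-30", "groceries", 1250, "EUR")]
    vault_copy = export_to_vault(alice)
    print(read_vault(vault_copy))
```

Because Alice and Bob both store `TransactionRecord` rows, a third party needs only one `read_vault` routine rather than a bespoke parser for each person's vault.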

Until now, it’s been difficult to get data unions off the ground. The missing pieces have been scalable payment rails - blockchains and data union infrastructure organizations now cater for these - and demand from data buyers for a better solution. Trust is also an issue, but the EU is now supporting the development of data unions (or data intermediaries, as they term them) with funding and tailored legislation that will give unions regulated trust status in the data economy and official obligations towards their members. And where the EU goes in tech legislation, others soon follow.

It’s no wonder we’re already starting to see data unions raise millions in funding and grow their memberships to the hundreds of thousands. Ozone, a browser plugin that helps users monetize clickstream data, has over 100k members. Cake helps 120k members in Belgium anonymize and then monetize their financial transaction data from their bank.

The advent of legally mandated real-time data portability will mean people can send their data to a union with a few clicks of a button. Unions will be unbundling Google, Amazon, Apple and Meta’s data silos for years to come. More legislation from the EU will also help data unions to flourish in the IoT sector, creating new unions out of data from your car, washing machine, smart speaker and Fitbit. By the end of the decade, signing up to a data union will become as normal as having any other specialist or consultant who works for your interests, like a lawyer, accountant, investment broker or labour representative. The overall result: a more efficient, fairer, economically more powerful and technologically more advanced data economy.

Shiv Malik is the CEO of Pool, an infrastructure provider for data unions.

