The End of Data Monopolies: Europe’s independent raw data layer and the rise of Data Unions

By Shiv Malik


Here’s an epic quandary. We live in a world supposedly awash with quintillions of bytes of data and yet, unless you’re one of a handful of Silicon Valley companies, you just can’t get the data you want. Speak to any data scientist, marketeer or app builder out there and they’ll likely tell you that no, they can’t get at the data they really need to be able to analyse, make decisions or build. Why? 

The short answer is that any company lucky enough to control a major, widely applicable source of raw data (most especially personal data) will seek to lock it down immediately and reap the proceeds by monetizing it through advertising, analytics, or both. That’s the standard Web2 playbook.

Whether that’s data about our web behaviours, financial data, or geolocation information, companies that have access to this information will almost always sell either the eyeballs of the “right” consumers or answers to questions (e.g. how many people pass by this location on a given day), rather than allow access to the raw data sets themselves. This is despite the fact that these companies might well make even more revenue by selling that raw data to a much wider market.

So why silo when there’s so much latent demand for the raw stuff? One could get wrapped up in discussions about whole product solutions and providing ultimate value, but the truth is that Silicon Valley companies don’t want competition. Take the geolocation data Apple collects on us each and every day. Or Amazon’s marketplace data, which reveals the purchasing behaviours of tens of millions of people. Or what about Google’s web tracking information on the scale of billions? If, say, these companies were forced to sell the raw data they collect, innovation would increase massively.

You could imagine new search engines, marketplaces and map apps popping up at a rate of knots. You could also imagine thousands of new companies taking that live raw data and developing it into new Layer 2 services. (An Amazon shopping assistant, or a web browser that filters out all the junk from Google, anyone?) Or completely separate applications which we’ve yet to even imagine.

And that’s the problem. Google, Apple, Amazon et al. don’t want anyone else creating better versions of their existing products. Or worse, innovating in fields which they see as theirs to own. They want to control the entire pipeline of data from ‘creation to query’. This also explains why, when they see a data threat (Waze, Fitbit, Instagram), they buy it and silo even more data. When your competitive edge is your monopoly, why give it up? This is when regulators in a capitalist system are meant to step in, because free markets and monopolies aren’t meant to co-exist. And it looks like the EU is doing just that.

Digital Policy 2.0

Even the least keen observer of digital policy will have noticed that Europe’s political bodies have been creating legislative acts at an astonishing rate. There are currently three game-changing bills making their way through the final stages towards full adoption in March: the Digital Services Act (DSA), the Digital Markets Act (DMA) and the Data Governance Act (DGA). And if all that wasn’t enough, there is a fourth act being considered, the as yet unpublished Data Act.

Whilst the DMA and DSA have grabbed most of the headlines, you need to study all three acts to really connect the dots on how this legislation could spell the end for data monopolies and the start of a new innovative future.  

Before stepping into the substance, let’s first ask this: what would be the key ingredients needed to create an independent raw data layer from all the existing personal data we all generate each day? 

Firstly, you’d need regulations so that users could take their data out of GAFAM (Google, Apple, Facebook, Amazon, Microsoft) platforms. Next, you’d want to have trusted organisations that GAFAM users could then move that data to. Those organisations might well monetize and govern that data on a user’s behalf. And finally, you’d need rules to ensure those new data governance organisations didn’t simply rinse and repeat the siloing game.

That all sounds like a tall order. But if you piece together articles in the DMA and the DGA and then start to read the runes when it comes to the Data Act, you realise all the parts are right there, hidden in plain sight.

Real-time Data Portability

Believe it or not, Europeans already have the right to port their data out of any company they choose. Article 20 of GDPR gives everyone a right to data portability. The problem is that this right was drafted for the postal age. To actually port their data, people must write to the company in question, wait 30 days to receive their data, and then figure out what to do with it when it arrives in some obscure file format. (Which regular Joe knows how to utilise a JSON file?) It’s possibly the most inconvenient, cumbersome digital right one could imagine. No longer. 

Subsections (h) and (i) of Article 6.1 of the Digital Markets Act contain an updated version of GDPR’s Article 20. They let businesses and end users port their data from digital giants (or ‘gatekeepers’, as they are termed in the legislation) to a third party, in real time, via an API. In other words, with a few clicks you’ll be able to send your data in a continuous stream to any other company you choose.
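To make the contrast with the 30-day postal-style request concrete, here is a minimal sketch of what continuous porting could look like from the recipient's side. Everything in it is an assumption for illustration: the DMA does not specify these names, and `PortabilityGrant`, `Recipient` and `stream_to_recipient` are hypothetical. A plain Python list stands in for the gatekeeper's live API feed.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class PortabilityGrant:
    """Hypothetical record of a user's instruction to port their data."""
    user_id: str
    recipient_name: str
    revoked: bool = False


@dataclass
class Recipient:
    """Stand-in for the third party (for example a data union) receiving data."""
    name: str
    received: list = field(default_factory=list)

    def ingest(self, record: dict) -> None:
        self.received.append(record)


def stream_to_recipient(events: Iterable[dict],
                        grant: PortabilityGrant,
                        recipient: Recipient) -> int:
    """Forward the user's events one by one as they arrive.

    Stops as soon as the grant is revoked and skips other users' events.
    Returns the number of records forwarded.
    """
    forwarded = 0
    for event in events:
        if grant.revoked:
            break
        if event["user_id"] != grant.user_id:
            continue
        recipient.ingest(event)
        forwarded += 1
    return forwarded


# Illustrative feed: in practice this would be a live stream, not a list.
feed = [
    {"user_id": "u1", "type": "search", "value": "running shoes"},
    {"user_id": "u2", "type": "search", "value": "garden hose"},
    {"user_id": "u1", "type": "purchase", "value": "trainers"},
]
union = Recipient(name="example-data-union")
grant = PortabilityGrant(user_id="u1", recipient_name=union.name)
count = stream_to_recipient(feed, grant, union)
```

The point of the sketch is the shape of the right: the user grants access once, data then flows record by record, and revoking the grant stops the flow.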

These provisions in the DMA might only be the warm-up act for the Data Act, which should be published in February. Reading the Data Act consultation, it’s likely that a more generalised right to port data in real time might be imposed on all gadget makers, including of course car manufacturers, not just GAFAM ‘gatekeepers’.

A place to send your data to 

So now that you can move your Amazon, Facebook and Google data, where do you send it, and why would you bother? That’s where you need to turn to the DGA. Although a lot of the act deals with how public bodies in EU member states must unlock and share data even more widely, it’s the passages which outline the creation of Data Intermediaries (DI), also known as data cooperatives or data unions, which are the most relevant.

What’s a Data Intermediary? The legislation gives a good example: “Data intermediation services would include… data pools established jointly by several legal or natural persons with the intention to license the use of such a pool to all interested parties in a manner that all participants contributing to the pool would receive a reward for their contribution to the pool.” 

Basically these data intermediaries would act as cooperative or credit union-like structures that enable people to club their data together and sell it on their behalf. The union operators could take a cut of the sales but under the regulation they would have a fiduciary duty to their members to return value to them and act in their best interests. 
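The pooling arrangement described above can be sketched as simple arithmetic. This is only an illustration under stated assumptions: the DGA does not prescribe a payout formula, so the 10% operator cut, the pro-rata split by number of records contributed, and the function name `split_pool_revenue` are all hypothetical.

```python
from decimal import Decimal, ROUND_DOWN

CENT = Decimal("0.01")


def split_pool_revenue(sale_price: Decimal,
                       contributions: dict,
                       operator_cut: Decimal):
    """Split one sale of pooled data between the operator and the members.

    contributions maps member name -> number of records contributed.
    Members are paid pro rata, rounded down to the cent. Any leftover
    cents are returned separately rather than silently kept.
    """
    total_records = sum(contributions.values())
    if total_records == 0:
        raise ValueError("pool has no contributions")

    operator_fee = (sale_price * operator_cut).quantize(CENT, rounding=ROUND_DOWN)
    distributable = sale_price - operator_fee

    payouts = {
        member: (distributable * records / total_records).quantize(
            CENT, rounding=ROUND_DOWN)
        for member, records in contributions.items()
    }
    undistributed = distributable - sum(payouts.values())
    return operator_fee, payouts, undistributed


# Assumed figures, purely for illustration.
fee, payouts, leftover = split_pool_revenue(
    sale_price=Decimal("1000.00"),
    contributions={"alice": 300, "bob": 100, "carol": 100},
    operator_cut=Decimal("0.10"),
)
```

With these assumed figures the operator keeps 100.00 and the remaining 900.00 is shared 540.00, 180.00 and 180.00. A real union could weight contributions by data quality or recency instead of by raw record count.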

Did Europe really need regulation to get these going? These organisations already exist. Cake in Belgium, which utilises open banking regulations, is a great example. But this new ‘official’ category of data intermediary comes with some useful perks. Not only will these data unions or cooperatives be backed with €2bn in funding to get set up, they will also be given an EU-wide trust mark, an incredibly valuable asset when the data economy is such a murky world and the biggest enterprise players are desperate to purchase data that won’t embroil them in scandal.

Restrictions on Data Unions 

So what’s to stop a data intermediary/union from siloing the raw data once they’ve persuaded you to port it to them from Google and Amazon? The legislation (see specifically Article 11.1) goes to some lengths to ban data intermediaries from selling analytics from the same company that sells the raw data.

While data unions can “make adaptations to the data exchanged, in order to improve the usability of the data”, they cannot provide “any other services” without setting up a new company. When combined with fiduciary obligations, it makes for an interesting double bind.

Game over for Silicon Valley's Silos

When the DGA and DMA come into force, expected sometime in late 2023, Europe’s data economy will blossom. It will go from being served by a tiny number of overly powerful, mainly US, advertising and analytics companies to one in which scores of data intermediaries instead wholesale raw data to a vast range of businesses across a hugely extended ecosystem.

Analytics and advertising companies will have to compete not on the strength of their monopoly over raw data, but on the competitive capabilities of their products. Data buyers like app builders, hedge funds, researchers, AI/ML companies and public bodies will be able to purchase the best raw data out there, servicing their needs from a market which simply didn’t exist beforehand. At least that is the hope of policy makers.  

But while legislators think they’ve provided the ingredients to make this work, the practical application of all of this is an entirely different question. Plenty of EU initiatives, however comprehensive they appear to be, fail miserably. (GDPR cookie banners, anyone?) There are many unknowns as to whether Data Unions can actually scale, or provide enough incentive to the public to ‘get porting’. But if this vision is realised, Europe’s (and likely the world’s) digital future will be changed forever and the monopolies that have dominated our era will be finished.

Shiv Malik is the CEO of Pool, an infrastructure provider for Data Unions
