Privacy not a blocker for ‘meaningful’ research access to platform data, says report

Image Credits: Josh Hallett / Flickr under a CC BY 2.0 license.

European lawmakers are eyeing binding transparency requirements for Internet platforms in a Digital Services Act (DSA) due to be drafted by the end of the year. But the question of how to create governance structures that provide regulators and researchers with meaningful access to data so platforms can be held accountable for the content they’re amplifying is a complex one.

Platforms’ own efforts to open up their data troves to outside eyes have been chequered to say the least. Back in 2018, Facebook announced the Social Science One initiative, saying it would provide a select group of academics with access to about a petabyte’s worth of sharing data and metadata. But it took almost two years before researchers got access to any data.

“This was the most frustrating thing I’ve been involved in, in my life,” one of the involved researchers told Protocol earlier this year, after spending some 20 months negotiating with Facebook over exactly what it would release.

Facebook’s political Ad Archive API has similarly frustrated researchers. “Facebook makes it impossible to get a complete picture of all of the ads running on their platform (which is exactly the opposite of what they claim to be doing),” said Mozilla last year, accusing the tech giant of transparency-washing.

Facebook, meanwhile, points to European data protection regulations and privacy requirements attached to its business following interventions by the US’ FTC to justify painstaking progress around data access. But critics argue this is just a cynical shield against transparency and accountability. Plus of course none of these regulations stopped Facebook grabbing people’s data in the first place.

In January, Europe’s lead data protection regulator penned a preliminary opinion on data protection and research which warned against such shielding.

“Data protection obligations should not be misappropriated as a means for powerful players to escape transparency and accountability,” wrote EDPS Wojciech Wiewiórowski. “Researchers operating within ethical governance frameworks should therefore be able to access necessary API and other data, with a valid legal basis and subject to the principle of proportionality and appropriate safeguards.”

Nor is Facebook the sole offender here, of course. Google brands itself a ‘privacy champion’ on account of how tight a grip it keeps on access to user data, heavily mediating data it releases in areas where it claims ‘transparency’. Twitter, meanwhile, for years routinely disparaged third-party studies which sought to understand how content flows across its platform, saying its API didn’t provide full access to all platform data and metadata so the research couldn’t show the full picture. That is another convenient shield for evading accountability.

More recently the company has made some encouraging noises to researchers, updating its dev policy to clarify rules, and offering up a COVID-related dataset, though the included tweets remain self-selected. So Twitter’s mediating hand remains on the research tiller.

A new report by AlgorithmWatch seeks to grapple with the knotty problem of platforms evading accountability by mediating data access — suggesting some concrete steps to deliver transparency and bolster research, including by taking inspiration from how access to medical data is mediated, among other discussed governance structures.

The goal: “Meaningful” research access to platform data. (Or as the report title puts it: Operationalizing Research Access in Platform Governance: What to Learn from Other Industries?)

“We have strict transparency rules to enable accountability and the public good in so many other sectors (food, transportation, consumer goods, finance, etc). We definitely need it for online platforms — especially in COVID-19 times, where we’re even more dependent on them for work, education, social interaction, news and media consumption,” co-author Jef Ausloos tells TechCrunch.

The report, which the authors are aiming at European Commission lawmakers as they ponder how to shape an effective platform governance framework, proposes mandatory data sharing frameworks with an independent EU-institution acting as an intermediary between disclosing corporations and data recipients.

It’s not the first time an online regulator has been mooted, of course — but the entity being suggested here is more tightly configured in terms of purpose than some of the other Internet overseers being proposed in Europe.

“Such an institution would maintain relevant access infrastructures including virtual secure operating environments, public databases, websites and forums. It would also play an important role in verifying and pre-processing corporate data in order to ensure it is suitable for disclosure,” they write in a report summary.

Discussing the approach further, Ausloos argues it’s important to move away from “binary thinking” to break the current ‘data access’ trust deadlock. “Rather than this binary thinking of disclosure vs opaqueness/obfuscation, we need a more nuanced and layered approach with varying degrees of data access/transparency,” he says. “Such a layered approach can hinge on types of actors requesting data, and their purposes.”

A market research purpose might only get access to very high level data, he suggests. Whereas medical research by academic institutions could be given more granular access — subject, of course, to strict requirements (such as a research plan, ethical board review approval and so on).

“An independent institution intermediating might be vital in order to facilitate this and generate the necessary trust. We think it is vital that that regulator’s mandate is detached from specific policy agendas,” says Ausloos. “It should be focused on being a transparency/disclosure facilitator — creating the necessary technical and legal environment for data exchange. This can then be used by media/competition/data protection/etc authorities for their potential enforcement actions.”

Ausloos says many discussions on setting up an independent regulator for online platforms have proposed too many mandates or competencies, making it impossible to achieve political consensus. The theory is that a leaner entity with a narrow transparency/disclosure remit should be able to cut through noisy objections.

The infamous example of Cambridge Analytica, the disgraced data company which paid a Cambridge University academic to use an app to harvest and process Facebook user data for political ad targeting, does certainly loom large over the ‘data for research’ space. And Facebook has thought nothing of turning this massive platform data misuse scandal into a stick to beat back regulatory proposals aiming to crack open its data troves.

But Cambridge Analytica was a direct consequence of a lack of transparency, accountability and platform oversight. It was also, of course, a massive ethical failure — given that consent for political targeting was not sought from people whose data was acquired. So it doesn’t seem a good argument against regulating access to platform data. On the contrary.

With such ‘blunt instrument’ tech talking points being lobbied into the governance debate by self-interested platform giants, the AlgorithmWatch report brings both welcome nuance and solid suggestions on how to create effective governance structures for modern data giants.

On the layered access point, the report suggests the most granular access to platform data would be the most highly controlled, along the lines of a medical data model. “Granular access can also only be enabled within a closed virtual environment, controlled by an independent body — as is currently done by Findata [Finland’s medical data institution],” notes Ausloos.

Another governance structure discussed in the report — as a case study from which to draw learnings on how to incentivize transparency and thereby enable accountability — is the European Pollutant Release and Transfer Register (E-PRTR). This regulates pollutant emissions reporting across the EU, and results in emissions data being freely available to the public via a dedicated web-platform and as a standalone dataset.

“Credibility is achieved by assuring that the reported data is authentic, transparent and reliable and comparable, because of consistent reporting. Operators are advised to use the best available reporting techniques to achieve these standards of completeness, consistency and credibility,” the report says on the E-PRTR.

“Through this form of transparency, the E-PRTR aims to impose accountability on operators of industrial facilities in Europe towards to the public, NGOs, scientists, politicians, governments and supervisory authorities.”

While EU lawmakers have signalled an intent to place legally binding transparency requirements on platforms — at least in some less contentious areas, such as illegal hate speech, as a means of obtaining accountability on some specific content problems — they have simultaneously set out a sweeping plan to fire up Europe’s digital economy by boosting the reuse of (non-personal) data.

Leveraging industrial data to support R&D and innovation is a key plank of the Commission’s tech-fuelled policy priorities for the next five+ years, as part of an ambitious digital transformation agenda.

This suggests that any regional move to open up platform data is likely to go beyond accountability — given EU lawmakers are pushing for the broader goal of creating a foundational digital support structure to enable research through data reuse. So if privacy-respecting data sharing frameworks can be baked in, a platform governance structure that’s designed to enable regulated data exchange almost by default starts to look very possible within the European context.

“Enabling accountability is important, which we tackle in the pollution case study; but enabling research is at least as important,” argues Ausloos, who does postdoc research at the University of Amsterdam’s Institute for Information Law. “Especially considering these platforms constitute the infrastructure of modern society, we need data disclosure to understand society.”

“When we think about what transparency measures should look like for the DSA we don’t need to reinvent the wheel,” adds Mackenzie Nelson, project lead for AlgorithmWatch’s Governing Platforms Project, in a statement. “The report provides concrete recommendations for how the Commission can design frameworks that safeguard user privacy while still enabling critical research access to dominant platforms’ data.”

You can read the full report here.
