Facebook, data sharing, and broken promises

Note: This was originally published as the daily newsletter for the Columbia Journalism Review, where I am the chief digital writer.

Meta, the parent company of Facebook, said on Monday that it plans to share more data about political ad targeting on its platform with social scientists and other researchers, as part of what the company calls its Open Research and Transparency project. According to CNN, Meta will provide “detailed targeting information for social issue, electoral or political ads” on the platform to “vetted academic researchers.” Jeff King, Meta’s vice president of business integrity, said in a statement that the information could include the different categories of interest that were used to target users, such as environmentalism or travel. Starting in July, the New York Times reported, the company’s publicly available advertising library will include a summary of this targeting information, including a user’s location. King said that by sharing the data, Meta hoped “to help people better understand the practices used to reach potential voters on our technologies.”

Monday’s announcement, including King’s reassurance, gave the impression that Meta wants to be as transparent as possible about its ad targeting and other data-related practices. Researchers who have dealt with the platform in the past tell a different story, however, including Nathaniel Persily, a law professor at Stanford who co-founded and co-chaired Social Science One, a highly touted data-sharing partnership with Facebook that Persily said he resigned from in frustration. Persily and others say they have spent years trying to get Meta to provide even the smallest amount of useful information for research purposes, but even when the company does so, the data is either incomplete—Meta admitted last year that it supplied researchers with faulty data, omitting about 40 percent of its user base—or the restrictions placed on how it can be used are too onerous. In either case, researchers say the resulting research is almost useless.

In some cases, Meta has shut down potentially promising research because the process didn’t comply with its rules. Last August, the company blocked an NYU research effort called the Ad Observatory, part of the Cybersecurity for Democracy Project, because it said the group was using a browser extension to “scrape” information from Facebook without the consent of users. The company not only blocked the research group from getting any data, but also shut down the researchers’ personal accounts. Laura Edelson, a post-doctoral researcher at New York University who worked on the project, and Damon McCoy, an associate professor of computer science and engineering at NYU, wrote in Scientific American that Facebook wants people to see it as transparent, “but in reality, it has set up nearly insurmountable roadblocks for researchers seeking shareable, independent sources of data.” (Edelson also talked with CJR last September about the shutdown of her research and the implications for social science.)

In a discussion on CJR’s Galley platform, and in an essay written with a colleague and published by the Brookings Institution in December, Persily described how the process of trying to extract information from Meta, even through official channels such as the Social Science One partnership, was so frustrating and time-consuming that he eventually became convinced the only way forward was to force Meta to provide data by applying legislative pressure. This led to the creation of the Platform Accountability and Transparency Act, which was put forward in December by senators Chris Coons and Amy Klobuchar, both Democrats, and Rob Portman, a Republican. Persily explained that the bill would compel Meta and other platforms to share data with researchers, would protect those researchers from liability for the use of the data, and would force companies to “make public certain non-private data related to content, advertising, and algorithms.”

Interest in research about Facebook and its impact on society accelerated with the release of the Facebook Papers in October, a trove of internal documents that Frances Haugen, a whistleblower and former Facebook product manager, leaked to the media. Some of the documents appeared to show that Facebook’s researchers knew about the platform’s ill effects, but senior management chose not to fix them. Susan Benesch, founding director of the Dangerous Speech Project, wrote in The Atlantic that one takeaway from the Facebook Papers was that the company should be compelled to release more data to external researchers. “If Facebook employees continue to be the only ones who can monitor Facebook, neither they nor anyone else can make sufficient progress in understanding how toxic content thrives on social-media platforms, how it leads to human suffering, and how best to diminish that suffering,” Benesch argued.

As Persily explained in his Brookings essay, convincing Facebook or Meta to share data with outsiders was difficult even before 2018, but it became exponentially more difficult after controversy arose over the practices of Cambridge Analytica, a consulting firm that acquired personal data on millions of Facebook users—data that originally came from a researcher who had been given access to the platform. The scandal “further chilled any platform efforts to make it easy for outside researchers to gain access to individual-level content,” Persily wrote. It also resulted in a $5 billion settlement with the Federal Trade Commission, in which Facebook promised to exercise better care over the personal information of its users. Meta used this settlement as an excuse for why it shut down Laura Edelson’s NYU research project, saying the data scraping breached its commitment to protect data. That justification failed to hold water, however, after the FTC said the settlement didn’t prevent Meta from sharing data with researchers.

Cambridge Analytica may suddenly be even more present in the minds of Meta executives than before: Karl Racine, attorney general of the District of Columbia, said Monday that he is suing Mark Zuckerberg, Meta’s CEO, for failing to prevent personal information on Facebook users from being misused by the firm. The lawsuit alleges that Zuckerberg oversaw the discussions that led to the breach, and therefore he was “personally involved in Facebook’s failure to protect the privacy and data of its users leading directly to the Cambridge Analytica incident,” Racine said in a statement. He added that the breach led to a “multi-year effort to mislead users about the extent of Facebook’s wrongful conduct,” and that the suit “sends a message that corporate leaders, including CEOs, will be held accountable for their actions.” Politico explained that the case is similar to one that Racine launched against Meta in 2018, which is still ongoing.

Here’s more on Facebook:

France and data: Arcom, France’s media regulator, launched a public consultation on Wednesday on the topic of researchers’ access to data from online platforms. “This consultation aims to collect the expectations and needs of researchers, associations, journalists, etc., in the design of their research work on online platforms,” the regulator said in a news release. Article 31 of the European Union’s new Digital Services Act—which is expected to go into effect next year—regulates access by researchers to data from large digital platforms such as Facebook and Google.

COVID black box: Recode wrote last year about how the lack of data from Facebook was making it difficult to determine the impact of COVID misinformation, which President Biden said was putting people at risk by confusing them about the efficacy of vaccines. The reality, Recode said, is that “we simply don’t know” what the real impact of that misinformation is, because the platform won’t provide enough data. “Right now, we’re guessing [on] a lot of stuff,” Katherine Ognyanova, an associate professor of communications at Rutgers University who co-leads the Covid States project, told Recode. “We can ask people questions. But Facebook truly has the data about what people have seen and how their attention is being devoted on the platform.”

Transparency: In an op-ed essay published in the New York Times, Frances Haugen, the whistleblower who last October leaked internal documents about Facebook’s weak efforts at content moderation, wrote that “Europe is making social media better without restricting free speech [and] the US should too.” The Digital Services Act, which is expected to take effect next year in the European Union, is “a landmark agreement to make social media less toxic for users,” Haugen wrote. The new standards outlined in the bill, she argued, “will for the first time pull back the curtain on the algorithms that choose what we see and when we see it in our feeds,” in part because of the “ethic of transparency” on which they are based. Facebook’s existing poorly implemented content-moderation strategies, Haugen wrote, “leave those most at risk of real world violence unprotected.”

GDPR fail: The introduction of the General Data Protection Regulation (GDPR) in Europe in 2018 was expected by some to usher in a new era of penalizing the major platforms for improper use of personal information, but that doesn’t seem to have happened, Matt Burgess writes in Wired magazine. “One thousand four hundred and fifty-nine days have passed since data rights nonprofit NOYB fired off its first complaints under Europe’s flagship data regulation, GDPR,” he writes. The complaints allege that Google, WhatsApp, Facebook, and Instagram forced people into giving up their data without obtaining proper consent, but four years later, the group is still waiting for a final decision to be made in the case, and “it’s not the only one,” says Burgess.

Other notable stories:

The Financial Times reported that Elon Musk will have to raise more cash in order to finance his proposed $44 billion acquisition of Twitter, after a $6.25 billion loan arrangement that was backed by his shares of electric carmaker Tesla expired. “After ditching the margin loan, the amount of equity that Musk must secure to complete the deal now stands at $33.5 billion,” the FT reported, based on a regulatory filing Wednesday. I wrote recently for CJR about how Musk said the Twitter deal was “on hold” until he could confirm that the company’s estimate of spam and fake accounts was correct, and some speculated he said this because he wanted out of the deal. In other Twitter news, the company agreed to pay $150 million to settle allegations that it misused private information, like phone numbers, to target advertising.

For Poynter, Alex Mahadevan interviewed Justin Peden, a 20-year-old journalism student in Alabama who has been using a Twitter account called The Intel Crab to share open-source intelligence and reporting about violence in Ukraine for the past several years. The account was anonymous until earlier this year, when Peden revealed his identity. The open-source intelligence or OSINT community is “the Wild West in many ways,” he told Poynter. “This is an unfortunate downside to the community, and something more relevant than ever as the war continues. I’m no longer anonymous for one simple reason. I want to be held accountable for my work, positively or negatively.”

A study of journalists in Canada found what researchers called “alarming levels of stress,” according to a news release from the Canadian Journalism Forum on Violence and Trauma. It found mental health issues at far above the average rate, with 69 percent of those surveyed saying they suffer from anxiety, 46 percent reporting depression, and 15 percent saying they had symptoms of post-traumatic stress disorder. One in ten of those surveyed said they had thought about suicide. The study was based on 1,251 survey responses from freelancers, editors, frontline reporters, video journalists, and news executives.

Independent media failed to gain wider public trust in Russia because they did not try “to reach those that we don’t know,” according to Evgeniya Dillendorf, the former UK correspondent for Novaya Gazeta. The independent Russian newspaper, whose editor Dmitry Muratov shared the 2021 Nobel Peace Prize, has since been forced to suspend publishing due to new Russian media laws that make it a crime to distribute “misinformation” about the war in Ukraine. Dillendorf told Press Gazette that while she admired the leadership at Novaya Gazeta, “there was sometimes this kind of fashion to be indifferent, to be above the fray, not to interfere [and] I think that was a great mistake.”

Rick Edmonds, a media business columnist at Poynter, said he was “wrong, wrong, wrong” when he wrote in 2019 that USA Today would likely kill its print version within two years. Edmonds said he asked Maribel Perez Wadsworth, publisher of USA Today, what the business case is for continuing to put out a print paper. “The print edition of USA Today is a profitable product,” she said, “which we will continue to publish as long as readers want it.” Paid printed copies distributed at hotels—which Edmonds says are down by more than 90 percent since 2019—remain “a nice billboard for USA Today,” Wadsworth said. “People think about it, even if they don’t take a copy.”

A new website that published leaked emails from leading proponents of Britain’s exit from the European Union is tied to Russian hackers, according to a report from Reuters that was based on interviews with a Google cybersecurity official and the former head of UK foreign intelligence. “The website – titled ‘Very English Coop d’Etat’ – says it has published private emails from former British spymaster Richard Dearlove, leading Brexit campaigner Gisela Stuart, pro-Brexit historian Robert Tombs, and other supporters of Britain’s divorce from the EU, which was finalized in January 2020,” Reuters wrote.

According to a report from the Globe and Mail, millions of students in 49 countries had their personal information sent to advertisers and data brokers after schools made the switch to online learning during the COVID-19 pandemic. The Globe says it collaborated with 12 other media organizations to get access to data from Human Rights Watch, which “alleges online education platforms actively or passively infringed upon children’s rights by collecting and sharing their personal information, such as their locations and web browsing histories.” The Globe said the investigative collaboration was co-ordinated by the Signals Network, a French-American non-profit organization that supports whistle-blowers.
