How CrowdTangle predicted the future

Meta is killing off a pioneering tool for platform transparency — but the movement it kickstarted is just beginning


Eight years after buying the data analysis tool CrowdTangle, and four years after it began exasperating the executives responsible for it, Meta said Thursday that it would shut down the tool on August 14. 

CrowdTangle, a data analysis tool that allowed researchers, journalists, and other members of civil society to understand the spread of content on Meta’s platforms, was never among the company’s most popular products. But it was perhaps the first major tool offered by a big social network to let the public analyze trends on the platform in real time.

With CrowdTangle now set to disappear — just 12 weeks before the US presidential election — Meta is encouraging researchers to instead use its Meta Content Library. But while the content library improves on CrowdTangle in some key ways, it’s also much less broadly available than its predecessor was. 

The question now is whether the visibility that CrowdTangle once brought to Facebook can be preserved and expanded at Meta and other platforms — or whether, given all the unwelcome scrutiny it brought, platforms will work to avoid the real-time transparency it offered.

I. The history

CrowdTangle was founded in 2012 by Brandon Silverman and Matt Garmur. Their original idea had been to build tools for online activists on Facebook. When that didn’t work out, they turned their attention to building tools for publishers. Their first successful product was a dashboard that showed you which posts on Facebook pages were getting the most engagement.

It was the right tool at the right time. Facebook was then arguably the most important company in journalism, funneling millions of pageviews to a wide array of digital publishers. By using CrowdTangle, publishers could understand what worked on the platform instantly, and could adjust their output accordingly. 

This did not necessarily lead to better journalism. Instead, it led to a kind of spreading sameness around the web, as publishers began using CrowdTangle dashboards to identify their competitors’ posts and quickly fire off their own versions. (If you wonder why every single publisher posted Game of Thrones trailers in the 2010s, no matter what their websites were nominally about, CrowdTangle is a big reason why.)

In November 2016, days after Donald Trump had won the presidency, Facebook bought CrowdTangle for an undisclosed sum. The move came arguably at the peak of Facebook’s interest in journalism. Days earlier, damning stories about the platform’s promotion of election misinformation began to batter the company, sparking a yearslong backlash that only recently has begun to wane. Facebook began to reduce the distribution of news in its apps, kicking off a long divorce that culminated last month in the announcement that the company would deprecate its news tab in the United States and elsewhere and would no longer make commercial deals with publishers.

Along the way, though, CrowdTangle had come to serve a very different purpose than its founders had originally intended. It turned out that a tool that shows which stories are spreading most quickly on the platform is useful for more than just seeing which low-effort stories to copy. It’s also useful for researchers and journalists tracking the spread of false narratives and other potentially harmful content. 

In 2020, New York Times columnist Kevin Roose used CrowdTangle data to create a bot that each day tweeted the stories getting the most engagement on public pages in the United States. (Roose is my friend and we now co-host the Hard Fork podcast together.) The bot found that, contrary to the then-popular narrative among conservatives that Facebook worked to suppress their reach, the most popular posts on the platform were often right-leaning.

Facebook executives hated this bot. Advertisers and policymakers were calling them up asking about Roose’s tweets. Executives tried to explain that the bot measured only engagement. The fairer way to measure the spread of content on Facebook, they said, is reach: how many people view a post, rather than how many share or comment on it. When you measure by reach, executives said, the news posts that most people on Facebook see are much more centrist.

That might be true, but Facebook did not regularly make that data public. In any case, it put Facebook in the difficult position of telling journalists and policymakers to ignore the data generated by its own tool. Suddenly, CrowdTangle had become perhaps the weirdest product in Facebook’s history: beloved by its users, disavowed by its owners. 

Something had to give. And indeed, in the summer of 2021, Meta dissolved the team that ran CrowdTangle. The following year, it stopped accepting new users.

"CrowdTangle’s only sin was being too useful," Roose told me today. "Without journalists having access to this data, the public will have a poorer understanding of what happens on Meta’s platforms — which is exactly why Meta executives decided to kill it."

II. The library

Meta might have simply shut down CrowdTangle in 2022, had it not been for a regulatory wrinkle: the European Union’s Digital Services Act. Article 40 of the act requires very large platforms and search engines to share publicly available data with researchers and nonprofit groups, so long as it contributes to “the detection, identification and understanding of systemic risks in the Union.”

Meta is considered a very large platform under the DSA. And in November it unveiled the Meta Content Library, which is designed to fulfill its obligations under European law.

Nick Clegg, Meta’s president of global affairs, said in an interview this week that the Meta Content Library is a better tool for researchers than CrowdTangle in almost every way. For starters, it includes data about reach, which he said offers a better picture of what content on the platform is most popular. It also includes access to types of content that were never available in CrowdTangle, including comments and the Reels short-form video format. 

“Of course I understand that people who are wedded to the use of that tool will maybe shed a tear that it's been deprecated,” Clegg said. “But it's being more than replaced, actually. It's being entirely outpaced by the new tools that we're making available: the Meta Content Library — the UI and the API — which is just a much, much more robust and much more sophisticated tool to do what I think people reasonably expect us to do, which is to allow researchers to really … lift the hood and really look into the bowels of our system.”

Access to the library is governed by the independent Inter-university Consortium for Political and Social Research at the University of Michigan (ICPSR). At the moment, the number of researchers with access to the library is in the hundreds. But Clegg said he hopes that figure grows considerably over time. By the end of this month, he said, Meta hopes to have 1,000 fact-checkers using the library. 

At the same time, aside from fact-checkers, journalists won’t have direct access to the library. If they want to perform analysis on Facebook data in the future, they’ll need to partner with a nonprofit group or research group that does have it.

CrowdTangle founder Silverman told me that it is likely more important for civil society to have access to real-time platform data than for academic researchers to have it. The reason, he said, is that academic research takes so long. Meta granted some researchers access to data related to the 2020 US presidential election for study; the first papers on the subject weren’t published until July 2023.

“The main value of systems like this is providing a kind of observability function for civil society,” Silverman said in an interview. “The academic stuff is just too slow.”

III. The future

The risk of Meta’s approach to transparency with the library is that while the data is much better, access to it is much more limited, and any insights drawn from it will take much longer to arrive. That would blunt the benefit of any improvements within the library itself, Silverman said. 

Meanwhile, researchers who have used both CrowdTangle and the content library say that they have different strengths and weaknesses. 

Naomi Shiffman, head of data and implementation at the Oversight Board, told me that the library “has the potential to be better than CrowdTangle” over time. That’s important for the board, Meta’s independent, quasi-judicial branch, which has the final word on what posts stay up or come down and which advises the company on policy matters. When Meta won’t give the board the data it requests, the board often uses CrowdTangle as a last resort, as we reported last month.

Still, she said, “there are areas where CrowdTangle is still significantly better – Instagram coverage is currently more comprehensive in CrowdTangle than the content library, and CrowdTangle lets you search phrases, not just individual keywords.” 

CrowdTangle also helps the board understand how quickly a post is going viral by showing how much engagement it gets in 15-minute increments. That lets researchers compare posts by how quickly they spread as opposed to their total engagement, she said.
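To make the difference between speed and total engagement concrete, here is a minimal sketch in Python. It assumes each post comes with a list of cumulative engagement counts sampled every 15 minutes. The post names, the numbers, and the helper functions are all hypothetical illustrations, not CrowdTangle's actual data format or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Post:
    name: str
    # Hypothetical cumulative engagement counts, one sample per 15 minutes.
    cumulative: List[int]


def interval_gains(cumulative: List[int]) -> List[int]:
    """Engagement gained within each 15-minute window."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def peak_velocity(post: Post) -> int:
    """Largest engagement gain seen in any single 15-minute window."""
    gains = interval_gains(post.cumulative)
    return max(gains) if gains else 0


def total_engagement(post: Post) -> int:
    """Final cumulative engagement count."""
    return post.cumulative[-1] if post.cumulative else 0


# Made-up example data: one post grows steadily, the other spikes quickly.
posts = [
    Post("slow_burn", [0, 100, 200, 300, 400, 500, 600]),
    Post("fast_spike", [0, 20, 450, 480, 490]),
]

by_total = sorted(posts, key=total_engagement, reverse=True)
by_velocity = sorted(posts, key=peak_velocity, reverse=True)

print([p.name for p in by_total])     # ranked by total engagement
print([p.name for p in by_velocity])  # ranked by fastest 15-minute gain
```

In this made-up data, the steadily growing post ranks first on total engagement, while the post that spiked ranks first on speed. That gap is the kind of distinction the 15-minute view is meant to surface.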

“We’ve heard their team is working on improving Instagram coverage and incorporating phrase search as a product feature, but until that happens, it will be a challenge to pull the right data while operating within the limits of the tool,” Shiffman said. 

Despite those challenges, Silverman told me that he remains optimistic about the future of platform transparency. Before the DSA, we had to rely on companies volunteering to give researchers access. Today, they must do so by law. The result is that researchers now have a legal right to request data from platforms that have never offered it before, including Google, YouTube, LinkedIn, and Snapchat. 

“What we really wanted was also transparency across the entire industry,” Silverman said. “I just don’t think we can rely on volunteer efforts in this space.” 

It might have been nice to live in a world where both CrowdTangle and the content library, with their various strengths and weaknesses, were available. But Silverman said he was content knowing that the idea CrowdTangle came to stand for — that there should be some level of public access to these private repositories of speech — had since been enshrined into law. 

“In the end, it’s a trade I’m actually happy with,” he said.  

Correction, 3/15: This post originally said there were about 100 researchers using the Meta content library. The number is actually several hundred.


More on CrowdTangle: Silverman offers more thoughts on the service on his Substack.


How to apply for research access to very large online platforms under the DSA: Just follow the instructions at these links!




On the podcast this week: Kevin and I talk through the politics of a TikTok ban. Then, a critique of Kate Middleton's Photoshop skills. And finally, the Times' Kashmir Hill joins us to discuss how our cars started snitching on us.





Governing

How growing up in a world of smartphones negatively impacts human development. (Jonathan Haidt / The Atlantic)






Talk to us

Send us tips, comments, questions, and old CrowdTangle dashboards: casey@platformer.news and zoe@platformer.news.