
Google’s AI search setback

The AI Overviews debacle and leaked search ranking documents tell a common story about the web's future — and it's not pretty

Casey Newton

May 28, 2024

CEO Sundar Pichai speaks at Google I/O earlier this month. (Christoph Dernbach / Getty Images)

Over the weekend, the AI Overviews that Google announced at its developer conference made international headlines, but not for the reasons the company had hoped.

Across Threads, Bluesky, and X, users encountering the company’s AI-generated summaries atop search results found over and over again that Google was hallucinating or worse.

Most famously, there was the result that suggested putting nontoxic glue in your pizza. But AI overviews also suggested putting gasoline in your spaghetti. And its sense of American history appeared deeply broken; it reported that just 17 American presidents were white, and that one was Muslim.

I was able to confirm that AI overviews were suggesting that people eat one to three rocks per day, an idea that turns out to have come from … The Onion.

The fact that many of the most viral screenshots of AI overviews were fake seemed, for once, beside the point. When Google is recommending that you should eat a rock every day, almost any search result shared on social media seems plausible enough. That’s the whole problem!

In the moment, all of this felt funnier than it did scary. But it also revealed the emptiness of Google’s new approach to search. Without any knowledge base of its own, the company’s large language model simply summarizes and regurgitates what it finds on the web according to unknown criteria — an approach Today in Tabs’ Rusty Foster accurately calls automated plagiarism.

Google blamed all this on its users, Kylie Robison reported at The Verge:

Google spokesperson Meghann Farnsworth said the mistakes came from “generally very uncommon queries, and aren’t representative of most people’s experiences.” The company has taken action against violations of its policies, she said, and is using these “isolated examples” to continue to refine the product.

On one hand, some of these queries clearly were quite uncommon. “Can I use gasoline to make spaghetti” probably did not come up during internal red-teaming exercises. The whole point of gradually rolling out big changes to search is to identify where it’s broken.

On the other hand … plenty of these queries were common enough. Asking about the race or religion of US presidents, or how to get cheese to stick to pizza, are straightforward uses of Google that the previous, non-AI-degraded version of the search engine handled just fine. The company could have chosen to roll out AI overviews in a few narrow categories. But instead it went broader, and now poor Katie Notopoulos is eating glue pizza for pageviews.

I expect that the quality of Google’s AI results will improve over time; it’s an existential issue for the company, and if it can’t make AI search work, someone else will. (The company could probably get a long way just by removing The Onion from its search engine’s news sources. In the meantime, I can report that as of today the company is no longer pushing a rock diet.)

But even then, Foster’s criticism will still stand: those “overviews” really are just slightly reworded versions of journalists’ copy, designed to give people ever fewer reasons to step outside Google’s walled garden. This is what I mean when I say that the web has entered a state of managed decline: one company has outsized influence over when and how people visit any websites at all, and it has told us it plans to gradually ratchet those visits down by continuing to answer more questions on the search engine results page.

And to the extent that it moves slowly, or occasionally pauses and temporarily reverses course, it will be because doing so benefits Google, rather than any of the sites and businesses that have come to rely on it. The company said last week that it is preparing to show ads in AI overviews, as we always knew it would.

While we wait for any of this to get better, it seems worth noting that this is arguably Google’s third significant botched launch of an AI product.

Bard, the predecessor to Google’s Gemini chatbot, debuted in February 2023. When it did, a demonstration incorrectly stated that the James Webb Space Telescope “took the very first pictures of a planet outside of our own solar system.” It did not, and the incident was one of the first prominent cases of an LLM hallucinating on a global stage.

Then in February of this year, Google’s Gemini chatbot refused to make images of white people in many cases, resulting in racially diverse Nazis and Founding Fathers. After an outcry, particularly in conservative circles, Google removed image generation from the bot.

Each of those was embarrassing in its own way. And yet — it also seems obviously worse to tell people to eat rocks or make spaghetti with gasoline. In that respect, the most important story about Google’s AI launches is that they are deteriorating over time.

When the Wall Street Journal tested the big chatbots across a wide variety of criteria, it ranked Google third, after the upstart Perplexity and OpenAI’s ChatGPT. (Anthropic’s Claude and Microsoft’s Copilot ranked fourth and fifth, respectively.)

There is still a lot we don’t know about how large language models work. There is even more we don’t know about how Google’s moves here will change the future of the internet. A web that thrived because of its openness and decentralization has now begun to wither.

On Tuesday, people who work on search engine optimization raced to read through thousands of pages of documentation regarding the company’s search engine algorithm that appear to have been accidentally published online. Google closely guards information about search ranking, both for integrity reasons (to prevent bad actors from manipulating results) and competitive ones (to maintain its edge over rivals). And so the SEO experts who got an early look at the documents are calling them a bonanza.

No one has yet fully digested the contents of the leak, and Google has not commented on the documents’ authenticity. Some of the systems referenced may no longer be operating. Assuming the documents are real, though, I was struck by the first conclusion drawn from them by Rand Fishkin, who published the first report on the leaks. Surveying the documents, he concludes that Google’s organic search rankings have come to favor large, dominant brands over everything else.

“They’ve been on an inexorable path toward exclusively ranking and sending traffic to big, powerful brands that dominate the web [over] small, independent sites and businesses,” he writes.

AI overviews, of course, are intended to work the same way: identifying the relatively few credible publishers left on the web, then compressing their collective output into a slurry that can be served up in the place where search results once appeared. The trend is away from an open web where anyone can compete and toward a world with a smaller number of big winners. For the moment, that benefits large publishers. In time, though, it may favor only one publisher: Google itself.

In that way, the story of the AI overviews debacle and the story of the search ranking leaks are the same. Each shows Google moving awkwardly toward the place it has been moving for years now. And at the moment it’s not clear what anyone can do about it.
