
How Google’s 2021 AI ethics debate foreshadowed the future

Two years ago, AI researchers published a hot-button research paper on the tech behind Bard, ChatGPT, and more.

Morning Brew


After Google debuted its new AI chatbot, Bard, something unexpected happened: The tool made a factual error in a promotional video, and Google’s market value dropped $100 billion in one day.

Criticism of the tool’s reportedly rushed debut harks back to an AI ethics controversy at Google two years ago, when the company’s own researchers warned that language-model development was moving too fast without robust, responsible AI frameworks in place.

In 2021, the technology became central to an internal-debate-turned-national-headline after members of Google’s AI ethics team, including Timnit Gebru and Margaret Mitchell, wrote a paper on the dangers of large language models (LLMs). The research paper—called “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜”—set off a complex chain of events that led to both women being fired and, eventually, the restructuring of Google’s responsible AI department. Two years later, the concerns the researchers raised are more relevant than ever.

“The Stochastic Parrots paper was pretty prescient, insofar as it definitely pointed out a lot of issues that we’re still working through now,” Alex Hanna, a former member of Google’s AI ethics team who is now director of research at the Distributed AI Research Institute founded by Gebru, told us.

Since the paper’s publication, buzz and debate about LLMs—one of the biggest AI advances in recent years—have gripped the tech industry and the business world at large. The generative AI sector raised $1.4 billion last year alone, according to PitchBook data, and that doesn’t include the two megadeals that opened this year: Microsoft’s with OpenAI and Google’s with Anthropic.

“Language technologies are becoming this measure of…AI dominance,” Hanna said. Later, she added, “[It’s] kind of a new AI arms race.”

Emily M. Bender, one of the authors of the research paper on LLMs, told us she somewhat anticipated tech companies’ big plans for the technology, but didn’t necessarily foresee its increased use in search engines. Gebru shared similar thoughts in a recent interview with the Wall Street Journal.

“What we saw, even as we were starting to write the Stochastic Parrots paper, was that all of the big actors in the space seemed to be putting a lot of resources into scaling—and…betting on, ‘All we need here is more and more scale,’” Bender told us.

For its part, Google may have predicted that increasing importance ahead of its decision to invest heavily in LLMs, but the company’s AI ethicists also predicted the potential pitfalls of development without rigorous analysis, such as biased output and misinformation.

Google isn’t alone in working through these issues. Days after Microsoft folded ChatGPT into its Bing search engine, the chatbot was criticized for toxic speech, and Microsoft has since reportedly tweaked the model in an attempt to avoid problematic prompts. The EU is reportedly exploring AI regulation as a way to address concerns about ChatGPT and similar technologies, and US lawmakers have raised questions.

Google declined to comment on the record for this piece.

Mitchell recalled predicting that the tech would advance quickly, but said she didn’t necessarily foresee its level of public popularity.

“I saw my role as…doing basic due diligence on a technology that Google was heavily invested in and getting more invested in,” Mitchell, who is now a researcher and chief ethics scientist at Hugging Face, told us. Later, she added, “The reason why we were pushing so hard to publish the Stochastic Parrots paper is because we saw where we were in terms of the timeline of that technology. We knew that it would be taking off that year, basically—was already starting to take off—and so the time to introduce the issues into the discussion was right then.”

For her part, Bender ultimately sees LLMs as a “fundamentally flawed” technology that, in many contexts, should never be deployed.

“A lot of the coverage talks about them as not yet ready or still underdeveloped—something that suggests that this is a path to something that would work well, and I don’t think it is,” Bender told us, adding, “There seems to be, I would say, a surprising amount of investment in this idea…and a surprising eagerness to deploy it, especially in the search context, apparently without doing the testing that would show it’s fundamentally flawed.”
