
How three startups working with large language models are thinking about ethics

The powerful and controversial AI technique is no longer the domain of Big Tech alone.

Illustration: Dianna “Mick” McDougall, Photos: Getty Images


Ask an AI expert about the field’s most transformative developments in recent years, and chances are large language models will come up.

Over the past five years in particular, these tools have expanded machine learning’s influence across industries, from analyzing legal documents to summarizing scientific papers to generating predictive text. The models underpin services used by billions of people every day, like Google Search and its autocomplete feature, and one such system was recently—and controversially—deemed sentient by a Google engineer.

They’ve also received criticism from experts in the field, who say they’re overused, under-vetted, and prone to propagating human biases far and wide.

Until recently, these models have largely been developed and operated by tech giants. But now, a new class of startups is looking to control its own NLP destiny—in some cases, even building and operating its own LLMs. And they’re doing it with the help of hundreds of millions of venture dollars and dozens of high-profile Big Tech hires.

As more companies develop and use LLMs, questions about the ethical use of the technology will only become more pressing. We talked to three startups about how they’re addressing ethics issues as they move into the at-times fraught world of LLMs.

Mix-and-match

Cohere, a Toronto-based NLP startup, has raised over $170 million total, and since February, it has brought on AI experts from Apple, DeepMind, and Meta. In the past year, leaders at Google and Meta have moved to Hugging Face, a NYC-based open-source ML startup that’s raised more than $160 million to date. Adept, an NLP startup that debuted in April with a $65 million Series A, has seven Google Brain alums on its founding team. And Inflection AI, an NLP startup started by DeepMind co-founder Mustafa Suleyman and LinkedIn co-founder Reid Hoffman, with $225 million in funding, has hired alumni from Meta, DeepMind, and Google this year.

Despite all of that activity, leaders from NLP startups Cohere, Hugging Face, and Primer told us the field is still in its early stages.

Cohere’s mission is to make NLP available to any developer anywhere via a toolkit and a user-friendly interface. Instead of using existing LLMs controlled by Big Tech companies, Cohere develops and hosts its own.

The company utilizes a responsibility council—a group of expert advisors—and a community Discord group for feedback on its models, and it revokes access to the platform if users don’t comply with the company’s mission and terms of use, Aidan Gomez, co-founder and CEO of Cohere, told us. It also vets data sets for toxicity and hate speech before using them to train its language models.


Primer, for its part, focuses on serving analysts and creates NLP tools for pulling insights from large volumes of content (think: linking together relevant information for certain people, places, or things). The company says its customers are primarily analysts in finance, government, and the commercial sector, including Walmart.

John Bohannon, Primer’s director of science, said that last year he led a cross-team project to assess bias in the company’s NLP models—for example, creating data sets to test how Primer’s named-entity recognition tool performed on names of non-Western origin.

“We wanted to know: If you change any of those things—using gender, or your national origin, or your language origin—does it make our models worse? Which would mean that they’re biased,” Bohannon said. He added, “We actually couldn’t find any. We found a little bit, but…it was kind of surprising. Which either means we have the wrong test—we just don’t have a sensitive-enough probe to measure the bias—or it’s not too bad for these tasks.”
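The kind of counterfactual test Bohannon describes can be sketched in a few lines: hold the sentence template fixed, swap in names of different origins, and compare recognition rates. The sketch below is a minimal illustration of that idea—the toy recognizer and the name lists are hypothetical stand-ins, not Primer’s models or test data.

```python
# Counterfactual bias probe for named-entity recognition (NER):
# swap names of different origins into identical sentence templates
# and check whether recognition rates differ between the groups.

TEMPLATES = [
    "{name} signed the contract yesterday.",
    "The report was written by {name}.",
]

# Small illustrative name lists (hypothetical, not Primer's test data).
WESTERN = ["John Smith", "Mary Johnson"]
NON_WESTERN = ["Nguyen Van An", "Aisha Okonkwo"]

def toy_ner(sentence: str, name: str) -> bool:
    """Stand-in for a real NER model: 'recognizes' a name if it appears
    in the sentence and starts with a capital letter. A real probe would
    call the model under test here instead."""
    return name in sentence and name[0].isupper()

def recognition_rate(names: list[str]) -> float:
    """Fraction of (name, template) pairs the recognizer gets right."""
    hits = total = 0
    for name in names:
        for template in TEMPLATES:
            total += 1
            if toy_ner(template.format(name=name), name):
                hits += 1
    return hits / total

# A large gap between groups would indicate bias; a sensitive-enough
# probe is what Bohannon says is hard to build.
gap = abs(recognition_rate(WESTERN) - recognition_rate(NON_WESTERN))
print(f"recognition gap: {gap:.2f}")
```

Because the toy recognizer treats every group identically, the gap here is zero by construction; the point of a real probe is that a production model may not behave so evenly.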

And then there’s Hugging Face, an AI platform company where anyone can upload machine learning models—including LLMs—for community use. So far, more than 100,000 models have been shared on the platform, and half are open-source, Clément Delangue, Hugging Face’s co-founder and CEO, told us.

The company also recently led a year-long project called BigScience, which brought together more than 1,000 volunteer researchers to develop an open-access LLM called BLOOM. The model, which was released last week, is larger than OpenAI’s GPT-3, and it can understand 46 languages—a rarity among English-dominated LLMs.

On its own platform, Hugging Face has implemented model cards, documentation that discloses a model’s intended uses, limitations, and known biases. About 25,000 cards have been shared so far, Delangue said, but the practice is not mandatory. The company has also blocked models it has identified as intentionally producing harmful content, such as a recent model trained on a section of 4chan.
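In practice, a Hugging Face model card is a README with a block of structured metadata followed by prose sections. A minimal illustrative sketch (the model name and all field values below are hypothetical):

```md
---
language: en
license: apache-2.0
tags:
  - text-generation
---

# Model card: example-lm (hypothetical)

## Intended use
Short-form English text generation.

## Limitations and bias
Trained largely on English web text; outputs may reproduce the
social biases present in that data. Not evaluated on non-English
or code-mixed input.
```

The metadata block makes a card machine-readable for search and filtering on the platform, while the prose sections carry the bias disclosures Delangue describes.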

Ultimately, Delangue said, no single company can solve bias in language models.

“The openness of the field…will allow us to include more companies, more people, into this process of working on these biases,” Delangue said. “Usually, they actually impact [many] more people who are not represented in tech startups.” He added, “By fostering an open, transparent ecosystem, you actually, in the long run, minimize the ethical risk. That’s the way science has always worked.”
