How Microsoft and Google use AI red teams to “stress test” their systems

Since 2019, some Big Tech firms have implemented AI red teams to reveal shortcomings, biases, and security flaws.

article cover — *Dianna “Mick” McDougall/Getty Images*

June 14, 2022



It was a snowy day in February, and Amanda Minnich was attacking an AI system.

With one block of code—and no other details—she needed to hack into one of the most complex machine learning systems operated by a Microsoft partner. Minnich tried a few different approaches, first attempting to use a single image to confuse the system, then trying it with multiple images. Finally, she made a last-ditch effort to hoodwink the AI by replaying a sequence of images on a constant loop—Minnich described it as being like Ocean’s Eleven, where the robbers fool security by replacing the live feed with older security-camera footage.

It worked: She was in, with control over the AI system.

Microsoft congratulated Minnich—breaking into AI systems was her job, after all. As a member of Microsoft’s AI “red team,” Minnich helps stress-test the company’s ML systems—the models, the training data that fuels them, and the software that helps them operate.

“Red teams” are relatively new to AI. The term can be traced back to 1960s military simulations used by the Department of Defense and is now largely used in cybersecurity, where internal IT teams are tasked with thinking like adversaries to uncover system vulnerabilities. But since 2019, Big Tech companies like Microsoft, Meta, and Google have implemented versions of AI red teams to reveal shortcomings, bias, and security flaws in their machine learning systems.

It’s part of a larger push for AI ethics and governance in recent years, in corporate boardrooms and on Capitol Hill. Gartner picked “smarter, more responsible, scalable AI” as 2021’s top trend in data and analytics, and there are industry moves to back it up: In 2021, Twitter introduced an ML Ethics, Transparency, and Accountability team that led the first-ever “bug bounty” for AI bias; last year, Google promised to double its AI ethics research staff and increase funding; and in February, senators introduced a new version of the Algorithmic Accountability Act.

Get red-y

Tech companies differ on how they frame their AI red team strategy.

Google’s team, for instance, starts with assessing a product’s use case, intended and unintended users, and fairness concerns, according to Anne Peckham, program manager for Google’s ProFair (proactive algorithmic product fairness testing) program, which is on the company’s central Responsible Innovation team. Then they create a methodology to stress-test the product in terms of “sensitive characteristic[s]” like a user’s religion, gender, or sexual orientation.

In testing one of Google’s image-classification ML models, for example, the ProFair team would test to see if the ML model performed equally across different skin tones.

“Then the team would collect data to perform the test,” Peckham said. “So this might be creating an adversarial data set for testing; we might have a live kind of simulation room to try to break the product in certain ways in real-time, kind of all playing off each other, saying, ‘I’m seeing this potentially concerning result—can someone else try it in this other way?’ to try to learn from each other as we’re going about it. And then the final step would be really evaluating the results that we get and working with the submitting team on mitigation.”

Those recommendations could look like adding more representative training data, adding a model card for explaining intended use and the fairness-testing process, offering clear consumer controls, or recommending a focus group of users to help better understand potential fairness concerns.
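As a rough illustration of the kind of check described above, where a team asks whether a classifier performs equally across groups such as skin tones, the minimal Python sketch below computes accuracy per group and the gap between the best- and worst-served groups. It is not Google's ProFair methodology: the group names, labels, records, and function names are all hypothetical, and a real evaluation would rely on a carefully collected test set and more than one metric.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy per group.

    `records` is an iterable of (group, true_label, predicted_label)
    tuples. The group names are hypothetical placeholders.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        if truth == predicted:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def max_accuracy_gap(per_group):
    """Difference between the best and worst per-group accuracy."""
    values = list(per_group.values())
    return max(values) - min(values)


# Hypothetical evaluation results, invented for illustration only.
records = [
    ("tone_a", "face", "face"),
    ("tone_a", "no_face", "no_face"),
    ("tone_a", "face", "face"),
    ("tone_a", "no_face", "face"),
    ("tone_b", "face", "no_face"),
    ("tone_b", "no_face", "no_face"),
    ("tone_b", "face", "no_face"),
    ("tone_b", "no_face", "no_face"),
]

per_group = accuracy_by_group(records)
gap = max_accuracy_gap(per_group)
print(per_group)
print(f"Largest accuracy gap between groups: {gap:.2f}")
```

A large gap on a test like this would be a signal to investigate further, for instance by revisiting how representative the training data is, rather than a verdict on its own.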

Both Google and Microsoft typically work on red-team projects for anywhere between one-and-a-half and three months, according to Peckham and Ram Shankar Siva Kumar, a “data cowboy” on Microsoft’s AI red team. Both companies declined to share what percentage of projects are escalated to this level, or what criteria lead to it.

For its part, Microsoft’s team is governed by Microsoft’s AI Red Team Board—composed of executives across different sections of the company—and centers on top-to-bottom AI risk assessments that attack a case from multiple angles, Siva Kumar said.

A security-focused team member might look into open-source software vulnerabilities, Siva Kumar said, while an adversarial researcher may focus on how likely the model is to “spit out private information,” whether it’s possible to “steal the underlying ML model,” and whether it’s possible for someone to “reconstruct the private training data” fueling a model. Then software developers and engineers build toolkits to help Microsoft guard against common attacks and vulnerabilities.

“When you think about holistic AI red-teaming, it is by whatever means of failure that we can entice [from] a system is what we want to look at,” Siva Kumar said. “So it’s across a spectrum: It’s not just the ML model. It’s the underlying training data, the underlying software, but also where the ML model resides in the context of a bigger system.”

“This is literally like an ocean of new vulnerabilities,” he said.
