ChatGPT likes to fight. For military AI researchers, that’s a problem

Chatbots appear to make better bureaucrats than battlefield generals—at least for now.

May 14, 2024



Even though they’re still fairly new, AI-powered large language model chatbots have taken on many roles: customer service rep, traffic-ticket negotiator, maybe even wedding vow author. But there’s one big job AI isn’t ready for: five-star general.

Jacquelyn Schneider, a fellow at the Hoover Institution public policy think tank, runs war games that simulate conflict scenarios, and she’s been documenting the results as AI plays alongside human participants. In a policy brief released this month, she and several co-authors observed that three of OpenAI’s ChatGPT versions as well as models by Anthropic and Meta all “show forms of escalation and difficult-to-predict escalation patterns that lead to greater conflict and, in some cases, the use of nuclear weapons.”

The researchers found that these unpredictable behaviors can vary widely depending on the bot and its version. For example, earlier versions of ChatGPT “are more likely to think of winning in a really kind of absolutist way,” Schneider told Tech Brew.

The results matter because the military is exploring how it can leverage AI capabilities, according to Foreign Affairs. The US Department of Defense stood up the Chief Digital and Artificial Intelligence Office in 2022 to begin that exploration and laid out an adoption strategy for AI tech in late 2023.

And some AI technology is already showing up in military contexts in the real world: Israel is reportedly using AI to locate targets in Gaza, and the US has reportedly used AI to identify targets in the Middle East.

Schneider’s research, however, suggests the tech is far from being reliable in a high-stakes environment.

Although the bots’ preference for escalation evened out with some fine-tuning, there are other reasons to be concerned about using AI to execute high-level combat decisions. Given that the robots aren’t sentient (at least not yet!), they can only guess at what a human might do.

“These systems are never fully internalizing these ethical choices. And so they are at best mimicking the way humans make choices. And because of that, you can never be sure that they will make a truly ethical choice,” Schneider said. “They’re not actually reasoning through ethical conundrums. They are mimicking how they think humans do it.”

Automation advantages

So when is AI actually a good fit for military use? At the moment, it appears to be better suited for offices than battlefields.

Schneider noted that personnel decisions such as rank advancement and promotions, training, logistics management, and travel budgeting are all good examples of low-level, routine decisions that could benefit from automation.

“This is where I think you see a real use for artificial intelligence…to be able to process what is a pretty remarkable amount of language-based military tasks that are pretty rote, but require a huge amount of labor,” Schneider said.

AI and automation can also be appropriate during some instances when commanders are concerned they’ll lose contact with troops in the field, she said. In that case, AI could generate instructions to help units carry on in the event of a communication failure.

Think of it as the modern solution to the “quintessential submarine problem,” she said: Vessels were likely to lose contact with superiors at some point, so they carried hard copies of backup instructions on board.

Gut check

Schneider sees another key task for AI in her war games: helping diplomats think outside the box, identify blind spots, and assess what an adversary might do. In games where a chatbot plays a foreign power, its unpredictability may actually be an asset.

“In some ways, I’m more interested in what my outlier players do, because Putin is not like the majority of my players,” she said.

Schneider is currently building a war-game model in which AI can function as a more accurate “red team” player, mimicking behaviors of complex US adversaries, like the Chinese or Russian governments. It can be difficult to find human players with enough cultural understanding to realistically embody these roles, Schneider said, and AI can point out when “there’s some significant thing that we’re missing, that we didn’t see before.”

“There’s definitely this idea of it becoming complementary, where it doesn’t substitute for human decision-making, but instead provides humans with options, or says, here’s the way the machine views the situation,” Schneider said. “It becomes like a check.”

So far, each military branch has a different stance on studying and potentially deploying AI, and it remains to be seen how AI will factor into future military operations.

In the meantime, the policy brief urges decision-makers to “proceed with utmost caution when confronted with proposals to use LLMs and LLM-based agents in military and foreign policy decision-making.”

“I think there is a general desire for technological silver bullets, and then to use technology in what is kind of viewed as the most impactful way—which is generally thinking about how we use weapons and force,” Schneider said. “And yet, it is the ways in which artificial intelligence and data solve these larger, systemic tasks that potentially is more revolutionary.”
