Siri pioneer says AI push could help Apple’s digital assistant catch up

Babak Hodjat, now Cognizant’s CTO of AI, also talked about why he thinks multi-agent architecture is the future for generative AI.

Babak Hodjat

Siri is getting perhaps its biggest makeover ever, courtesy of ChatGPT and other generative AI systems.

Babak Hodjat, the inventor behind the NLP tech that paved the way for Siri, said it’s about time. Hodjat contends that Siri had fallen behind some of its rivals like Alexa but that Apple’s recent rollout of its big AI play has improved its position.

Now the CTO of AI at IT and consulting firm Cognizant, Hodjat oversees a new San Francisco-based generative AI lab born out of the firm’s recent pledge to invest $1 billion in the budding technology. Hodjat said the lab aims to bridge the gap between academic research and practical enterprise use cases.

These days, Hodjat can’t stop talking about multi-agent architecture—that is, constellations of task-specific AI models that can coordinate as part of a bigger network. While LLM agents are having a moment right now, the general idea is not new; Siri itself was born out of a multi-agent architecture system, and Hodjat has been doing work around the concept for decades.

We spoke with Hodjat about what he thinks of Apple’s big AI play, why we’ve come full circle on agents, and whether we’re headed for an AI winter.

This conversation, the first of two parts, has been edited for length and clarity. (Part two is here.)

What do you think of the evolution of Siri and some of these other voice chatbots? And what is it like to see that, given your early work on that technology?

Well, Siri was well ahead of its time and then fell well behind its time. So I was very happy with the announcement that Apple made…with regards to how they’re going to infuse it with generative AI and their work with OpenAI. I think that will bring it back up to date.

But what’s interesting about Siri is when I was involved—I was the main inventor of the natural language technology behind Siri—it was agent-based. And what’s amazing is that the fact that it was agent-based was part of the reason why it was able to be so robust and extensible back in the time…And in some ways, AI is moving back into an agent-based world. Now, when you think about—and I don’t know this for sure—but when you think about how Apple is talking about running large language models on your phone and then having, you know, larger GenAI-based systems on the server side, I jump to the conclusion that there might be this give and take, this discussion between different LLMs representing different functionality, ranging from on the phone and then going on to the server. And that’s an agent-based system. So anyway, that excites me these days.

How did Siri fall behind?

You know how it is, right? Like technology ends up in a large company. It ends up having many different owners. And I think that’s kind of what happened. And they were slow to update it; it even fell behind things like Alexa. Like, if you play around with Alexa, you already feel like it understands you better, has more functionality, is snappier, even better text-to-speech. And I think Siri fell behind from that perspective. Why? I don’t know, it might be a number of different reasons, but of course, the technology, I think, could have been much better already.

With generative AI, it was very clear to those of us in the field when, for example, GPT-2 came out [in early 2019], that there’s something powerful here. We didn’t quite expect it to be as powerful as it ended up, but it is a language-centric system. Like, it’s like natural language in and functionality out, or natural language in and some emergent behavior. So they should have—and all big companies should have—taken it seriously and pulled it into, I mean, the obvious use case is sort of some sort of a chatbot interface. But then, you think about Apple, they’re never the first. They’re usually the best. Quality is really good, but they’re not usually the first. And I think that what they announced last week is the right way to go about it. So hopefully, they’ll catch up.

Keep up with the innovative tech transforming business

Tech Brew keeps business leaders up-to-date on the latest innovations, automation advances, policy shifts, and more, so they can make informed decisions about tech.

What is Apple doing right?

I think the fact that they’re actually viewing various different user and interface functionality as something that could be augmented using generative AI. They’ve looked at specific use cases; again, that kind of agent-based perspective. I’m not saying they are agent-based, I’m just saying that perspective is going to help them. Like, oh, I can use a small language model that’s running on the device that I can talk to in natural language to create a smiley face. I mean, that might sound really small, but it is one of many things that you can now do using large language models, and that collection suddenly elevates your user experience. And Apple being Apple, that’s what they care about.

And then harmonizing those little agents to work together, it’s…my kind of conjecture that maybe that’s where they’re going with this, but I think that’s ultimately the operating system of the future. And so if they’re on that track, I think they’re on the right track, versus, “Oh, we’re going to pump a ton of money and create a yet larger language model” akin to, I don’t know, Gemini, or GPT-4, or whatever else that is even more powerful. Do you really need to do that? In many use cases, you don’t. Apple, again being Apple, they’ve gone with the best: Right now, GPT-4 is more or less the standard bearer. I mean, that might change overnight. Who knows? Gemini is very close. There might be other models that come in and overtake them, but, acknowledging the fact that they’re behind when it comes to large language models and actually partnering with someone like OpenAI and bringing that in, I think that’s a good step for them.

You said you’ve been working in AI since the late ’80s. Do you think the moment we are in now is unique?

We’ve gone through several AI springs and winters. AI, by its nature, just draws expectations. Like every time an AI system comes out, everybody’s like, “Oh, it’s got to be not just smarter than a human, smarter than all humans, maybe not even that, like, smarter than God.” It needs to know what’s going to happen in the future. And so you’re faced with expectations that are never going to be met, and that kind of leads to, “Oh, a whole bunch of investment goes in, disillusionment, we go to a winter.” This time around, it feels different because it’s so widely useful. So if we can get beyond the fact that, no, this is not AGI and it’s not as powerful as God, that there’s still so much that it can do, and that we’re just scratching the surface, I think it’ll stick around.

But yeah, no, there’s nothing new about that. One other thing I would say really briefly is that in the ’90s, people started looking at agent-based systems for a very, very different reason. Back then, AI wasn’t as powerful. I mean, that’s the reason I used it for the Siri natural language systems. AI wasn’t that powerful, so you had to simplify the domain within which an AI system was operating. And then now you had a lot of these idiot-savant AI agents that you could have work together to enable something like Siri. That was the reason back then. Now it’s the inverse of that; the AI systems are so powerful that one large language model could be many, many, many different personas, could do many, many different things. You have to kind of restrict it. So the agent-based perspective now is because you want to tell it: Here’s what you need to do for me, and here’s how you do it, and here’s the tools that you need to use, and don’t do other things, just do this. And then you plug it in as sort of a knowledge worker in a box, in a workflow where these systems work together to get something done. So in some kind of strange, interesting way, we’ve come full circle with agent-based.
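
For readers who want a concrete picture of the pattern Hodjat describes, here is a minimal Python sketch: each agent gets one instruction and an explicit list of permitted tools, and a workflow passes work from one agent to the next. It is an illustration only, not how Siri or Apple's systems are built. The `Agent` class, `run_workflow`, and the two tool functions are hypothetical names invented for this example, and the tools return canned values where a real system would call a language model or an external service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Agent:
    """A narrowly scoped worker: one instruction and an explicit tool allowlist."""
    name: str
    instruction: str
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def use_tool(self, tool_name: str, payload: str) -> str:
        # Refuse anything outside the allowlist ("don't do other things, just do this").
        if tool_name not in self.tools:
            raise PermissionError(f"{self.name} may not use tool '{tool_name}'")
        return self.tools[tool_name](payload)


def run_workflow(steps: List[Tuple[Agent, str]], text: str) -> str:
    """Feed the output of each (agent, tool_name) step into the next step."""
    for agent, tool_name in steps:
        text = agent.use_tool(tool_name, text)
    return text


# Hypothetical stand-ins for model calls; the data here is canned for illustration.
def extract_city(request: str) -> str:
    known = ["Paris", "Tokyo", "Cairo"]
    return next((c for c in known if c.lower() in request.lower()), "unknown")


def lookup_weather(city: str) -> str:
    canned = {"Paris": "rainy", "Tokyo": "sunny", "Cairo": "hot"}
    return f"{city}: {canned.get(city, 'no data')}"


parser = Agent("parser", "Extract the city from the request.",
               {"extract_city": extract_city})
weather = Agent("weather", "Report the weather for a city.",
                {"lookup_weather": lookup_weather})

result = run_workflow(
    [(parser, "extract_city"), (weather, "lookup_weather")],
    "What's the weather in Tokyo today?",
)
print(result)
```

The allowlist is the point of the sketch: the parser agent cannot look up weather and the weather agent cannot parse requests, so each one stays a single-purpose "knowledge worker in a box" while the workflow supplies the coordination.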
