Summary: If you didn’t get the memo yet, 2025 is going to be the year of AI agents. Vendors aren’t afraid to make big promises, but does the technology have what it takes? And what if it doesn’t live up to expectations?
If 2024 was the year of RAG, 2025 is the year of AI agents. Or so everyone would have you believe.
Salesforce is all-in on Agentforce, Microsoft’s Copilot Chat is giving every employee the ability to build custom AI agents, and Atlassian launched Rovo Agents, “virtual teammates that tackle specialized tasks with you”.
While vendors left and right make big promises of autonomous systems that can reason, plan, and execute, C-suite executives face the daunting task of distinguishing real innovation from marketing hyperbole, something that is virtually impossible without intimate knowledge of what the current technology is and isn’t capable of.
As someone who has been in the business of building chatbots for quite some time, I find it an interesting sight to behold.
What is an AI agent?
As of late, the term AI agent has undergone some inflation. Every chatbot that leverages an LLM in some way, shape, or form is being marketed as one.
Historically, the standards for what constituted an agent were considerably higher. In Intelligent Agents: Theory and Practice, published in 1995, authors Michael Wooldridge and Nick Jennings write:
“Perhaps the most general way in which the term agent is used is to denote a hardware or (more usually) software-based computer system that enjoys the following properties:
autonomy: agents operate without the direct intervention of humans or others, and have some kind of control over their actions and internal state;
social ability: agents interact with other agents (and possibly humans) via some kind of agent-communication language;
reactivity: agents perceive their environment, (which may be the physical world, a user via a graphical user interface, a collection of other agents, the Internet, or perhaps all of these combined), and respond in a timely fashion to changes that occur in it;
pro-activeness: agents do not simply act in response to their environment, they are able to exhibit goal-directed behavior by taking the initiative.”
I’d say today’s AI agents meet some of the criteria, but not all. The chatbots you get when you purchase Copilot licenses for your entire team lack autonomy and the ability to pursue long-term goals.
And while attempts have been made to give AI agents the tools and the freedom to operate autonomously (Anthropic’s Computer Use for Claude, Cursor, and Devin are examples of AI agents in the true sense of the word), they remain brittle.
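For the software-minded reader, it can help to write that 1995 definition down as an interface. Here is a minimal Python sketch; the class and method names are my own invention, not taken from the paper or from any framework:

```python
from abc import ABC, abstractmethod

class Agent(ABC):
    """Wooldridge and Jennings' four properties, expressed as an interface.
    Purely illustrative; no real product or framework defines agents this way."""

    @abstractmethod
    def act(self) -> None:
        """Autonomy: decide and act without direct human intervention."""

    @abstractmethod
    def communicate(self, other: "Agent", message: str) -> None:
        """Social ability: interact with other agents in a shared language."""

    @abstractmethod
    def perceive(self, event: str) -> None:
        """Reactivity: observe the environment and respond in a timely fashion."""

    @abstractmethod
    def pursue_goal(self, goal: str) -> None:
        """Pro-activeness: take the initiative toward long-term goals."""
```

Seen this way, most of today’s “agents” implement perceive and little else: they respond when prompted, but they don’t act, coordinate, or pursue goals on their own.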
The agentic AI spectrum
This is not to say they aren’t useful. They are, when applied to the right problems.
It has recently been argued that it is best to view agentic AI as a spectrum, and I agree. To illustrate, I crafted this terribly simplified graph:
The more agentic your system is, the less control you have over its output. This is why LLMs are being bootstrapped onto orchestration layers, and why we’re seeing the emergence of multi-agent architectures (which shouldn’t be confused with a chatbot with multiple personalities).
While greater agency buys you flexibility, there’s no such thing as a free lunch. The more agentic your system is, the longer it needs to run, the more expensive it becomes, and the more unpredictable it tends to be.
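To make the orchestration layer concrete, here is a minimal sketch of the loop most agent frameworks implement in some form: the model proposes an action, the layer executes it and feeds the observation back, and a hard step budget keeps the run bounded. The call_llm stub, the tool set, and the message format are illustrative assumptions on my part, not any vendor’s actual API:

```python
import json

MAX_STEPS = 5  # hard cap on autonomy: fewer steps = more control, lower cost

def call_llm(messages: list[dict]) -> dict:
    """Placeholder for a real model call (OpenAI, Anthropic, etc.).
    This stub simply 'decides' it is done, so the sketch runs end to end."""
    return {"action": "finish", "answer": "stub answer"}

TOOLS = {
    "search": lambda query: f"results for {query!r}",  # stand-in for a real search tool
    "calculate": lambda expr: str(eval(expr)),         # toy only; never eval untrusted input
}

def run_agent(goal: str) -> str:
    messages = [{"role": "user", "content": goal}]
    for _ in range(MAX_STEPS):  # the bounded loop is the orchestration layer's leash
        decision = call_llm(messages)
        if decision["action"] == "finish":
            return decision["answer"]
        result = TOOLS[decision["action"]](decision.get("input", ""))
        # feed the observation back so the model can plan its next step
        messages.append({"role": "tool", "content": json.dumps({"result": result})})
    return "step budget exhausted"  # fail closed instead of running forever

print(run_agent("What is 2 + 2?"))
```

That step budget is the dial on the spectrum: raise it and you buy more flexibility, at the price of longer runs, higher cost, and less predictable output.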
Even agents with very little agency can turn out to be unreliable. This is best demonstrated by OpenAI’s recent launch of ‘Tasks’, which allows ChatGPT to set timers and… well, see for yourself:
In other news, Apple announced it is suspending its AI-generated summaries until further notice, with a spokesperson explaining that “the Apple Intelligence features are still in their infancy”.
The need for design thinking
It goes to show that it’s hard to get this stuff right — even if you have the brightest minds in the world working for you.
In many ways, LLMs are still a solution in search of a problem. And I say this with the best of intentions. There is just so much that we haven’t figured out yet.
To be clear, the technology is groundbreaking. A new approach to building chatbots has de facto arrived, but what seems to be missing is the right mindset. A design mindset. Very few people ask themselves: what problem are we trying to solve, and do we even need an agent for that? Too often, LLMs are used to deliver features no one asked for.
This is as relevant for vendors as it is for the organizations trying to leverage AI. When chatting to a Technical Specialist at one of the big platform providers, I was surprised to hear:
“I spend a lot of time educating. It’s not easy to convince customers to focus on the user experience (…) IT can derail the whole thing. They have a tendency to think that if the solution is not complicated, it is not a good solution.”
Remember: if you don’t understand your users, you don’t know who you’re designing for. And if you don’t know who you’re designing for, you don’t know what problem you’re trying to solve.
Your users don’t care whether it’s called an AI agent, a chatbot, or a virtual assistant; they just want to get sh*t done.
Have a great week,
— Jurgen