By Thomas Şerban von Davier
The year 2025 marked a decisive shift in artificial intelligence (AI). Systems once confined to research labs and prototypes began to appear as everyday tools. At the centre of the transition was the rise of AI agents, the systems that can use other software tools and act on their own.
While researchers have studied AI for more than 60 years, and the term “agent” has long been part of the field’s vocabulary, 2025 was the year the concept became concrete for developers and consumers.
AI agents moved from theory to infrastructure, reshaping how people interact with large language models, the systems that power chatbots such as ChatGPT.
In 2025 the definition of AI agent shifted from the academic framing of systems that perceive, reason and act to AI company Anthropic’s description of large language models capable of using software tools and taking autonomous action. While large language models have long excelled at text-based responses, the recent change is their expanding capacity to act, using tools, calling application programming Interfaces, coordinating with other systems and completing tasks independently.
The shift did not happen overnight. A key inflection point came in late 2024 when Anthropic released the model context protocol. The protocol allowed developers to connect large language models to external tools in a standardised way, effectively giving models the ability to act beyond generating text. With that, the stage was set for 2025 to become the year of AI agents.
Milestones that defined 2025
The momentum accelerated quickly. In January, the release of the Chinese model DeepSeek-R1 as an open-weight model disrupted assumptions about who could build high-performing large language models, briefly rattling markets and intensifying global competition. An open-weight model is an AI model whose training, reflected in values called weights, is publicly available.
Throughout 2025, major US labs such as OpenAI, Anthropic, Google and xAI released larger, high-performance models, while Chinese tech companies including Alibaba, Tencent and DeepSeek expanded the open model ecosystem to the point where the Chinese models have been downloaded more than American models.
AI agents expanded what individuals and organisations could do, but they also amplified existing vulnerabilities. Systems that were once isolated text generators became interconnected, tool-using actors operating with little human oversight
Another turning point came in April when Google introduced its Agent2Agent protocol. While Anthropic’s model context protocol focused on how agents use tools, Agent2Agent addressed how agents communicate with each other. Crucially, the two protocols were designed to work together. Later in the year, Anthropic and Google donated their protocols to the open-source software nonprofit Linux Foundation, cementing them as open standards rather than proprietary experiments.
The developments quickly found their way into consumer products. By mid-2025 “agentic browsers” began to appear. Tools such as Perplexity’s Comet, Browser Company’s Dia, OpenAI’s GPT Atlas, Copilot in Microsoft’s Edge, ASI X Inc’s Fellou, MainFunc.ai’s Genspark, Opera’s Opera Neon and others reframed the browser as an active participant rather than a passive interface. For example, rather than helping you search for holiday details, it plays a part in booking the vacation.
At the same time, workflow builders such as n8n and Google’s Antigravity lowered the technical barrier for creating custom agent systems beyond what has already happened with coding agents such as Cursor and GitHub Copilot.
New power, new risks
As agents became more capable, their risks became harder to ignore. In November, Anthropic disclosed how its Claude Code agent had been misused to automate parts of a cyberattack. The incident illustrated a broader concern: by automating repetitive, technical work, AI agents can also lower the barrier for malicious activity.
The tension defined much of 2025. AI agents expanded what individuals and organisations could do, but they also amplified existing vulnerabilities. Systems that were once isolated text generators became interconnected, tool-using actors operating with little human oversight.
What to watch for in 2026
Looking ahead, several open questions are likely to shape the next phase of AI agents.
One is benchmarks. Traditional benchmarks, which are like a structured exam with a series of questions and standardised scoring, work well for single models, but agents are composite systems made up of models, tools, memory and decision logic. Researchers increasingly want to evaluate not only outcomes, but also processes. This would be like asking students to show their work, not only provide an answer.
Progress will be critical for improving reliability and trust, and ensuring an AI agent will perform the task at hand. One method is establishing clear definitions around AI agents and AI workflows. Organisations will need to map out exactly where AI will integrate into workflows or introduce new ones.
Another development to watch is governance. In late 2025, the Linux Foundation announced the creation of the Agentic AI Foundation, signalling an effort to establish shared standards and best practices. If successful, it could play a role like the World Wide Web Consortium in shaping an open, interoperable agent ecosystem.
There is also a growing debate over model size. While large, general-purpose models dominate headlines, smaller and more specialised models are often better suited to specific tasks. As agents become configurable consumer and business tools, whether through browsers or workflow management software, the power to choose the right model increasingly shifts to users rather than labs or corporations.
Challenges ahead
Despite the optimism, significant sociotechnical challenges remain. Expanding data centre infrastructure strains energy grids and affects local communities. In workplaces, agents raise concerns about automation, job displacement and surveillance.
From a security perspective, connecting models to tools and stacking agents together multiplies risks that are unresolved in standalone large language models. Specifically, AI practitioners are addressing the dangers of indirect prompt injections, where prompts are hidden in open web spaces that are readable by AI agents and result in harmful or unintended actions.
Regulation is another unresolved issue. Compared with Europe and China, the US has relatively limited oversight of algorithmic systems. As AI agents become embedded across digital life, questions about access, accountability and limits remain largely unanswered.














Would you like to comment on this article?
Sign up (it's quick and free) or sign in now.
Please read our Comment Policy before commenting.