AI agent

An AI agent is an LLM-driven system that takes actions in the world — calling tools, browsing, writing code, finishing tasks — instead of just answering questions.

Concretely, the model is wired to tools so it can do more than respond: call functions, browse websites, write to a database, run code, make further LLM calls, and loop until a goal is reached.

The minimal anatomy:

A model capable of reasoning and tool use (Claude, GPT-4o, Gemini).
A set of tools the model can call — search, code-execution, file I/O, HTTP, custom APIs.
A loop that runs: model decides what to do → tool runs → result feeds back → model decides next move → repeat until done.
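
The three parts above can be sketched in a few lines. This is a minimal illustration, not any vendor's API: `fake_model`, the `TOOLS` registry, and the `add` tool are all hypothetical stand-ins, and a real agent would replace `fake_model` with an actual LLM call.

```python
from typing import Callable

# Hypothetical tool registry: plain Python functions keyed by name.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def fake_model(history: list[str]) -> dict:
    """Stand-in for an LLM call: returns a tool request or a final answer."""
    if not any(line.startswith("tool_result:") for line in history):
        # No tool output seen yet, so ask for the tool.
        return {"action": "tool", "name": "add", "arg": "2+3"}
    result = history[-1].removeprefix("tool_result:")
    return {"action": "final", "answer": f"The sum is {result}"}


def run_agent(goal: str, max_steps: int = 5) -> str:
    history = [f"goal:{goal}"]
    for _ in range(max_steps):
        decision = fake_model(history)        # model decides what to do
        if decision["action"] == "final":
            return decision["answer"]         # done
        tool = TOOLS[decision["name"]]        # tool runs
        history.append(f"tool_result:{tool(decision['arg'])}")  # result feeds back
    return "stopped: step limit reached"
```

Calling `run_agent("What is 2+3?")` goes around the loop twice: once to request the tool, once to produce the final answer. The `max_steps` cap is there so a confused model cannot loop forever.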

Where agents are actually useful today:

Software engineering agents — Claude Code, Cursor agent mode, Devin. Read repos, edit files, run tests, iterate until builds pass.
Research/browsing agents — pull data from many web pages, synthesize a report.
Customer-support agents — read past tickets, check internal databases, draft a response, escalate if confidence is low.
Data agents — given a question, write the SQL, run it, plot the result, write a summary.
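
As a rough illustration of the data-agent pattern, the sketch below runs the write-SQL, run-it, summarize steps against an in-memory SQLite database. The `sales` table, its rows, and `fake_write_sql` (a hard-coded stand-in for the model's SQL-writing step) are assumptions made for the example, and the plotting step is left out.

```python
import sqlite3


def fake_write_sql(question: str) -> str:
    """Hypothetical stand-in for the model writing SQL from a question."""
    return (
        "SELECT region, SUM(amount) AS total "
        "FROM sales GROUP BY region ORDER BY total DESC"
    )


def answer_question(conn: sqlite3.Connection, question: str) -> str:
    sql = fake_write_sql(question)
    # Guardrail: refuse anything that is not a read-only SELECT.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("refusing to run non-SELECT statement")
    rows = conn.execute(sql).fetchall()
    top_region, top_total = rows[0]
    return f"{top_region} leads with {top_total} in sales across {len(rows)} regions."


def demo() -> str:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 120), ("south", 80), ("north", 30)],
    )
    return answer_question(conn, "Which region sells the most?")
```

The SELECT-only check is deliberately crude. In a real deployment a read-only database role is likely a safer way to enforce the same constraint.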

Where they're brittle today:

Long-horizon tasks with ambiguity — agents drift, get stuck, or confidently do the wrong thing.
Anywhere precision matters — finance, legal, medical: the human-in-the-loop pattern beats the autonomous one.
Cost — agents loop, and each loop costs tokens. A "simple" task can rack up dollars without close monitoring.
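
To make the cost point concrete, here is a back-of-the-envelope estimator. It assumes each step re-sends the whole growing context as input, with no prompt caching, and every number in the usage note is a made-up placeholder rather than any provider's actual pricing.

```python
def estimate_cost(
    steps: int,
    base_prompt_tokens: int,
    tokens_added_per_step: int,
    output_tokens_per_step: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Rough dollar cost of an agent run, assuming no prompt caching."""
    total_input = 0
    context = base_prompt_tokens
    for _ in range(steps):
        total_input += context            # whole context is sent again each step
        context += tokens_added_per_step  # tool results and replies accumulate
    total_output = steps * output_tokens_per_step
    return (
        total_input * input_price_per_mtok
        + total_output * output_price_per_mtok
    ) / 1_000_000
```

With 20 steps, a 2,000-token starting prompt, 1,500 tokens added per step, 500 output tokens per step, and hypothetical prices of $3 and $15 per million input and output tokens, `estimate_cost(20, 2000, 1500, 500, 3.0, 15.0)` comes to about $1.13. Input tokens grow roughly with the square of the step count, which is why long runs get expensive quickly.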

The big architectural question for any agent project is "how much autonomy?" The most reliable pattern in production is the constrained agent: narrow scope, well-defined tools, hard limits on loop count, and human checkpoints. Fully autonomous agents make for great demos and frustrating products.
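
One way to picture a constrained agent is as a guard layer between what the model proposes and what actually runs. In this sketch the tool names, the limits, and the `approve` callback are all hypothetical, and `plan` stands in for actions a model would propose one at a time.

```python
from typing import Callable

ALLOWED_TOOLS = {"read_file", "send_email"}  # narrow scope, well-defined tools
NEEDS_APPROVAL = {"send_email"}              # human checkpoint for risky actions
MAX_STEPS = 3                                # hard limit on loop count


def run_constrained(
    plan: list[tuple[str, str]],
    approve: Callable[[str, str], bool],
) -> list[str]:
    """Apply the guardrails to each proposed (tool, argument) pair."""
    log = []
    for step, (tool, arg) in enumerate(plan):
        if step >= MAX_STEPS:
            log.append("halt: step limit")
            break
        if tool not in ALLOWED_TOOLS:
            log.append(f"blocked: {tool}")
            continue
        if tool in NEEDS_APPROVAL and not approve(tool, arg):
            log.append(f"denied: {tool}")
            continue
        log.append(f"ran: {tool}({arg})")  # a real agent would execute the tool here
    return log
```

In production, `approve` might post to a review queue and wait for a person. In the sketch it is just a function that returns `True` or `False`.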

FAQ

How is an agent different from a chatbot?

A chatbot replies with text. An agent takes actions — writes files, calls APIs, executes code. The interface might look the same, but the impact on the world is very different.

Are agents going to replace SaaS apps?

Some, eventually. Single-purpose SaaS where the value is a workflow (data entry, scheduling, simple analysis) is at risk. Multi-stakeholder, regulated, or trust-heavy products are not.

Related terms

LLM (Large Language Model) — A Large Language Model is a neural network trained on huge volumes of text to predict the next token, which produces emergent capabilities like reasoning, code generation, and translation.
Prompt engineering — Prompt engineering is the craft of writing instructions to a language model so it produces reliable, accurate, useful outputs.
Chain-of-thought (CoT) — Chain-of-thought prompting asks a model to reason step-by-step before producing its final answer, which substantially improves accuracy on hard problems.
MCP (Model Context Protocol) — MCP is an open standard from Anthropic for connecting AI models to external tools, data sources, and applications through a consistent client-server protocol.
