What even is an AI agent?

Why an AI agent is more than a chatbot with extra steps, and the perceive-reason-act loop every framework eventually maps back to.

Sumit Gaur

I asked a chatbot how much I’d spent on coffee this month. It said it didn’t have access to my bank statements, but offered to walk me through how to estimate it.

Then I gave the same question to a small script that could read my email, find Starbucks receipts, sum them up, check whether $137 was high or low compared to the last six months, and reply with a number. Both used the same underlying model. One felt like a tool. The other felt like an employee.

That gap is the whole field.

TL;DR

  • A chatbot answers. An agent decides, acts, looks at the result, and decides again.
  • The whole thing reduces to one loop that perceives, reasons, and acts.
  • Once that loop is in your head, every framework starts to look like a different way to glue the same three phases together.

The chatbot / agent line

A chatbot takes a message and returns a message. The conversation is the entire interaction. If the answer needs information the model does not have, the chatbot stops there or politely pivots to suggestions.

An agent takes a goal and produces an outcome. Somewhere between input and output it picks up tools, reads results, changes its mind, and tries again. Calling a function is the easy part. The interesting behavior is what an agent does after the function returns.

If you’ve worked in backend systems, the closest analogy is a worker that pulls a job off a queue, retries on failure, and writes the result somewhere durable. Now imagine the worker can also choose which queue to pull from and rewrite its own job description halfway through. That’s an agent.
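Here's that difference in code. Everything below is a stand-in: llm(), llm_decide(), and run_tool() are placeholder stubs for a real model client and real tools, not any actual SDK. The shape of the two functions is the point.

from dataclasses import dataclass

@dataclass
class Decision:
    """What the model hands back: a final answer, or a tool to run."""
    final: bool
    answer: str = ""
    tool: str = ""

def llm(message: str) -> str:
    # Stub. A real implementation calls a model once and returns text.
    return "I don't have access to your receipts, but here's how to estimate..."

def llm_decide(goal: str, observations: list[str]) -> Decision:
    # Stub. A real implementation calls a model with the goal plus
    # everything observed so far, and parses its choice.
    if not observations:
        return Decision(final=False, tool="sum_coffee_receipts")
    return Decision(final=True, answer=f"You spent {observations[-1]} on coffee this month.")

def run_tool(name: str) -> str:
    # Stub. A real tool would hit an inbox, an API, a database.
    return "$137"

def chatbot(message: str) -> str:
    """One call in, one message out. The conversation is the interaction."""
    return llm(message)

def agent(goal: str) -> str:
    """A goal in, an outcome out, with a loop in between."""
    observations: list[str] = []
    while True:
        decision = llm_decide(goal, observations)     # reason over what we've seen
        if decision.final:
            return decision.answer
        observations.append(run_tool(decision.tool))  # act; the next pass sees the result

print(chatbot("How much did I spend on coffee this month?"))
print(agent("How much did I spend on coffee this month?"))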

Perceive, reason, act

Strip away the SDKs, the orchestration, the prompt frameworks, and you’re left with three phases in a loop.

flowchart LR
    A([Goal]) --> B[Perceive]
    B --> C[Reason]
    C --> D[Act]
    D --> E{Done?}
    E -- no --> B
    E -- yes --> F([Outcome])

    classDef perceive fill:#dbeafe,stroke:#1d4ed8,color:#1e3a8a
    classDef reason fill:#fce7f3,stroke:#be185d,color:#831843
    classDef act fill:#dcfce7,stroke:#15803d,color:#14532d
    classDef terminal fill:#f3f4f6,stroke:#374151,color:#111827

    class B perceive
    class C reason
    class D act
    class A,F terminal

Perceive. Read the world. Inputs from a user, results from the last tool call, the contents of a file, an HTTP response, the current state of a queue. A model can only reason about what it can see, so this step is doing more work than it appears to.

Reason. Decide what matters and what to do next. With LLMs this happens by writing tokens. The model considers the goal, reflects on what it just observed, and produces either a tool call or a final answer. It is pattern completion that happens to be useful.

Act. Do the thing. Call an API. Run a query. Send a message. Update a file. Hand off to another agent. The action almost always changes the state of the world, which means the next perceive step sees something different than it did a second ago.

That last point is what separates an agent from a glorified switch statement. Each pass through the loop is informed by the consequences of the last one.
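Here is the loop itself in plain Python, with each phase as its own function. The reasoning is a canned rule and the receipt data is made up; in a real agent, reason() is an LLM call. The shape is the part that carries over.

RECEIPTS = [4.50, 5.25, 4.75, 6.00]  # stand-in for receipts scraped from email

def perceive(state: dict):
    """Read the world: here, just the result of the last action."""
    return state.get("last_result")

def reason(goal: str, observation):
    """Decide the next action from the goal and what was just observed."""
    if observation is None:
        return ("sum_receipts", None)  # nothing observed yet: gather data first
    return ("finish", f"You spent ${observation:.2f} on coffee this month.")

def act(action: str, arg):
    """Do the thing. Acting changes the state the next perceive() reads."""
    if action == "sum_receipts":
        return sum(RECEIPTS)
    return arg

def run(goal: str):
    state = {}
    while True:
        observation = perceive(state)            # 1. perceive
        action, arg = reason(goal, observation)  # 2. reason
        result = act(action, arg)                # 3. act
        if action == "finish":
            return result
        state["last_result"] = result            # consequences feed the next pass

print(run("How much did I spend on coffee this month?"))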

Three analogies that actually map

Forget “intelligent assistant.” The phrase is empty. Here are three working analogies, in order of fidelity.

A junior engineer with a ticket reads the ticket (perceive), writes a plan in a notebook (reason), opens a PR (act), and checks the CI output (perceive again). They iterate until the build passes or they get stuck enough to ask. An agent runs this loop without coffee breaks.

A driver with a route sees their current position on GPS (perceive), decides whether to take the next exit (reason), takes it or skips it (act), and the GPS recalculates. If you’ve ever yelled at a GPS that kept rerouting because you missed a turn, you’ve watched a loop fail because perception got noisy.

A smart thermostat senses temperature, compares it to a setpoint, and turns the AC on or off. Replace “compare to setpoint” with an LLM that can also decide “actually, ask the user if they’re awake before cooling the bedroom at 3am” and you have an agent. It is the same loop with deeper reasoning.

The thermostat one tends to land for senior engineers. We’ve been writing perceive-reason-act loops since before agents were called agents. What is new is that the reasoning step now does fuzzy work that used to require human judgment.
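To make that concrete, here is the thermostat written as the loop, with the phases labeled. read_temp and set_ac are hypothetical callables you'd wire to real hardware.

def thermostat_loop(read_temp, set_ac, setpoint=21.0, deadband=0.5, steps=1000):
    """The control loop we've all written, with the three phases labeled."""
    for _ in range(steps):
        temp = read_temp()                     # perceive
        too_hot = temp > setpoint + deadband   # reason: a fixed rule today,
                                               # an LLM judgment in an agent
        set_ac(on=too_hot)                     # act: changes what we perceive next

# Runs fine against fakes:
# thermostat_loop(read_temp=lambda: 24.0, set_ac=lambda on: print("AC:", on), steps=3)

The only line an agent replaces is the middle one.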

What people get wrong

Two failure modes keep showing up.

The first is treating the loop as a black box. Someone says “I asked the agent to do X and it didn’t.” When you don’t watch the loop run, you can’t tell whether the model couldn’t perceive the right input, reasoned poorly, or chose an action that had no path to the goal. Every framework I’ll cover later in this series gives you some way to inspect the loop. Use it from day one.
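A cheap version of that habit, assuming a loop shaped like the sketches above: emit one structured log line per phase, per iteration, so "the agent didn't do X" becomes "reason picked the wrong tool at step 3." The trace helper here is illustrative, not any framework's API.

import json
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("agent")

def trace(step: int, phase: str, payload) -> None:
    """One structured log line per phase, per iteration of the loop."""
    log.info(json.dumps({"step": step, "phase": phase, "payload": str(payload)}))

# Inside the loop, after each phase:
#   trace(step, "perceive", observation)
#   trace(step, "reason", (action, arg))
#   trace(step, "act", result)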

The second is skipping the goal. A chatbot is fine with vague input because the user is doing the steering. An agent has to convert ambiguity into action without that supervision. If the goal is mushy, the loop runs in circles and burns tokens. The most useful thing you can do for an agent before it runs is sharpen what “done” means.
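In practice, sharpening "done" can be as literal as writing the exit test down. A toy example, with an invented Message type:

from dataclasses import dataclass

@dataclass
class Message:
    is_newsletter: bool
    age_days: int

# Mushy goal: "Clean up my inbox." Every iteration, the model has to guess
# whether it's finished. Sharp goal: "Archive every newsletter older than
# 30 days." That one comes with a checkable exit test:

def is_done(inbox: list[Message]) -> bool:
    """The loop's termination condition, written down instead of left to vibes."""
    return not any(m.is_newsletter and m.age_days > 30 for m in inbox)

assert is_done([Message(is_newsletter=True, age_days=10)])
assert not is_done([Message(is_newsletter=True, age_days=45)])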

Why the loop matters for everything that follows

Every post in this series will add one piece to this loop.

  • Tools change what an agent can do in the act phase.
  • Memory changes what it can carry across iterations of perceive.
  • Planning changes how it structures the reason phase.
  • Multi-agent systems change who is running which loop and how they pass work to each other.
  • Guardrails and evals are how you keep the loop from going off the rails and how you measure when it does.

When something goes wrong with an agent, ask which of the three phases broke. That single habit will save you more time than any specific framework.

What’s next

The next post goes inside the reason phase. We’ll look at how LLMs actually do the reasoning that makes the loop work, why prompt design is the same thing as agent design, and what changes when you swap one model family for another. I’ll write a small ReAct-style loop in plain Python with no SDK, so you can see the wires before we start covering them up.

If you’ve already built an agent and run into something the loop framing doesn’t cover, write to me at sumit at allthingsagentic dot org. The patterns I cover next are partly shaped by what people are actually getting stuck on.

sumit

I'm a backend developer, writer, and tinkerer exploring the world of agentic systems. AllThingsAgentic is a project I started to share what I learn from poking at agents, LLMs, RAGs, and the tooling around them in the open.
