AI Agents and Workflows

5 lessons

Module Progress...
AI Agents and Workflows

Workflows vs Agents

A workflow is fast and predictable. An agent is flexible and expensive. Build both from scratch, see exactly where the workflow breaks, and know when to reach for the agent.

Most engineers reach for agents first. That's backwards. Agents can be expensive, slow, and unpredictable. A workflow (fixed pipeline) handles the majority of real-world LLM tasks without the overhead. You should only go for an agent after a simpler approach has clearly failed.

In this tutorial, we'll cover two things. First, the conceptual landscape1: what separates a workflow from an agent, and how to pick the right level of autonomy for your problem. Second, we build a working agent that calls real tools, running entirely on your machine. We'll add tools and let the LLM decide what to do next.

Tutorial Goals

  • Understand the spectrum from workflows to multi-agent systems
  • Know when a workflow suffices and when you need agency
  • Define tools with schemas the LLM can understand
  • Build a multi-tool agent and compare it to a workflow

The Spectrum of Autonomy

Think of LLM orchestration as a spectrum with three distinct levels. Each level trades control for capability:

LevelWhat DecidesExampleLatencyCost
WorkflowYou (fixed or conditional logic)Prompt → LLM → Parse, or route A vs B0-3 LLM callsLow
AgentThe LLM"Figure out which tool to call"2-10+ LLM callsMedium
Multi-AgentMultiple LLMsResearcher → Analyst → Writer10+ LLM callsHigh

Workflows

A workflow is code you control. The simplest version is a fixed pipeline where input goes in and output comes out. "Summarize this document". "Translate this text". "Extract these fields". This is predictable, and easy to debug.

The more interesting version adds conditional branching. Your code inspects the input and routes it down different paths based on the input:

Workflow with Routing
Workflow with Routing

A customer service router is a good example. Message mentions "billing"? Route to the billing team. Mentions "password"? Tech support. A great win here is that you can route using a "weaker" model and use a "stronger" model for generating the final response.

Agents

An agent flips the control model. Instead of you writing the routing logic, the LLM decides what to do next2:

Agent Loop
Agent Loop

The LLM inspects the query, picks a tool, reads the result, then decides whether to call another tool or return a final answer. This loop can run multiple times. Every iteration burns tokens and adds latency, so the flexibility comes at a measurable price.

Multi-Agent Systems

Multiple agents collaborate, each with different tools and expertise. A supervisor agent routes tasks to specialists (just one of many possible patterns). This is the heavy artillery, reserve it for problems that genuinely decompose into subtasks requiring distinct skills. Most problems don't qualify.

The Decision Framework

How to pick the right level of autonomy? It's simple - start at the lowest level that solves your problem:

Decision Framework
Decision Framework

The trap is reaching for agents when a workflow would do. The best production systems start with embedding domain knowledge in a workflow, and only reach for an agent when the workflow fails.

Setup

Terminal
git clone git@github.com:mlexpertio/academy.git
cd academy/agents
uv sync

Pull the model we'll use:

Terminal
ollama pull gemma3:4b
agents/simple_agent.py
8 collapsed lines
import json
import yfinance as yf
from langchain.chat_models import init_chat_model
from langchain_core.messages import HumanMessage, SystemMessage, ToolMessage
from rich.console import Console
from rich.markdown import Markdown
from rich.panel import Panel
console = Console()
model = init_chat_model(
"qwen3:8b", model_provider="ollama", reasoning=False, n_ctx=16384, seed=42
)

Define Tools

An LLM generates text. That's all it can do. It can't fetch stock prices, query databases, or hit APIs. Tools bridge that gap, your functions (tools) do the work, the LLM just requests it.

Tool Calling
Tool Calling

The LLM never executes a tool itself. It outputs a structured request (a function name and arguments) and your code decides whether to run it. You can validate arguments, rate-limit calls, or reject entirely. This request-execute-respond loop is how every agent works under the hood.

Each tool needs three things: a function, type hints so the LLM knows what arguments to pass, and a docstring that tells it when to use the tool. The docstring matters most, the LLM reads it to pick the right tool for a given request. Let's define a tool to get the stock price for a given ticker symbol:

agents/simple_agent.py
def get_stock_price(ticker: str) -> str:
"""Get the current stock price and market cap for a given ticker symbol.
Args:
ticker: Stock ticker symbol (e.g., 'AAPL', 'GOOGL', 'MSFT')
"""
stock = yf.Ticker(ticker)
info = stock.info
price = info.get("currentPrice")
if price is None:
raise ValueError(
f"No price found for '{ticker}'. Verify the ticker symbol is correct."
)
name = info.get("shortName", ticker)
currency = info.get("currency", "USD")
market_cap = info.get("marketCap")
parts = [f"{name} ({ticker}): {currency} {price}"]
if market_cap:
parts.append(f"Market Cap: {market_cap:,}")
return " | ".join(parts)

The type hints (ticker: str, -> str) tell the LLM what to pass and what to expect back. The docstring and Args block become the tool's description in the schema.

The next tool we'll define is to convert a USD amount to another currency:

agents/simple_agent.py
def convert_currency(amount: float, to_currency: str) -> str:
"""Convert a USD amount to another currency.
Args:
amount: Amount in USD to convert
to_currency: Target currency code (e.g., 'EUR', 'GBP', 'JPY')
"""
rate = EXCHANGE_RATES.get(to_currency.upper())
converted = amount * rate
return f"${amount:,.2f} USD = {converted:,.2f} {to_currency.upper()} (rate: {rate})"

LangChain converts your Python function into a JSON schema3 that gets sent alongside every prompt. The model reads the schema and constructs a valid tool call when it decides the tool is relevant. You can inspect the schema directly:

agents/simple_agent.py
schema = convert_to_openai_function(convert_currency)
console.print(
Panel(
Syntax(json.dumps(schema, indent=2), "json", theme="one-dark"),
title="Tool Definition",
border_style="cyan",
)
)
Output
{
"name": "convert_currency",
"description": "Convert a USD amount to another currency.",
"parameters": {
"properties": {
"amount": {
"description": "Amount in USD to convert",
"type": "number"
},
"to_currency": {
"description": "Target currency code (e.g., 'EUR', 'GBP', 'JPY')",
"type": "string"
}
},
"required": ["amount", "to_currency"],
"type": "object"
}
}

Notice the structure: function name, description, and a parameters object with property types and descriptions. Everything the model needs to construct a valid call is extracted automatically from your Python type hints and docstring.

Build the Workflow

We'll start with a simple workflow. It calls the LLM once, executes whatever tools it requests, then calls the LLM a second time to generate the answer:

agents/simple_agent.py
TOOLS = [get_stock_price, convert_currency]
TOOL_MAP = {"get_stock_price": get_stock_price, "convert_currency": convert_currency}
agent_model = model.bind_tools(TOOLS)
def execute_tool_calls(messages, tool_calls, step_label=""):
for call in tool_calls:
prefix = f" [dim]{step_label}[/dim] " if step_label else " "
console.print(
f"{prefix}[dim]Tool call:[/dim] {call['name']}({json.dumps(call['args'])})"
)
try:
output = TOOL_MAP[call["name"]](**call["args"])
console.print(f" [green]✓[/green] {output}")
except (ValueError, TypeError, KeyError) as e:
output = f"ERROR: {e}"
console.print(f" [red]✗[/red] {output}")
messages.append(ToolMessage(content=output, tool_call_id=call["id"]))

bind_tools attaches the tool JSON schemas to the model so every call includes them. execute_tool_calls dispatches each tool call, catches errors, and appends the result back to the message history as a ToolMessage.

Our workflow is so simple that we don't need a library (except LangChain) to run it:

agents/simple_agent.py
def run_workflow(query: str) -> str:
messages = [SystemMessage(content=SYSTEM_PROMPT), HumanMessage(content=query)]
response = agent_model.invoke(messages)
messages.append(response)
execute_tool_calls(messages, response.tool_calls)
return agent_model.invoke(messages).content

Run It

Time to test it with some queries:

agents/simple_agent.py
queries = [
"What's Apple's current stock price?",
"How much is NVIDIA trading for right now?",
"What's Apple's stock price in Japanese yen?",
]
for query in queries:
console.print(f"[bold]Query:[/bold] {query}\n")
answer = run_workflow(query)
console.print(Markdown(f"**Answer:** {answer}"))
console.print("─" * 80 + "\n")
Output
Query: What's Apple's current stock price?
Tool call: get_stock_price({"ticker": "AAPL"})
✓ Apple Inc. (AAPL): USD 266.18 | Market Cap: 3,912,293,679,104
Answer: Apple's current stock price is USD 266.18, and its market cap is USD 3.912 trillion.
Query: How much is NVIDIA trading for right now?
Tool call: get_stock_price({"ticker": "NVDA"})
✓ NVIDIA Corporation (NVDA): USD 191.55 | Market Cap: 4,663,668,113,408
Answer: NVIDIA Corporation (NVDA) is currently trading at $191.55 with a market cap of $4.66 trillion.
Query: What's Apple's stock price in Japanese yen?
Tool call: get_stock_price({"ticker": "AAPL"})
✓ Apple Inc. (AAPL): USD 266.18 | Market Cap: 3,912,293,679,104
Answer:

The first two queries work perfectly. The model calls get_stock_price, gets the result, and formats a clean answer. But look at the third query: "Apple's stock price in Japanese yen." The workflow calls get_stock_price and gets a USD price, but it never calls convert_currency, because it can't. The workflow only runs one round of tool calls. This is where the workflow hits its ceiling, and we need an agent (a loop).

Build the Agent

The agent replaces the fixed two-step flow with a loop. After each LLM response, we check if the model requested any tool calls. If yes, execute them and loop back. If no, the model is done, so a final answer is returned. The MAX_STEPS guard prevents runaway loops:

agents/simple_agent.py
def run_agent(query: str) -> str:
messages = [SystemMessage(content=SYSTEM_PROMPT), HumanMessage(content=query)]
for step in range(MAX_STEPS):
response = agent_model.invoke(messages)
messages.append(response)
if not response.tool_calls:
return response.content
execute_tool_calls(messages, response.tool_calls, step_label=f"Step {step + 1}")
return response.content

That's the entire agent. Same tools, same model, same execute_tool_calls helper. The only difference is the loop. But that loop changes everything, the LLM decides the execution path.

Run It

Running the agent is the same as running the workflow:

agents/simple_agent.py
for query in queries:
console.print(f"[bold]Query:[/bold] {query}\n")
answer = run_agent(query)
console.print(Markdown(f"**Answer:** {answer}"))
console.print("─" * 80 + "\n")
Output
Query: What's Apple's current stock price?
Step 1 Tool call: get_stock_price({"ticker": "AAPL"})
✓ Apple Inc. (AAPL): USD 266.18 | Market Cap: 3,912,293,679,104
Answer: Apple's current stock price is USD 266.18, and its market cap is USD 3.912 trillion.
Query: How much is NVIDIA trading for right now?
Step 1 Tool call: get_stock_price({"ticker": "NVDA"})
✓ NVIDIA Corporation (NVDA): USD 191.55 | Market Cap: 4,663,668,113,408
Answer: NVIDIA Corporation (NVDA) is currently trading at $191.55 with a market cap of $4.66 trillion.
Query: What's Apple's stock price in Japanese yen?
Step 1 Tool call: get_stock_price({"ticker": "AAPL"})
✓ Apple Inc. (AAPL): USD 266.18 | Market Cap: 3,912,293,679,104
Step 2 Tool call: convert_currency({"amount": 266.18, "to_currency": "JPY"})
✓ $266.18 USD = 39,793.91 JPY (rate: 149.5)
Answer: Apple's stock price is $266.18 USD, which is equivalent to ¥39,793.91 JPY.

The first two queries complete in one step, same as the workflow. But look at the third query:

  1. The agent fetches the USD price
  2. It realizes it needs a currency conversion and calls convert_currency with the exact price it just received

Two tools, chained automatically, because the LLM decided it wasn't done yet. That's the core difference. The workflow couldn't do this. The agent can.

Handle Bad Input

What happens when the LLM sends a bad ticker?

agents/simple_agent.py
queries = [
"What's Apple's current stock price?",
"How much is NVIDIA trading for right now?",
"What's Apple's stock price in Japanese yen?",
"What's the stock price of FAKECO?"
]
Output
Query: What's the stock price of FAKECO?
Step 1 Tool call: get_stock_price({"ticker": "FAKECO"})
HTTP Error 404: {"quoteSummary":{"result":null,"error":{"code":"Not Found","description":"Quote not found for symbol: FAKECO"}}}
✗ ERROR: No price found for 'FAKECO'. Verify the ticker symbol is correct.
Answer: It seems there is no stock price available for the ticker symbol "FAKECO." Please verify that the ticker symbol is correct or provide additional details if you meant a different company.

The tool raised an error. Our execute_tool_calls helper caught it and passed the error string back to the LLM as a ToolMessage. The model read the error, understood the problem, and explained it to the user in plain language instead of crashing.

This is why tools should return structured errors instead of letting exceptions propagate. The LLM can reason about an error message, call another tool, or explain the problem to the user.

When to Use What

Start with a workflow. Always. Only reach for an agent when the task requires chaining tools dynamically or when the number of steps can't be known in advance. The workflow is faster, cheaper, and predictable. The agent is flexible but burns more tokens and time. If you want to sleep at night and not debug LLM hallucinations, start with a workflow.

ConsiderationWorkflowAgent
LatencyFixed, predictableVariable, grows with tool calls
Cost1-5 LLM calls2-10+ LLM calls
DebuggingStraightforwardHarder (non-deterministic paths)
Tool chainingManual, you wire itAutomatic, LLM decides
Best forKnown steps, predictable inputAmbiguous tasks, dynamic tool use

Next Steps

You've built the simplest possible agent, a loop with tools. The next tutorial looks into correcting your agent when it makes mistakes. We'll upgrade it to a LangGraph implementation and a self-correcting loop.

Loading...

References

Footnotes

  1. Building Effective Agents (Anthropic)

  2. ReAct: Synergizing Reasoning and Acting in Language Models

  3. LangChain Tool Calling