What if LLMs Can Do Anything? - Agents and Workflows
Agentic systems are systems in which language models automate complex tasks, make decisions, and solve problems autonomously. In this tutorial, we’ll explore how to harness the power of large language models not just as text generators, but as components in intelligent systems that can reason, plan, and interact with tools to accomplish meaningful work. You’ll learn the core architectural patterns for building these systems - from structured workflows to dynamic agents - and create your own working examples that demonstrate these capabilities in action.
Tutorial Goals
In this tutorial, you will:
- Discover how AI agents work and what makes them powerful
- Learn when to use simple workflows vs intelligent agents
- Practice collecting real data from Reddit
- Create your first agentic workflow using LangGraph and Pydantic
- Build and connect custom tools to enhance your agent’s abilities
Agentic Systems
Agentic systems represent one of the most exciting frontiers in LLM applications, allowing AI to take on increasingly complex tasks with greater autonomy. Let’s explore what makes these systems powerful and how they’re revolutionizing automation.
Agentic systems are AI implementations that can perform tasks autonomously by making decisions, using tools, and following processes to achieve goals. They come in two primary architectural approaches:
| Approach | Description | Best For |
|---|---|---|
| Workflows | Predefined systems where LLMs and tools follow specific code paths designed by developers | Predictable tasks with clear steps, reliability requirements, tight control needs |
| Agents | More autonomous systems where LLMs dynamically direct their own processes and tool usage | Open-ended problems, tasks requiring adaptation, complex multi-step reasoning |
The key distinction lies in control - workflows follow developer-defined paths, while agents have more freedom to determine how they accomplish tasks.
Anthropic’s research on effective agent patterns shows that successful agent implementation focuses on simplicity, transparency, and well-defined tools. You can read their detailed findings in their Building Effective Agents guide.
At the core of any agentic system is the augmented LLM - a language model enhanced with additional capabilities:
- Retrieval: Accessing external knowledge
- Tool use: Performing actions through function calling
- Memory: Maintaining context over extended interactions
These augmentations enable LLMs to go beyond simple text generation to become intelligent systems that can reason, plan, and execute real-world tasks.
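To make this concrete, here is a minimal sketch of an augmented LLM call: retrieved context is injected into the prompt and the running conversation history serves as memory. The retrieve_documents helper is a hypothetical stand-in for a real vector store lookup, and tool use is shown in the agent example later in this tutorial.

import ollama

# Hypothetical stand-in for a real vector store lookup
def retrieve_documents(query: str) -> str:
    return "Company refund policy: refunds are issued within 14 days of purchase."

# The running conversation doubles as memory across turns
memory = []

def augmented_llm(user_message: str) -> str:
    # Retrieval: ground the model in external knowledge
    context = retrieve_documents(user_message)
    memory.append({
        "role": "user",
        "content": f"Context:\n{context}\n\nQuestion: {user_message}"
    })
    # Memory: the model sees all prior turns on every call
    response = ollama.chat(model="llama3", messages=memory)
    memory.append({"role": "assistant", "content": response["message"]["content"]})
    return response["message"]["content"]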
Think carefully before building an agentic system - it is not the best solution for every problem. In many cases, a simple LLM prompt with retrieval is sufficient.
Agentic approaches excel when:
- Tasks require complex reasoning across multiple steps
- Problems benefit from dynamic planning and adaptation
- Work involves coordinating multiple tools or information sources
- Tasks are repetitive but require judgment and decision-making
The most successful agentic systems share these characteristics:
- Simplicity: Using the simplest approach possible for the task
- Transparency: Making the agent’s reasoning and planning steps visible
- Well-defined tools: Creating clear documentation and interfaces for tools
- Human oversight: Including appropriate checkpoints for human review
In our next sections, we’ll explore how to implement these systems using Pydantic for structured data handling and LangGraph for workflow orchestration.
Workflows
Workflows form the backbone of effective agentic systems, providing structured approaches to orchestrate LLMs and tools. These patterns help developers create predictable, reliable systems for complex tasks while maintaining control over the execution flow.
Sequential Workflow Pattern
The simplest workflow pattern is a sequential execution of steps, where the output of one step becomes the input for the next. This pattern is ideal for tasks with a clear, linear progression.
This sequential pattern is particularly effective when each step can be clearly defined and the entire process follows a predictable path. Notice how each step’s output becomes structured input for the next step.
from pydantic import BaseModel, Field
from typing import List

import ollama

class ArticleRequest(BaseModel):
    topic: str = Field(..., description="The topic to write about")
    audience: str = Field(..., description="Target audience (technical, general, etc.)")

class OutlineFormat(BaseModel):
    sections: List[str] = Field(..., description="List of section headings")
    key_points: List[str] = Field(..., description="Key points to cover")

class Article(BaseModel):
    title: str = Field(..., description="Article title")
    content: str = Field(..., description="Full article content")

# Step 1: Generate an outline
def create_outline(request: ArticleRequest) -> OutlineFormat:
    response = ollama.chat(
        model="llama3",
        messages=[{
            "role": "user",
            "content": f"Create an outline for an article about {request.topic} for a {request.audience} audience. Respond with JSON only."
        }],
        # Constrain the response to our schema
        # (requires a recent Ollama version with structured output support)
        format=OutlineFormat.model_json_schema()
    )
    # Parse the JSON response into our outline format
    return OutlineFormat.model_validate_json(response["message"]["content"])

# Step 2: Write the full article based on the outline
def write_article(request: ArticleRequest, outline: OutlineFormat) -> Article:
    prompt = f"""
Write an article about {request.topic} for a {request.audience} audience.
Use the following outline:

Sections:
{', '.join(outline.sections)}

Key points:
{', '.join(outline.key_points)}

Respond with JSON only, containing a title and the full content.
"""
    response = ollama.chat(
        model="llama3",
        messages=[{"role": "user", "content": prompt}],
        # Constrain the response to our schema
        format=Article.model_json_schema()
    )
    # Parse the JSON response into our article format
    return Article.model_validate_json(response["message"]["content"])

# The full workflow
def article_workflow(request: ArticleRequest) -> Article:
    outline = create_outline(request)
    article = write_article(request, outline)
    return article

# Example usage of the Sequential Workflow
def run_article_example():
    # Create a request for an article about AI ethics
    request = ArticleRequest(
        topic="AI ethics in healthcare",
        audience="healthcare professionals"
    )
    # Run the article generation workflow
    article = article_workflow(request)
    # Display the results
    print(f"Generated Article Title: {article.title}\n")
    print(f"Article Content:\n{article.content}")

# Execute the example
if __name__ == "__main__":
    run_article_example()
This pattern allows for quality checks between steps. For example, you could verify that the outline meets certain criteria before proceeding to article generation.
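For instance, a small guard function between the two steps could reject an outline that is too thin before any writing happens. This is a rough sketch reusing the models above; the minimum of three sections is an arbitrary choice for illustration.

def check_outline(outline: OutlineFormat, min_sections: int = 3) -> OutlineFormat:
    # Quality gate between workflow steps: fail fast on a weak outline
    if len(outline.sections) < min_sections:
        raise ValueError(
            f"Outline has only {len(outline.sections)} sections, expected at least {min_sections}"
        )
    return outline

def article_workflow_with_checks(request: ArticleRequest) -> Article:
    outline = check_outline(create_outline(request))
    return write_article(request, outline)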
Router Workflow Pattern
A router pattern classifies incoming requests and directs them to specialized handlers. This improves performance by allowing each handler to be optimized for specific types of requests.
The router pattern shines when you have diverse types of inputs that benefit from specialized handling. The initial classification step is critical for directing the workflow correctly.
from pydantic import BaseModel, Field
from enum import Enum

import ollama

class QueryType(str, Enum):
    TECHNICAL_QUESTION = "technical_question"
    CUSTOMER_SERVICE = "customer_service"
    PRODUCT_INFO = "product_info"
    OTHER = "other"

class UserQuery(BaseModel):
    text: str = Field(..., description="User's original query")

class QueryClassification(BaseModel):
    query_type: QueryType = Field(..., description="The type of query")
    confidence: float = Field(..., description="Confidence score (0-1)")

# Step 1: Classify the query
def classify_query(query: UserQuery) -> QueryClassification:
    response = ollama.chat(
        model="llama3",
        messages=[{
            "role": "user",
            "content": (
                "Classify the following query into one of these types: technical_question, "
                "customer_service, product_info, or other. Include a confidence score between 0 and 1.\n\n"
                f"Query: {query.text}"
            )
        }],
        # Constrain the response to our schema
        # (requires a recent Ollama version with structured output support)
        format=QueryClassification.model_json_schema()
    )
    # Parse the JSON response into our classification model
    return QueryClassification.model_validate_json(response["message"]["content"])

# Step 2: Route to specialized handlers
def handle_technical_question(query: UserQuery) -> str:
    response = ollama.chat(
        model="llama3",
        messages=[{
            "role": "user",
            "content": f"Answer this technical question in detail with code examples if relevant: {query.text}"
        }]
    )
    return response["message"]["content"]

def handle_customer_service(query: UserQuery) -> str:
    response = ollama.chat(
        model="llama3",
        messages=[{
            "role": "user",
            "content": f"Respond to this customer service inquiry with empathy and solutions: {query.text}"
        }]
    )
    return response["message"]["content"]

def handle_product_info(query: UserQuery) -> str:
    response = ollama.chat(
        model="llama3",
        messages=[{
            "role": "user",
            "content": f"Answer this product question clearly and concisely: {query.text}"
        }]
    )
    return response["message"]["content"]

def handle_general_query(query: UserQuery) -> str:
    # Fallback handler for anything that doesn't fit a specialized category
    response = ollama.chat(
        model="llama3",
        messages=[{"role": "user", "content": query.text}]
    )
    return response["message"]["content"]

# The full workflow
def support_workflow(query: UserQuery) -> str:
    classification = classify_query(query)
    if classification.query_type == QueryType.TECHNICAL_QUESTION:
        return handle_technical_question(query)
    elif classification.query_type == QueryType.CUSTOMER_SERVICE:
        return handle_customer_service(query)
    elif classification.query_type == QueryType.PRODUCT_INFO:
        return handle_product_info(query)
    else:
        return handle_general_query(query)

# Example usage of the Router Workflow
def run_support_example():
    # Example technical question
    tech_query = UserQuery(
        text="How do I configure environment variables for my Docker container?"
    )
    # Example customer service question
    cs_query = UserQuery(
        text="I haven't received my order yet, and it's been two weeks since my purchase."
    )
    # Process both queries through the workflow
    tech_response = support_workflow(tech_query)
    cs_response = support_workflow(cs_query)
    # Display the results
    print("Technical Question Response:")
    print(tech_response)
    print("\n------------------\n")
    print("Customer Service Response:")
    print(cs_response)

# Execute the example
if __name__ == "__main__":
    run_support_example()
This pattern improves accuracy by allowing each specialized handler to use prompts and context optimized for specific query types.
Evaluator-Feedback Workflow Pattern
In this pattern, one LLM generates a response while another evaluates it against specific criteria, allowing for iterative refinement until quality thresholds are met.
This pattern implements a natural feedback loop similar to human editing processes. The key advantage is systematic improvement based on objective evaluation criteria.
from pydantic import BaseModel, Field
from typing import List

import ollama

class ContentRequest(BaseModel):
    topic: str = Field(..., description="Topic to write about")
    style: str = Field(..., description="Writing style (formal, casual, etc.)")
    word_count: int = Field(300, description="Target word count")

class EvaluationCriteria(BaseModel):
    relevance: float = Field(..., description="Relevance to topic (0-10)")
    style_match: float = Field(..., description="Adherence to requested style (0-10)")
    quality: float = Field(..., description="Overall writing quality (0-10)")
    feedback: List[str] = Field(..., description="Specific improvement suggestions")

# Step 1: Generate content
def generate_content(request: ContentRequest) -> str:
    response = ollama.chat(
        model="llama3",
        messages=[{
            "role": "user",
            "content": f"Write about {request.topic} in a {request.style} style. Aim for about {request.word_count} words."
        }]
    )
    return response["message"]["content"]

# Step 2: Evaluate content
def evaluate_content(content: str, request: ContentRequest) -> EvaluationCriteria:
    prompt = f"""
Evaluate the following content based on these criteria:
1. Relevance to the topic: {request.topic}
2. Match to requested style: {request.style}
3. Overall writing quality

Content to evaluate:
{content}

Provide scores from 0-10 for each criterion and specific feedback for improvement. Respond with JSON only.
"""
    response = ollama.chat(
        model="llama3",
        messages=[{"role": "user", "content": prompt}],
        # Constrain the response to our schema
        # (requires a recent Ollama version with structured output support)
        format=EvaluationCriteria.model_json_schema()
    )
    # Parse the JSON response into our evaluation model
    return EvaluationCriteria.model_validate_json(response["message"]["content"])

# Step 3: Improve content based on feedback
def improve_content(content: str, evaluation: EvaluationCriteria, request: ContentRequest) -> str:
    prompt = f"""
Revise the following content based on this feedback:
{', '.join(evaluation.feedback)}

Original content:
{content}

Topic: {request.topic}
Style: {request.style}
Target word count: {request.word_count}
"""
    response = ollama.chat(model="llama3", messages=[{"role": "user", "content": prompt}])
    return response["message"]["content"]

# The full workflow
def content_workflow(request: ContentRequest, min_quality_threshold: float = 8.0) -> str:
    content = generate_content(request)
    evaluation = evaluate_content(content, request)
    # Continue improving until we meet our quality threshold (or run out of attempts)
    attempts = 1
    max_attempts = 3
    while evaluation.quality < min_quality_threshold and attempts < max_attempts:
        content = improve_content(content, evaluation, request)
        evaluation = evaluate_content(content, request)
        attempts += 1
    return content

# Example usage of the Evaluator-Feedback Workflow
def run_content_example():
    # Create a request for marketing content
    request = ContentRequest(
        topic="Benefits of cloud computing for small businesses",
        style="professional but approachable",
        word_count=250
    )
    # Set a high quality threshold to trigger revisions
    min_quality_threshold = 8.5
    # Run the content generation workflow
    final_content = content_workflow(request, min_quality_threshold)
    # Display the results
    print(f"Final content on {request.topic}:\n")
    print(final_content)
    print(f"\nGenerated to meet a quality threshold of {min_quality_threshold}")

# Execute the example
if __name__ == "__main__":
    run_content_example()
This workflow pattern is especially valuable for content generation that must meet specific quality standards, allowing for systematic improvement based on objective criteria.
Each of these patterns can be used independently or combined to create more complex agentic systems tailored to specific use cases. The choice of pattern should be driven by the requirements of your application and the complexity of the tasks you need to automate.
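As a rough sketch of combining patterns (assuming the router and evaluator examples above are both in scope), a support reply could be routed to a specialized handler and then passed through the evaluator once before it goes out:

def reviewed_support_workflow(query: UserQuery, min_quality: float = 8.0) -> str:
    # Router pattern: pick a specialized handler for the query
    draft = support_workflow(query)
    # Evaluator-feedback pattern: score the draft and revise it once if needed
    request = ContentRequest(topic=query.text, style="helpful support reply", word_count=150)
    evaluation = evaluate_content(draft, request)
    if evaluation.quality < min_quality:
        draft = improve_content(draft, evaluation, request)
    return draft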
Agents
Agents represent a more autonomous approach to agentic systems compared to workflows. While workflows follow predefined paths, agents have the freedom to determine their own execution strategy based on the task at hand. Agents can reason, plan, and execute actions dynamically, making them suitable for more open-ended tasks.
What Makes an Agent Different?
Unlike workflows where the developer defines the exact flow of operations, agents operate with more autonomy:
- They can reason about the task and break it down into subtasks
- They can choose which tools to use at runtime based on the current situation
- They can adapt to new information discovered during execution
- They can recover from errors by trying alternative approaches
Agents typically operate in a loop where they:
- Observe the current state
- Think about what to do next
- Choose a tool or action
- Execute that action
- Observe the result
- Update their understanding and repeat
Travel Planning Agent Example
Here’s a practical example of a travel planning agent that helps users organize their trips:
This agent example demonstrates how to coordinate multiple services through defined tools. Notice how the agent maintains state throughout the conversation and makes decisions about which tools to call based on the user’s needs.
from pydantic import BaseModel, Field
from typing import Dict

import ollama

# Define tool input schemas
class FlightSearch(BaseModel):
    origin: str = Field(..., description="Departure city or airport code")
    destination: str = Field(..., description="Arrival city or airport code")
    date: str = Field(..., description="Travel date in YYYY-MM-DD format")

class HotelSearch(BaseModel):
    location: str = Field(..., description="City or area")
    check_in: str = Field(..., description="Check-in date in YYYY-MM-DD format")
    check_out: str = Field(..., description="Check-out date in YYYY-MM-DD format")

class WeatherCheck(BaseModel):
    location: str = Field(..., description="City name")
    date: str = Field(..., description="Date to check in YYYY-MM-DD format")

# Simplified tool implementations
def search_flights(tool_input: Dict) -> str:
    # In a real system, this would call a flight API
    return (
        f"Found 3 flights from {tool_input['origin']} to {tool_input['destination']} on {tool_input['date']}: "
        "$320 (6am), $450 (1pm), $380 (8pm)"
    )

def search_hotels(tool_input: Dict) -> str:
    # In a real system, this would call a hotel API
    return (
        f"Found 5 hotels in {tool_input['location']} from {tool_input['check_in']} to {tool_input['check_out']}: "
        "Downtown Hotel ($180/night), Business Lodge ($140/night), City View ($210/night)"
    )

def check_weather(tool_input: Dict) -> str:
    # In a real system, this would call a weather API
    return f"Weather forecast for {tool_input['location']} on {tool_input['date']}: Partly cloudy, 72°F (22°C), 10% chance of rain"

# Main travel agent function
def travel_agent(user_request: str, max_turns: int = 5) -> str:
    tools = {
        "search_flights": search_flights,
        "search_hotels": search_hotels,
        "check_weather": check_weather
    }
    # Tool definitions in the function-calling format Ollama expects;
    # the parameter schemas come straight from our Pydantic models
    tool_defs = [
        {"type": "function", "function": {"name": "search_flights", "description": "Search for flights between cities", "parameters": FlightSearch.model_json_schema()}},
        {"type": "function", "function": {"name": "search_hotels", "description": "Find hotel accommodations", "parameters": HotelSearch.model_json_schema()}},
        {"type": "function", "function": {"name": "check_weather", "description": "Check weather forecast for a location", "parameters": WeatherCheck.model_json_schema()}}
    ]
    conversation = [{"role": "user", "content": user_request}]
    # Agent loop: the model decides which tools to call, sees the results,
    # and stops once it no longer requests any tools
    for _ in range(max_turns):
        response = ollama.chat(
            model="llama3",
            messages=conversation,
            tools=tool_defs
        )
        # Add the agent's message (including any tool calls) to the conversation
        conversation.append(response["message"])
        tool_calls = response["message"].get("tool_calls")
        if not tool_calls:
            # No more tool calls - the agent has gathered what it needs
            break
        for tool_call in tool_calls:
            tool_name = tool_call["function"]["name"]
            tool_args = tool_call["function"]["arguments"]
            # Call the appropriate tool and feed the result back to the agent
            # (field names for tool messages vary slightly between Ollama versions)
            if tool_name in tools:
                result = tools[tool_name](tool_args)
                conversation.append({"role": "tool", "content": result})
    # Get final recommendation
    final = ollama.chat(
        model="llama3",
        messages=conversation + [{"role": "user", "content": "Please provide your final travel itinerary recommendation."}]
    )
    return final["message"]["content"]

# Sample usage
if __name__ == "__main__":
    user_query = (
        "I want to plan a 3-day trip to Chicago from New York next week, from June 15-18. "
        "I need flights and hotel recommendations, and I'd like to know what weather to expect."
    )
    itinerary = travel_agent(user_query)
    print("\nTravel Agent Recommendation:")
    print(itinerary)
This travel agent example demonstrates a practical use case where the agent:
- Extracts key trip details from the user’s request
- Decides which services to query and in what order
- Gathers flight options, hotel availability, and weather forecasts
- Synthesizes this information into a complete travel itinerary
The agent makes its own decisions about which tools to call and when, adapting to the specific details in the user’s request. For example, it might check weather first for a beach destination but prioritize flight availability for a business trip.
This example shows how even a relatively simple agent can provide real value by coordinating multiple services to solve a complex, multi-step planning problem.
Let’s look at a couple of real-world applications where agents have proven particularly valuable:
Customer Support Automation
Customer support is a natural fit for agentic systems because:
- Support interactions follow a conversation flow while requiring external information
- Tools can integrate customer data, order history, and knowledge base articles
- Specific actions like issuing refunds can be handled programmatically
- Success is measurable through resolution rates and customer satisfaction
Several companies now use agentic systems that handle routine inquiries autonomously while escalating complex cases to human agents, creating hybrid support models that blend AI efficiency with human expertise.
Software Development Assistance
Coding tasks benefit from agentic approaches because:
- Code solutions are verifiable through automated tests
- Agents can iterate based on compilation errors or test results
- The problem space is well-defined with clear success criteria
- Output quality can be objectively measured
Anthropic’s research on SWE-bench tasks demonstrates how agents can solve real GitHub issues based solely on pull request descriptions.
Human-in-the-loop
You should almost always start with a human-in-the-loop system when you can: have a person verify the agent’s output before it is sent to the user. This catches errors the agent makes, and once the system has proven itself, you can remove the human from the loop and let the agent run on its own.
Human-in-the-loop systems provide safety and quality control by catching errors that are obvious to people but not to the model, and every correction doubles as valuable training data. Human oversight also enables responsible deployment: teams can gradually increase automation as the system proves its reliability in real-world scenarios, creating a sustainable path from supervised to autonomous operation.
The best human-in-the-loop implementations add review points at strategic locations:
- Before Tool Execution: Review high-risk actions before execution
- At Decision Branches: Validate important routing decisions
- Before Final Output: Verify the final response before sending to users
Here’s a high-level overview of a human-in-the-loop system:
def agent_workflow_with_review(user_query):
    # Generate agent response
    agent_response = generate_agent_response(user_query)
    # Human review step
    if needs_review(agent_response):
        return get_human_approval(agent_response)
    # Otherwise return the response directly
    return agent_response
Human-in-the-loop systems should evolve over time through graduated oversight: start with full review of every output, move to selective review of only uncertain or high-risk cases, and finally transition to exception handling and random spot checks.
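As a rough illustration of that progression (the stage names and thresholds here are made up for the example), the needs_review check from the snippet above could loosen as the system earns trust:

import random

def needs_review(agent_response, stage: str = "full", confidence: float = 0.0, high_risk: bool = False) -> bool:
    if stage == "full":
        # Early deployment: a person reviews every response
        return True
    if stage == "selective":
        # Later: only uncertain or high-risk responses are reviewed
        return high_risk or confidence < 0.8
    # Mature deployment: exceptions plus a 5% random spot check
    return high_risk or random.random() < 0.05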
Here are some practical implementation tips:
- Batch Processing: Group similar items for more efficient review
- Review Analytics: Track what gets flagged for review to identify patterns
- Feedback Loops: Ensure reviewer corrections inform system improvements
Most of these will help you sleep better at night.
Conclusion
As we’ve explored in this tutorial, agentic systems represent a powerful evolution in how we can leverage LLMs - moving from simple text generation to intelligent, goal-oriented automation. Whether you choose structured workflows or more autonomous agents depends on your specific use case requirements:
- Use workflows when: You need predictability, reliability, and tight control over execution paths
- Use agents when: You need flexibility to handle diverse or open-ended tasks that benefit from dynamic planning
Remember that successful implementations follow these principles:
- Start with the simplest approach possible
- Make reasoning and planning steps transparent
- Define clear interfaces for all tools
- Include appropriate human oversight
We’ll look at how to implement a complete agentic workflow in the next section.