
Chapter 6: Agent Architecture -- Resources

Curated resources for deeper exploration of topics covered in this chapter.

Frameworks from This Chapter

  • 7 Failure Modes of Agents -- Hallucinated actions, scope creep, context loss, infinite loops, cascading failures, resource exhaustion, and stale data, with mitigations for each.

Tools & Platforms

Agent Frameworks & Orchestration

  • Temporal -- Workflow orchestration engine; used by Replit for agent task management serving 30M users.
  • Claude Code -- Agentic coding tool with plan mode, subagents, and hooks for automated review checkpoints.
  • Anthropic Agent SDK -- SDK for building production AI agents with structured tool calling.
  • AutoMCP -- Automated MCP tool generation; 19 lines of fixes took success rate from 76.5% to 99.9%.
  • Model Context Protocol (MCP) -- 1,000+ connectors for agent-tool integration; standard for chat and background agents.

Chat Agent Platforms

  • Intercom Fin -- AI customer support agent; improved from 25% to 66% resolution rate through iterative design.
  • Klarna AI Assistant -- Handled 2.3M conversations in first month; 700 FTE equivalent; $40M projected annual savings.

Background Agent Infrastructure

  • Kafka -- Event streaming platform; Netflix processes billions of daily events for inter-service communication.
  • RabbitMQ -- Message broker for async agent communication patterns.

Agent Monitoring & Safety

  • Helicone -- LLM observability for tracking agent costs and performance.
  • LangSmith -- Agent tracing and evaluation from LangChain.

Further Reading

Research & Data

Community & Learning

The 2 Agent Types

Characteristic  | Chat Agents                   | Background Agents
Who waits       | Human is waiting              | No one is watching
Speed priority  | Response in seconds           | Throughput over latency
Error handling  | Clarify and retry             | Log, retry, alert
Autonomy        | Human-in-the-loop             | Autonomous with guardrails
Success metric  | Satisfaction, resolution rate | Processing volume, accuracy
Example         | Klarna support, Intercom Fin  | Overnight data processing, report generation
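
The "Error handling" row can be illustrated with a minimal Python sketch. Everything named here (NeedsClarification, run_chat_agent, run_background_agent, the ESCALATE marker) is hypothetical and chosen only for illustration. It is not taken from any framework listed above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional

ESCALATE = "ESCALATE_TO_HUMAN"  # hypothetical marker for handing off to a person


class NeedsClarification(Exception):
    """Hypothetical signal: the request is too ambiguous to act on."""


def run_chat_agent(handle: Callable[[str], str], request: str,
                   ask_user: Callable[[str], str],
                   max_clarifications: int = 2) -> str:
    """Chat pattern: clarify and retry while a human is waiting."""
    for attempt in range(max_clarifications + 1):
        try:
            return handle(request)
        except NeedsClarification as question:
            if attempt == max_clarifications:
                break  # stop asking; hand off instead
            # Append the user's answer to the request and try again.
            request = f"{request} {ask_user(str(question))}"
    return ESCALATE


@dataclass
class BackgroundResult:
    ok: bool = False
    value: Any = None
    log: List[str] = field(default_factory=list)
    alerted: bool = False


def run_background_agent(task: Callable[[], Any], max_retries: int = 3,
                         alert: Optional[Callable[[List[str]], None]] = None
                         ) -> BackgroundResult:
    """Background pattern: log, retry, alert, since no one is watching."""
    result = BackgroundResult()
    for attempt in range(1, max_retries + 1):
        try:
            result.value = task()
            result.ok = True
            return result
        except Exception as exc:
            result.log.append(f"attempt {attempt} failed: {exc}")
    # All retries exhausted: raise an alert rather than failing silently.
    result.alerted = True
    if alert is not None:
        alert(result.log)
    return result
```

The chat path spends its retries on getting better input from the person who is waiting. The background path spends them on the task itself and surfaces the failure only after they run out.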

The 5-Question Agent Decision Framework

Before building an agent, ask:

  1. Does the task require reasoning (not just rules)?
  2. Does it vary significantly across instances?
  3. Does it scale to justify the overhead?
  4. Can it tolerate occasional errors?
  5. Does it follow a stable process?

If you answer "no" to any of these, consider alternatives to agents.
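
One way to make the checklist explicit is to encode it as a small data structure. This is a sketch only: the class and field names are assumptions that paraphrase the five questions, not part of the chapter.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class AgentFitAnswers:
    """Yes/no answers to the five questions (field names are illustrative)."""
    requires_reasoning: bool
    varies_across_instances: bool
    scale_justifies_overhead: bool
    tolerates_occasional_errors: bool
    follows_stable_process: bool


def failed_questions(answers: AgentFitAnswers) -> List[str]:
    """Return the names of the questions answered "no"."""
    return [f.name for f in fields(answers) if not getattr(answers, f.name)]


def should_build_agent(answers: AgentFitAnswers) -> bool:
    """An agent is a candidate only when every answer is "yes"."""
    return not failed_questions(answers)
```

Returning the failed questions, rather than a bare boolean, shows which property of the task points toward an alternative such as a rules engine or a plain script.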

Key Incidents Referenced

Incident                        | Company    | Failure Mode         | Lesson
Hallucinated bereavement policy | Air Canada | Hallucinated Actions | Companies liable for AI agent statements
$1 Tahoe sale                   | Chevrolet  | Scope Creep          | Agents need bounded authority
4,000 fake records + deleted DB | Replit     | Cascading Failures   | Background agents need dead man's switches
Over-automation reversal        | Klarna     | Wrong agent type     | Chat tasks need human nuance; one agent doesn't fit all
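
Two of the lessons above, bounded authority and dead man's switches, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how any of these companies responded. The allowlist contents, the class and function names, and the idea that an external supervisor performs the check-in are all hypothetical.

```python
import time
from typing import Callable, FrozenSet, Iterable, List

# Hypothetical allowlist: the only actions this agent may take.
ALLOWED_ACTIONS: FrozenSet[str] = frozenset({"answer_question", "quote_list_price"})


def authorize(action: str, allowed: FrozenSet[str] = ALLOWED_ACTIONS) -> str:
    """Bounded authority: reject any action outside the allowlist."""
    if action not in allowed:
        raise PermissionError(f"action {action!r} is outside the agent's authority")
    return action


class DeadMansSwitch:
    """Trips when no check-in has arrived within timeout_seconds.

    Assumption: check_in() is called by an external supervisor
    (a human or a monitor), not by the agent itself.
    """

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_check_in = clock()

    def check_in(self) -> None:
        self._last_check_in = self._clock()

    def tripped(self) -> bool:
        return self._clock() - self._last_check_in > self._timeout


def run_steps(steps: Iterable[Callable[[], object]],
              switch: DeadMansSwitch) -> List[object]:
    """Run steps in order, halting before any step once the switch has tripped."""
    completed = []
    for step in steps:
        if switch.tripped():
            break
        completed.append(step())
    return completed
```

The clock is injected so the switch can be tested without real waiting. The switch is checked before each step, so a lapse in supervision stops further work but does not undo steps already completed.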