# Checklist: Agent Design
Before building an AI agent -- design decisions, failure prevention, and operational readiness.
Use this checklist before writing the first line of agent code. Agents fail differently from traditional software because they reason probabilistically and act autonomously. The seven failure modes documented here are drawn from real incidents, not hypotheticals. Working through each section forces you to design the failure path before the happy path.
Derived from the 7 Failure Modes of Agents framework -- Chapter 6.
## Agent Type Selection
- Determined whether you are building a chat agent (user-facing, conversational) or a background agent (autonomous, unsupervised)
- Understood the different risk profiles: chat agents are more susceptible to hallucinated actions, scope creep, and context loss; background agents are more prone to infinite loops, cascading failures, resource exhaustion, and stale data
- Defined the agent's autonomy level using the permission spectrum: what can it do without approval, what requires confirmation, what is never allowed -- see the sketch after this list
- Established "done when" criteria -- the agent has a clear definition of task completion, not an open-ended mandate
- Determined whether the agent needs to interact with other agents, and if so, designed the communication pattern (hub-and-spoke with circuit breakers, not peer-to-peer)
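One way to make the permission spectrum concrete is a default-deny mapping from actions to autonomy levels. A minimal sketch in Python -- the `Permission` enum and the action names are illustrative, not prescribed by the framework:

```python
from enum import Enum

class Permission(Enum):
    AUTONOMOUS = "autonomous"  # act without approval
    CONFIRM = "confirm"        # requires explicit user confirmation
    FORBIDDEN = "forbidden"    # never allowed

# Hypothetical action-to-permission mapping for a support agent
PERMISSION_SPECTRUM = {
    "search_knowledge_base": Permission.AUTONOMOUS,
    "send_email": Permission.CONFIRM,
    "delete_customer_record": Permission.FORBIDDEN,
}

def permission_for(action: str) -> Permission:
    # Default-deny: anything unlisted is forbidden
    return PERMISSION_SPECTRUM.get(action, Permission.FORBIDDEN)
```

The default-deny fallback matters as much as the mapping itself: an action the designer never considered should land in the most restrictive tier, not the most permissive one.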
## Tool Design
- Created an allowlist of valid tools, APIs, and functions the agent can call -- anything not on the list is rejected by default
- Built a tool registry that validates all tool calls before execution; unknown tools fail fast with clear error messages (sketched after this list)
- Defined explicit input/output schemas for each tool so the agent can't pass malformed or unexpected parameters
- Tiered tool permissions by risk level: read-only operations run freely; write operations require confirmation; destructive operations (DELETE, DROP) are never permitted in production
- Ensured tools return structured responses with explicit error codes that the agent can reason about, not ambiguous messages
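A minimal registry sketch tying together the allowlist, schema validation, and structured error codes from the items above. The `ToolSpec` shape and the error codes are assumptions for illustration, not a specific library's API:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class ToolSpec:
    name: str
    input_schema: dict[str, type]  # parameter name -> expected type
    handler: Callable[..., dict]   # must return a structured response

class ToolRegistry:
    """Allowlist of callable tools; anything unregistered fails fast."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        self._tools[spec.name] = spec

    def call(self, name: str, **params: Any) -> dict:
        spec = self._tools.get(name)
        if spec is None:
            # Unknown tool: explicit error code, never a silent guess
            return {"error_code": "UNKNOWN_TOOL", "detail": f"{name!r} is not registered"}
        for param, expected in spec.input_schema.items():
            if not isinstance(params.get(param), expected):
                return {"error_code": "BAD_PARAMS", "detail": f"{param!r} missing or wrong type"}
        return spec.handler(**params)
```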
## Context Management
- Set a context degradation threshold: for chat agents, context summarization triggers every 10 turns to prevent drift -- see the sketch after this list
- Built explicit state checkpointing for background agents running multi-step or multi-day workflows
- Provided a mechanism for users to trigger "remind yourself what we discussed" to recover lost context
- Defined maximum conversation or task length -- context degrades meaningfully after 30-50 messages without active management
- Designed context storage so that critical information (user intent, constraints, prior decisions) is preserved outside the conversation window
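A sketch of the 10-turn summarization trigger, assuming a `summarize_fn` (for example, an LLM call) that folds recent turns into a running summary. The pinned store keeps intent, constraints, and prior decisions outside the rolling window:

```python
SUMMARIZE_EVERY_N_TURNS = 10  # the checklist's degradation threshold

class ConversationContext:
    def __init__(self, summarize_fn) -> None:
        self.turns: list[str] = []
        self.summary: str = ""  # survives outside the conversation window
        self.pinned: dict[str, str] = {}  # user intent, constraints, decisions
        self._summarize = summarize_fn

    def add_turn(self, message: str) -> None:
        self.turns.append(message)
        if len(self.turns) % SUMMARIZE_EVERY_N_TURNS == 0:
            # Fold the last 10 turns into the running summary to prevent drift
            recent = self.turns[-SUMMARIZE_EVERY_N_TURNS:]
            self.summary = self._summarize(self.summary, recent)
```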
## Failure Prevention
Work through each of the seven failure modes and confirm mitigations are in place.
### Hallucinated Actions
- All tool calls are validated against the tool registry before execution -- the agent can't call APIs, functions, or tools that don't exist
- Responses that reference policies, prices, or commitments are grounded in source documents, not generated from the model's training data
- Understood the liability precedent: your company is legally responsible for what its agents say (Air Canada, February 2024)
### Infinite Loops
- Set a maximum iteration count (default: 10) for any looping or retry behavior (see the loop-guard sketch after this list)
- Set a maximum timeout (default: 5 minutes) for any single task or subtask
- Built alerting that triggers after 3 iterations without progress -- not just after the limit is hit
- Retry logic includes exponential backoff, not blind retries at full speed
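The iteration cap, timeout, early stall alert, and backoff can live in a single loop guard. A sketch, assuming `step_fn` returns an object with `done` and `progress` attributes (both names are illustrative):

```python
import time

MAX_ITERATIONS = 10    # checklist default
TASK_TIMEOUT_S = 300   # checklist default: 5 minutes
STALL_ALERT_AFTER = 3  # alert before the hard limit, not after

def run_with_guards(step_fn, alert_fn):
    start = time.monotonic()
    last_progress, stalled = None, 0
    for attempt in range(MAX_ITERATIONS):
        if time.monotonic() - start > TASK_TIMEOUT_S:
            raise TimeoutError("task exceeded its 5-minute budget")
        result = step_fn()
        if result.done:
            return result
        if result.progress == last_progress:
            stalled += 1
            if stalled == STALL_ALERT_AFTER:
                alert_fn("no progress after 3 iterations")
        else:
            last_progress, stalled = result.progress, 0
        time.sleep(min(2 ** attempt, 30))  # exponential backoff, capped
    raise RuntimeError("iteration limit reached without completion")
```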
### Scope Creep
- Each task has explicit "done when" criteria that the agent checks against (sketched after this list)
- Out-of-scope actions require explicit user confirmation before execution
- Permissions are tiered by task type -- the agent can't escalate its own authority
- Tested with adversarial prompts to confirm the agent doesn't agree to requests outside its mandate (Chevrolet dealership incident, December 2023)
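"Done when" criteria are most useful when they are machine-checkable: completion is a conjunction of explicit predicates over task state, not the model's own judgment. A sketch; the refund example and field names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    goal: str
    done_when: list  # explicit predicates over task state
    state: dict = field(default_factory=dict)

    def is_done(self) -> bool:
        # Completion is checked against criteria, not self-declared
        return all(check(self.state) for check in self.done_when)

# Hypothetical usage: a refund task that is done only when both hold
task = Task(
    goal="process refund",
    done_when=[
        lambda s: s.get("refund_issued") is True,
        lambda s: s.get("customer_notified") is True,
    ],
)
```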
### Context Loss
- Context summarization runs automatically at defined intervals (every 10 turns for chat agents)
- Background agents checkpoint state explicitly at each major step -- see the sketch after this list
- Contradiction detection is in place: the agent flags when its current response conflicts with earlier statements
- Long-running tasks are broken into discrete stages with state persisted between stages
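Checkpointing can be as simple as persisting stage and state after each major step, so a restarted agent resumes instead of repeating work. A file-based sketch; the checkpoint location and JSON layout are illustrative:

```python
import json
import pathlib

CHECKPOINT_DIR = pathlib.Path("checkpoints")  # illustrative location

def checkpoint(task_id: str, stage: str, state: dict) -> None:
    """Persist state after each major step of a long-running workflow."""
    CHECKPOINT_DIR.mkdir(exist_ok=True)
    path = CHECKPOINT_DIR / f"{task_id}.json"
    path.write_text(json.dumps({"stage": stage, "state": state}))

def resume(task_id: str) -> dict | None:
    # On restart, pick up from the last persisted stage (None = start fresh)
    path = CHECKPOINT_DIR / f"{task_id}.json"
    return json.loads(path.read_text()) if path.exists() else None
```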
### Cascading Failures
- Agents are isolated by default -- one agent's failure can't directly trigger failures in dependent agents or systems
- Inter-agent communication routes through a hub with circuit breakers: if one agent fails 3 times, it is isolated until manually reviewed (sketched after this list)
- No agent has DELETE or DROP TABLE permissions in production databases
- Understood the worst case: an agent with production database access and no isolation caused complete deletion of a production database after 9 days of erratic behavior (Replit incident, July 2025)
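A hub-side circuit breaker implementing the 3-failure isolation rule above. A minimal sketch; a production hub would also need to queue or reroute traffic addressed to isolated agents:

```python
FAILURE_THRESHOLD = 3  # checklist value: isolate after 3 failures

class CircuitBreaker:
    """Hub-side breaker: a failing agent stays isolated until manually reset."""

    def __init__(self) -> None:
        self.failures: dict[str, int] = {}
        self.isolated: set[str] = set()

    def allow(self, agent_id: str) -> bool:
        return agent_id not in self.isolated

    def record_failure(self, agent_id: str) -> None:
        self.failures[agent_id] = self.failures.get(agent_id, 0) + 1
        if self.failures[agent_id] >= FAILURE_THRESHOLD:
            self.isolated.add(agent_id)

    def manual_reset(self, agent_id: str) -> None:
        # Only a human operator clears isolation, per the checklist
        self.isolated.discard(agent_id)
        self.failures[agent_id] = 0
```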
### Resource Exhaustion
- Assigned a token budget per task with hard limits
- Built alerting at 80% of budget consumption -- before the limit is reached, not after (see the sketch after this list)
- Tasks exceeding limits are terminated with a clear explanation to the user or operator
- Cost monitoring infrastructure is in place before the agent is deployed -- not planned for later (73% of teams lack real-time cost tracking; enterprise overruns average 340%)
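The token budget with an 80% early alert can be a small wrapper around every model call. A sketch; how tokens are counted and where alerts are delivered are left open:

```python
ALERT_FRACTION = 0.8  # warn at 80% of budget, before the limit is reached

class TokenBudget:
    def __init__(self, limit: int, alert_fn) -> None:
        self.limit, self.used = limit, 0
        self._alert, self._alerted = alert_fn, False

    def spend(self, tokens: int) -> None:
        self.used += tokens
        if not self._alerted and self.used >= ALERT_FRACTION * self.limit:
            self._alert(f"token budget at {self.used}/{self.limit}")
            self._alerted = True
        if self.used > self.limit:
            # Terminate with a clear explanation, per the checklist
            raise RuntimeError(
                f"task terminated: token budget exhausted ({self.used}/{self.limit})"
            )
```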
### Stale Data
- Defined freshness requirements for every data source the agent depends on
- The agent checks data age before acting on any external data -- see the sketch after this list
- Refresh intervals are set for all data sources and enforced automatically
- Timestamp checking and inconsistency monitoring are active -- the agent flags when data sources contradict each other or appear outdated
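Freshness checking reduces to comparing data age against a per-source maximum, with unknown sources treated as stale by default. A sketch; the source names and intervals are illustrative:

```python
from datetime import datetime, timedelta, timezone

# Illustrative per-source freshness requirements
FRESHNESS = {
    "pricing_feed": timedelta(minutes=15),
    "inventory_db": timedelta(hours=1),
}

def is_fresh(source: str, last_updated: datetime) -> bool:
    """Check data age before acting; last_updated must be timezone-aware."""
    max_age = FRESHNESS.get(source)
    if max_age is None:
        return False  # no defined requirement: treat as stale, default-deny
    return datetime.now(timezone.utc) - last_updated <= max_age
```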
## Testing and Monitoring
- Built the four-layer resilience framework: Detection (dashboards and alerts for each failure mode), Prevention (guardrails in agent design), Recovery (graceful degradation and human handoff), Learning (post-incident analysis)
- Started with Detection first -- you can't prevent what you can't see
- Designed the failure path before the happy path: every agent has a graceful degradation plan and a human handoff trigger
- Tested the agent with adversarial inputs, not just golden-path scenarios
- Established a post-incident analysis process where every agent failure becomes a training example and a new test case
- Defined escalation criteria: what conditions trigger automatic human handoff vs. alert-only vs. agent self-recovery (sketched after this list)
- Monitoring covers all seven failure modes with distinct dashboards or alert channels -- not a single generic "agent health" metric
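Escalation criteria work best as an explicit policy function rather than something the agent decides for itself. A sketch of one possible policy; the thresholds and failure-mode labels are assumptions, not the framework's:

```python
from enum import Enum

class Escalation(Enum):
    SELF_RECOVER = "self_recover"
    ALERT_ONLY = "alert_only"
    HUMAN_HANDOFF = "human_handoff"

def escalation_for(failure_mode: str, attempts: int) -> Escalation:
    # Illustrative policy: high-blast-radius or repeated failures go to a human
    if failure_mode in {"cascading_failure", "hallucinated_action"}:
        return Escalation.HUMAN_HANDOFF
    if attempts >= 3:
        return Escalation.HUMAN_HANDOFF
    if failure_mode in {"resource_exhaustion", "stale_data"}:
        return Escalation.ALERT_ONLY
    return Escalation.SELF_RECOVER
```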
Next step: For each unchecked item, determine whether it is a blocker (must be resolved before deployment) or a follow-up (can be addressed in the first iteration cycle). No agent should reach production without the Failure Prevention section fully checked.