Agent safety · Checklist · 6 min

Add guardrails to multi-step agents

How to constrain agent behavior around data access, tool use, user intent, and final actions.

Real example

Example: prevent an agent from sending an unapproved proposal email

A multi-step agent drafts a proposal response and has access to email tooling. Sending the draft without review would create business risk.

Classify the email tool as external-send and require human approval. The agent can prepare the draft and checklist, but the send action remains disabled until a reviewer approves.

The agent accelerates work without crossing the boundary from recommendation to irreversible action.
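
A minimal sketch of that gate in Python, under stated assumptions: the names ToolClass, TOOL_CLASSES, Draft, ApprovalRequired, and send_email are hypothetical and do not come from any real email library. The point it illustrates is that the check sits beside the side effect, so the send cannot happen until a reviewer has been recorded on the draft.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ToolClass(Enum):
    INTERNAL = "internal"
    EXTERNAL_SEND = "external-send"


# Hypothetical registry: each tool is classified once, in code.
TOOL_CLASSES = {
    "draft_email": ToolClass.INTERNAL,
    "send_email": ToolClass.EXTERNAL_SEND,
}


@dataclass
class Draft:
    draft_id: str
    body: str
    # Assumed to be set only by a human reviewer action, never by the agent.
    approved_by: Optional[str] = None


class ApprovalRequired(Exception):
    """Raised when a gated tool is called without reviewer approval."""


def send_email(draft: Draft, outbox: List[str]) -> None:
    """Append the draft to the outbox only if a reviewer has approved it."""
    gated = TOOL_CLASSES["send_email"] is ToolClass.EXTERNAL_SEND
    if gated and draft.approved_by is None:
        raise ApprovalRequired(
            f"Draft {draft.draft_id} needs reviewer approval before sending."
        )
    # The outbox list stands in for a real mail transport in this sketch.
    outbox.append(draft.body)
```

The agent can still create and revise Draft objects freely. Only the final call is blocked, and the exception message tells the caller what is missing.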

Tutorial path

How to implement it

Step 01
Classify each incoming request as allowed, needs clarification, restricted, or unsafe.
Step 02
Validate every tool call against user permissions and workflow state.
Step 03
Require approval before destructive, financial, external-send, or production-write actions.
Step 04
Validate final outputs for format, evidence, and disallowed claims.
Step 05
Route blocked work to a human path with enough context to continue.
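
The steps above can be sketched as a single gate that runs before any tool call. This is an illustration under assumptions, not a reference implementation: Verdict, ToolCall, Session, Handoff, and gate are hypothetical names, the request classifier that produces the verdict is assumed to exist elsewhere, and output validation (Step 04) is left out of this block.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Set


class Verdict(Enum):  # Step 01: the four request classes
    ALLOWED = "allowed"
    NEEDS_CLARIFICATION = "needs clarification"
    RESTRICTED = "restricted"
    UNSAFE = "unsafe"


# Step 03: action classes that always need approval.
APPROVAL_REQUIRED = {"destructive", "financial", "external-send", "production-write"}


@dataclass
class ToolCall:
    tool: str
    action_class: str    # e.g. "read" or "external-send"
    valid_in_state: str  # workflow state in which this call makes sense


@dataclass
class Session:
    permitted_tools: Set[str]
    workflow_state: str
    approved_tools: Set[str] = field(default_factory=set)


@dataclass
class Handoff:  # Step 05: what a human needs in order to continue
    reason: str
    context: Dict[str, str]


def gate(verdict: Verdict, call: ToolCall, session: Session) -> Optional[Handoff]:
    """Return None when the call may run, otherwise a Handoff for a human."""
    if verdict is not Verdict.ALLOWED:
        reason = f"request classified as {verdict.value}"
    elif call.tool not in session.permitted_tools:  # Step 02: permissions
        reason = "user is not permitted to use this tool"
    elif call.valid_in_state != session.workflow_state:  # Step 02: workflow state
        reason = f"tool is only valid in state '{call.valid_in_state}'"
    elif (call.action_class in APPROVAL_REQUIRED
          and call.tool not in session.approved_tools):  # Step 03
        reason = f"{call.action_class} action needs approval"
    else:
        return None
    context = {
        "tool": call.tool,
        "action_class": call.action_class,
        "workflow_state": session.workflow_state,
    }
    return Handoff(reason, context)
```

Returning a Handoff instead of raising keeps the blocked work visible: the caller can queue it for a reviewer with the reason and context attached.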
Checklist

Ready when these are true

Input classification
Tool authorization
Approval gates
Output validation
Human fallback path
Field notes

What matters in practice

01
Guardrails are most useful at boundaries: input, tool call, output, and final action.
02
The system should reject unsafe actions before a tool executes, not after.
03
A good fallback is explicit and useful, not a generic apology.
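
At the output boundary, one way to combine these notes is to have validation return specific problems and build the fallback message from them. The sketch below assumes the agent emits JSON with a summary and a list of claims, each carrying a source id. That schema, the REQUIRED_FIELDS and DISALLOWED_PHRASES values, and both function names are hypothetical.

```python
import json
from typing import List, Set

# Hypothetical policy values for illustration only.
REQUIRED_FIELDS = ("summary", "claims")
DISALLOWED_PHRASES = ("guaranteed", "risk-free")


def validate_output(raw: str, known_sources: Set[str]) -> List[str]:
    """Return a list of specific problems; an empty list means the output passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["format: output is not valid JSON"]
    if not isinstance(data, dict):
        return ["format: output must be a JSON object"]

    problems = []
    for name in REQUIRED_FIELDS:
        if name not in data:
            problems.append(f"format: missing field '{name}'")

    claims = data.get("claims", [])
    if not isinstance(claims, list):
        return problems + ["format: 'claims' must be a list"]
    for i, claim in enumerate(claims):
        if not isinstance(claim, dict):
            problems.append(f"format: claim {i} must be an object")
            continue
        if claim.get("source") not in known_sources:
            problems.append(f"evidence: claim {i} cites no known source")
        text = str(claim.get("text", "")).lower()
        for phrase in DISALLOWED_PHRASES:
            if phrase in text:
                problems.append(f"disallowed claim: claim {i} uses '{phrase}'")
    return problems


def fallback_message(problems: List[str]) -> str:
    """Turn validation problems into an explicit message with a next step."""
    lines = ["This draft was held for review because:"]
    lines += [f"- {p}" for p in problems]
    lines.append("Next step: fix the items above, then resubmit for review.")
    return "\n".join(lines)
```

Because each problem names the boundary it failed (format, evidence, or disallowed claim), the reviewer sees what to fix rather than a generic refusal.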
Avoid these mistakes

Common failure modes

01
Do not put approval rules only in prompt text.
02
Do not give agents destructive tools without server-side gates.
03
Do not make blocked actions disappear; explain the safe next step.
Practical tip
Guardrails belong in code and product state. Prompts can describe them, but code must enforce them.
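
One possible way to keep the prompt and the enforcement from drifting apart is to derive both from the same policy table. In this sketch the POLICY contents, tool names, and function names are all hypothetical. The prompt text is generated for the model's benefit, while is_permitted is the check the server would actually run.

```python
from typing import Dict

# Hypothetical policy table: the single source of truth for which tools are gated.
POLICY: Dict[str, Dict[str, bool]] = {
    "search_docs": {"requires_approval": False},
    "send_email": {"requires_approval": True},
    "delete_record": {"requires_approval": True},
}


def describe_policy() -> str:
    """Text for the system prompt. It informs the model but enforces nothing."""
    gated = sorted(name for name, rule in POLICY.items() if rule["requires_approval"])
    return "These tools need human approval before they run: " + ", ".join(gated) + "."


def is_permitted(tool: str, approved: bool) -> bool:
    """Server-side check run before every tool call. Unknown tools are denied."""
    rule = POLICY.get(tool)
    if rule is None:
        return False
    return approved or not rule["requires_approval"]
```

If the model ignores the prompt text, the server-side check still blocks the call, and a tool missing from the table is denied by default.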