Building a Verifiable AI Agent with the ReAct Framework
How to build an AI agent that does not just fire and forget — using the Check-Action-Verify loop, ReAct reasoning pattern, and paired action/verification tools to ensure every operation is confirmed.
In Part 1, we covered how LLMs use function calling to return structured data and orchestrate multiple tools. But there's a problem with the basic pattern: the agent calls a tool, gets a "success" response, and moves on. What if the tool said it succeeded but the database wasn't actually updated? What if the API returned 200 but the record is stale?
This article introduces the Verification Loop — a pattern where the agent confirms every action before reporting success to the user.
The Problem: Fire and Forget
Here's the typical AI agent flow in most tutorials:
User: "Update my shipping address to 742 Evergreen Terrace"
│
▼
LLM: Call update_address(user_id=123, new_address="742 Evergreen Terrace")
│
▼
Tool returns: {"status": "success"}
│
▼
LLM: "Done! Your address has been updated."
Looks fine. But what could go wrong?
- The database transaction rolled back silently
- A concurrent write overwrote the address milliseconds later
- The tool returned "success" but wrote to a staging table, not production
- The API returned 200 but the downstream service was down
The agent has no idea if the change actually persisted. It trusted a single "success" string and told the user everything was fine.
The Solution: Check-Action-Verify
The Verification Loop adds one critical step: read back the data after writing it.
```
     ┌────────────────────────────────┐
     │                                │
     ▼                                │
┌─────────┐    ┌─────────┐    ┌────────────────┐
│ Analyze │───→│   Act   │───→│    Verify      │
│ (Read)  │    │ (Write) │    │  (Read back)   │
└─────────┘    └─────────┘    └────────────────┘
                                  │        │
                               Match?  Mismatch?
                                  │        │
                                  ▼        ▼
                              Continue  Retry / Report
```
The agent operates in a cycle:
- Analyze — Understand what the user wants
- Act — Call a tool to perform the action (e.g., update_order_status)
- Verify — Call a separate tool to confirm the change (e.g., get_order_details)
- Observe — If verification fails, loop back to retry or report the specific error
Two Types of Tools
To support this pattern, we define tools in pairs — one that writes, one that reads:
| Tool Name | Type | Purpose |
|---|---|---|
| update_record | Action (Write) | Changes the status of a ticket or order |
| fetch_record | Verification (Read) | Checks the current status to confirm the update took effect |
| send_email | Action (Write) | Sends a reply or notification email |
| check_sent_status | Verification (Read) | Confirms the email appears in the sent log |
The key insight: Action tools change state. Verification tools read state. They always come in pairs.
Both are registered as tools the LLM can call. The LLM learns to use them in sequence through its system prompt and the ReAct reasoning pattern.
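As a sketch, here is how the update_record / fetch_record pair from the table above might be registered in OpenAI-style function-calling format. The parameter names are illustrative assumptions, not a fixed schema:

```python
# Action/verification tool pair in OpenAI-style tool schema format.
# Field names (record_id, status) are hypothetical, for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": "update_record",
            "description": "Action tool: change the status of a ticket or order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "record_id": {"type": "integer"},
                    "status": {"type": "string"},
                },
                "required": ["record_id", "status"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "fetch_record",
            "description": "Verification tool: read a record back to confirm an update took effect.",
            "parameters": {
                "type": "object",
                "properties": {
                    "record_id": {"type": "integer"},
                },
                "required": ["record_id"],
            },
        },
    },
]
```

Note that nothing in the schema enforces the pairing; the descriptions and the system prompt are what teach the LLM to call fetch_record after update_record.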
The ReAct Pattern: How the LLM Thinks Through It
ReAct stands for Reason + Act. Instead of just calling tools, the LLM explicitly reasons about what it's doing at each step. In the message history, this looks like:
With Verification (correct)
Thought: "The user wants to update their address. I'll call the update tool."
Action: Call update_address(user_id=123, new_address="742 Evergreen Terrace")
Observation: Tool returns {"status": "success"}
Thought: "The tool said success, but I should verify the change was saved."
Action: Call get_user_profile(user_id=123)
Observation: {"address": "742 Evergreen Terrace", "updated_at": "2026-02-18T10:30:00"}
Thought: "The address matches my update. Verified."
Final Answer: "I've updated and verified your address to 742 Evergreen Terrace."
Without Verification (risky)
Thought: "The user wants to update their address."
Action: Call update_address(user_id=123, new_address="742 Evergreen Terrace")
Observation: Tool returns {"status": "success"}
Final Answer: "Done! Your address has been updated."
# But was it really? The agent doesn't know.
When Verification Fails
Thought: "The user wants to update their address."
Action: Call update_address(user_id=123, new_address="742 Evergreen Terrace")
Observation: Tool returns {"status": "success"}
Thought: "Let me verify the change."
Action: Call get_user_profile(user_id=123)
Observation: {"address": "123 Old Street", "updated_at": "2026-02-17T08:00:00"}
Thought: "The address didn't change — still shows the old value. The update
may have been rolled back. Let me try again."
Action: Call update_address(user_id=123, new_address="742 Evergreen Terrace")
Observation: Tool returns {"status": "success"}
Thought: "Verify again."
Action: Call get_user_profile(user_id=123)
Observation: {"address": "742 Evergreen Terrace", "updated_at": "2026-02-18T10:31:00"}
Thought: "Confirmed on second attempt."
Final Answer: "Your address has been updated to 742 Evergreen Terrace.
(Note: the first attempt didn't persist — it took a retry.)"
The agent catches the failure, retries, and is transparent with the user about what happened.
Implementation: Action + Verification Tool Pair
Here's a concrete example — sending a notification email and verifying it was delivered:
```python
# === Action Tool — changes state ===
def send_notification(user_id: int, subject: str, body: str) -> dict:
    """Send an email notification to the user."""
    result = email_service.send(
        to=get_user_email(user_id),
        subject=subject,
        body=body,
    )
    return {"status": "sent", "message_id": result.message_id}


# === Verification Tool — reads state ===
def verify_notification_sent(message_id: str) -> dict:
    """Check that a notification was actually delivered."""
    record = db.query(
        "SELECT status, sent_at FROM outgoing_emails WHERE message_id = %s",
        (message_id,),
    )
    if record and record["status"] == "delivered":
        return {
            "verified": True,
            "status": record["status"],
            "sent_at": record["sent_at"].isoformat(),
        }
    return {
        "verified": False,
        "status": record["status"] if record else "not_found",
    }
```
The agent calls send_notification, gets a message_id, then calls verify_notification_sent with that ID. Only when verified: True does the agent tell the user the task is complete.
What the Verification Checks For
Different actions need different verification signals:
| Action | What Verification Checks | Failure Means |
|---|---|---|
| Update database record | Read the record back, compare fields | Transaction rolled back or concurrent overwrite |
| Send email | Check "delivered" status in outgoing log | SMTP failure, bounced, or queued but not sent |
| Create ticket | Fetch ticket by ID, confirm it exists | API rate limit, auth failure, or validation error |
| Publish to queue | Check message exists in stream | Redis connection dropped, message not persisted |
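For the "update database record" row, the comparison step can be factored into a small helper. This is a sketch, assuming the read tool returns the record as a dict; `fields_match` is a hypothetical name:

```python
def fields_match(expected: dict, actual: dict) -> dict:
    """Compare the fields we wrote against the record we read back.

    Returns a verification result in the same shape the other
    verification tools use: {"verified": bool, "mismatches": {...}}.
    """
    mismatches = {
        field: {"expected": value, "actual": actual.get(field)}
        for field, value in expected.items()
        if actual.get(field) != value
    }
    return {"verified": not mismatches, "mismatches": mismatches}
```

Reporting per-field mismatches (rather than a bare False) gives the LLM something concrete to reason about on the next loop iteration, such as whether a concurrent write changed only one field.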
The Verification Loop in Code
Here's how to integrate the pattern into an agent's action dispatch:
```python
async def execute_with_verification(self, action, email, result):
    """Execute an action and verify it completed successfully."""
    max_retries = 2

    for attempt in range(max_retries + 1):
        # Act
        action_result = await action.execute(email, result)

        # Verify (if the action supports verification)
        if hasattr(action, 'verify'):
            verification = await action.verify(action_result)

            if verification.get("verified"):
                logger.info("Action %s verified on attempt %d",
                            type(action).__name__, attempt + 1)
                return action_result
            else:
                logger.warning(
                    "Verification failed for %s (attempt %d): %s",
                    type(action).__name__, attempt + 1,
                    verification.get("status"),
                )
                if attempt < max_retries:
                    continue  # retry the action
                else:
                    raise VerificationError(
                        f"Action {type(action).__name__} failed verification "
                        f"after {max_retries + 1} attempts"
                    )
        else:
            # No verification method — trust the action result
            return action_result
```
This gives you:
- Automatic retries when verification fails
- Clear error reporting when all retries are exhausted
- Backward compatibility — actions without a `verify` method still work normally
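For concreteness, here is a minimal action exposing the `execute`/`verify` pair the dispatcher expects, along with a definition for the `VerificationError` it raises. The `FakeEmailService` is an in-memory stand-in invented for this sketch; in production, `verify` would query a real delivery log:

```python
import asyncio  # used by callers to drive the async methods


class VerificationError(Exception):
    """Raised when an action fails verification after all retries."""


class FakeEmailService:
    """In-memory stand-in for a real email backend (illustration only)."""

    def __init__(self):
        self.sent = {}
        self._next_id = 0

    def send(self, to: str) -> str:
        self._next_id += 1
        message_id = f"msg-{self._next_id}"
        self.sent[message_id] = "delivered"
        return message_id

    def was_delivered(self, message_id: str) -> bool:
        return self.sent.get(message_id) == "delivered"


class SendReplyAction:
    """Minimal action exposing the execute/verify pair the dispatcher expects."""

    def __init__(self, service: FakeEmailService):
        self.service = service

    async def execute(self, email: str, result) -> dict:
        return {"status": "sent", "message_id": self.service.send(email)}

    async def verify(self, action_result: dict) -> dict:
        ok = self.service.was_delivered(action_result["message_id"])
        return {"verified": ok, "status": "delivered" if ok else "pending"}
```

A caller could then run `await self.execute_with_verification(SendReplyAction(service), email, result)` and rely on the retry and error-reporting behavior shown above.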
Wiring It Into the Agent Loop
```python
import json


class VerifiableAgent:
    def __init__(self, tools, tool_handlers, llm_client):
        self.tools = tools                  # tool schemas sent to the LLM
        self.tool_handlers = tool_handlers  # maps tool name -> async callable
        self.client = llm_client
        self.messages = []

    async def run(self, user_message: str) -> str:
        self.messages.append({"role": "user", "content": user_message})

        while True:
            response = self.client.chat.completions.create(
                model="gpt-4o",
                messages=self.messages,
                tools=self.tools,
            )
            message = response.choices[0].message
            self.messages.append(message)

            # If no tool calls, return the text response
            if not message.tool_calls:
                return message.content

            # Execute each tool call
            for tc in message.tool_calls:
                handler = self.tool_handlers[tc.function.name]
                args = json.loads(tc.function.arguments)
                result = await handler(**args)
                self.messages.append({
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": json.dumps(result),
                })

            # Loop back — LLM will see tool results and decide:
            # verify, retry, or generate final answer
```
The key is the while True loop. After executing tools, control goes back to the LLM. It sees the tool results and decides whether to:
- Call a verification tool (the loop continues)
- Call the action tool again (retry)
- Generate a final text answer (the loop exits)
The LLM's reasoning drives the verification logic. You don't hardcode "verify after every action" — the LLM learns this behavior from its system prompt.
System Prompt: Teaching the Agent to Verify
The ReAct behavior is guided by the system prompt:
system_prompt = """You are a helpful assistant that can update user records
and send notifications. You have access to action tools and verification tools.
IMPORTANT: After every action that changes data, you MUST verify the change
by calling the corresponding read tool. Do NOT tell the user an action
succeeded until you have verified it.
If verification fails:
1. Retry the action (up to 2 times)
2. If it still fails, tell the user exactly what went wrong
Always follow this pattern:
- Thought: What do I need to do?
- Action: Call the write tool
- Observation: Check the result
- Thought: I should verify this
- Action: Call the read tool
- Observation: Does it match?
- Final Answer: Only after verification passes
"""
Why This Matters
Without verification, your agent is essentially saying "I called the function and it didn't throw an error, so it must have worked." That's like mailing a letter and assuming it arrived because the mailbox didn't reject it.
The Verification Loop turns your agent from optimistic ("it probably worked") to confirmed ("I checked, and it worked"). In production systems handling real user data, that difference is the gap between a demo and a reliable product.
Key Takeaways
- Never trust "fire and forget." The Verification Loop (Check-Action-Verify) ensures every state-changing operation is confirmed with a separate read-back.
- Action and Verification tools come in pairs. Action tools write state, Verification tools read it back. Both are registered as tools the LLM can call.
- ReAct = Reason + Act. The LLM explicitly reasons about each step — what it needs to do, what the result means, and whether to verify or retry.
- The LLM drives the loop, not hardcoded logic. The system prompt teaches verification behavior. The `while True` loop lets the LLM decide when it has enough confidence to give a final answer.
- Be transparent about retries. If the agent needed multiple attempts, tell the user. Trust is built through honesty, not by hiding failures.
This is Part 2 of a series on AI agents. Part 1 covers function calling and multi-tool orchestration.