AI agent verification and security

Pre-execution verification for AI agents. Use this guide when you need zero-trust approval flows, tool call verification, or runtime policy enforcement before an agent touches external systems.

Overview

QWED Agent Verification provides:

Pre-execution checks before agents act
Budget enforcement to limit costs
Risk assessment for each action
Activity logging for audit trails

Registering an agent

from qwed_sdk import QWEDClient

client = QWEDClient(api_key="qwed_...")

agent = client.register_agent(
    name="DataAnalyst",
    type="supervised",  # supervised, autonomous, trusted
    principal_id="user_123",
    permissions={
        "allowed_engines": ["math", "logic", "sql"],
        "blocked_tools": ["execute_code"],
    },
    budget={
        "max_daily_cost_usd": 100,
        "max_requests_per_hour": 500,
    }
)

print(agent["agent_id"])     # agent_abc123
print(agent["agent_token"])  # qwed_agent_xyz...

Verifying actions

Before an agent executes an action, call verify_action with an ActionContext that includes a conversation_id and a monotonically increasing step_number. Both fields are required; requests without them are rejected.

decision = client.verify_action(
    agent_id="agent_abc123",
    action={
        "type": "execute_sql",
        "query": "SELECT * FROM users"
    },
    context={
        "conversation_id": "conv_xyz",
        "step_number": 1,
        "user_intent": "Get user list"
    }
)

if decision["decision"] == "APPROVED":
    execute_query("SELECT * FROM users")  # run the same query that was verified
elif decision["decision"] == "DENIED":
    print("Action blocked:", decision["error"])
elif decision["decision"] == "PENDING":
    request_human_approval()
elif decision["decision"] == "BUDGET_EXCEEDED":
    print("Budget limit reached:", decision["error"])
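
The branching above can be collected into one helper so that every decision value is handled in a single place. This is a minimal sketch, assuming the response is a dict with the decision key shown above; route_decision and the labels it returns are hypothetical names, not part of qwed_sdk. Any value the helper does not recognize is treated as blocked.

```python
def route_decision(decision: dict) -> str:
    """Map a verify_action response to the caller's next move.

    Hypothetical helper: returns "execute", "ask_human", or "stop".
    """
    status = decision.get("decision")
    if status == "APPROVED":
        return "execute"
    if status == "PENDING":
        return "ask_human"
    # DENIED, BUDGET_EXCEEDED, and anything unrecognized: do not run the action.
    return "stop"


print(route_decision({"decision": "APPROVED"}))
print(route_decision({"decision": "DENIED", "error": {"code": "QWED-AGENT-LOOP-003"}}))
```

Defaulting to "stop" keeps the caller fail-closed if a new decision value is introduced later.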

Conversation controls

QWED enforces runtime guardrails that prevent agents from replaying actions, running in infinite loops, or exceeding conversation length limits. These checks run automatically on every verify_action call.

How it works

Each call to verify_action must include a conversation_id (identifying the current session) and a step_number (a positive integer that increases with each action in that session). QWED uses these fields to enforce four controls:

Control	Limit	Error code
Conversation length	50 steps per conversation	`QWED-AGENT-LOOP-001`
Replay detection	Each step number can only be used once	`QWED-AGENT-LOOP-002`
Repetitive loop detection	Max 2 consecutive identical actions	`QWED-AGENT-LOOP-003`
No-progress doom loop	Same action on unchanged state ≥ 3 times	`QWED-AGENT-LOOP-004`

Incrementing steps correctly

The step_number must be strictly greater than any previously committed step within the same conversation. If an action is denied (for example, due to a loop), that step number is not consumed — you can retry the same step with a different action.

# Step 1: approved
client.verify_action(
    agent_id="agent_abc123",
    action={"type": "calculate", "query": "2+2"},
    context={"conversation_id": "conv_1", "step_number": 1}
)

# Step 2: same action, still approved (first repeat)
client.verify_action(
    agent_id="agent_abc123",
    action={"type": "calculate", "query": "2+2"},
    context={"conversation_id": "conv_1", "step_number": 2}
)

# Step 3: same action again — denied as repetitive loop
result = client.verify_action(
    agent_id="agent_abc123",
    action={"type": "calculate", "query": "2+2"},
    context={"conversation_id": "conv_1", "step_number": 3}
)
# result["decision"] == "DENIED"
# result["error"]["code"] == "QWED-AGENT-LOOP-003"

# Step 3 retry: a different action succeeds on the same step number
client.verify_action(
    agent_id="agent_abc123",
    action={"type": "verify_logic", "query": "x > 1"},
    context={"conversation_id": "conv_1", "step_number": 3}
)

If your agent framework retries failed actions automatically, make sure it never resubmits a step_number that has already been approved. Replayed step numbers are always rejected.
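
One way to keep step numbers consistent on the client side is to advance a local counter only when a step is approved. The sketch below is a hypothetical helper (StepCounter is not part of qwed_sdk). It assumes that only an APPROVED decision consumes a step; the text above confirms this for denials, and how PENDING decisions are treated is an assumption here.

```python
class StepCounter:
    """Client-side step tracking (hypothetical helper, not part of qwed_sdk)."""

    def __init__(self) -> None:
        self._committed = 0  # highest step number that was approved

    def next_step(self) -> int:
        # Always propose the step right after the last approved one.
        return self._committed + 1

    def record(self, step_number: int, decision: str) -> None:
        # Only an approval consumes the step; a denial leaves it free to retry.
        if decision == "APPROVED":
            self._committed = max(self._committed, step_number)


steps = StepCounter()
first = steps.next_step()
steps.record(first, "APPROVED")
second = steps.next_step()
steps.record(second, "DENIED")  # denied, so step 2 is still available
retry = steps.next_step()
print(first, second, retry)
```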

Progress-aware doom loop detection (LOOP-004)

New in v5.1.0

LOOP-003 catches agents that repeat the same action consecutively, but it cannot detect an agent that keeps retrying an action when the underlying system state has not changed. LOOP-004 addresses this by binding each action to the world state at the time it was proposed. To enable LOOP-004, include pre_action_state_hash and state_source in the action context:

import hashlib

# Compute a state hash from your environment
db_snapshot = get_database_checksum()
state_hash = hashlib.sha256(db_snapshot.encode()).hexdigest()

decision = client.verify_action(
    agent_id="agent_abc123",
    action={"type": "execute_sql", "query": "UPDATE orders SET status = 'shipped'"},
    context={
        "conversation_id": "conv_xyz",
        "step_number": 4,
        "pre_action_state_hash": state_hash,
        "state_source": "db_snapshot",
    }
)

The guard tracks a sliding window of the last 20 action+state fingerprints per conversation. If the same fingerprint appears 3 or more times, the action is halted with QWED-AGENT-LOOP-004. Accepted state_source values:

Value	Use case
`file_tree`	Git tree hash or directory listing hash
`db_snapshot`	Database state checksum
`conversation_digest`	Hash of the conversation history
`git_tree`	Git tree object hash
`custom`	Any caller-defined canonical hash

Validation rules:

pre_action_state_hash must be a 64-character lowercase hex SHA-256 digest
Both pre_action_state_hash and state_source must be provided together — supplying only one is rejected
During gradual rollout, both fields are optional. When the server enables DOOM_LOOP_GUARD_REQUIRED, they become mandatory
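
The validation rules above can be checked on the client before a request is sent. This is a minimal sketch, not QWED's server-side validator; check_state_fields and the message strings are hypothetical, and the sketch treats both fields as optional (the gradual-rollout behavior).

```python
import hashlib
import re

ALLOWED_STATE_SOURCES = {"file_tree", "db_snapshot", "conversation_digest", "git_tree", "custom"}
SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def check_state_fields(context: dict) -> list:
    """Return a list of problems with the LOOP-004 fields (empty means OK)."""
    state_hash = context.get("pre_action_state_hash")
    source = context.get("state_source")
    if state_hash is None and source is None:
        return []  # both omitted: accepted while the fields are optional
    if state_hash is None or source is None:
        return ["pre_action_state_hash and state_source must be provided together"]
    problems = []
    if not isinstance(state_hash, str) or not SHA256_HEX.fullmatch(state_hash):
        problems.append("pre_action_state_hash must be 64 lowercase hex characters")
    if source not in ALLOWED_STATE_SOURCES:
        problems.append(f"unsupported state_source: {source!r}")
    return problems


good = hashlib.sha256(b"example state").hexdigest()
print(check_state_fields({"pre_action_state_hash": good, "state_source": "db_snapshot"}))
print(check_state_fields({"pre_action_state_hash": good}))
```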

LOOP-004 fingerprints are only committed to the sliding window when the action decision is APPROVED. Denied and pending actions do not affect the history, preventing false positives from rejected retries.
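
The sliding-window behavior described in this section can be modeled locally to reason about when an agent would be halted. The sketch below is an illustration, not QWED's implementation. DoomLoopWindow, its fingerprint format, and the reading that the third occurrence (two approved entries plus the proposed one) triggers the halt are all assumptions.

```python
from collections import deque
import hashlib

WINDOW_SIZE = 20  # fingerprints remembered per conversation (from the text above)
MAX_REPEATS = 3   # assumed: the third occurrence of a fingerprint is halted


class DoomLoopWindow:
    """Local model of the LOOP-004 check; not QWED's implementation."""

    def __init__(self) -> None:
        self._window = deque(maxlen=WINDOW_SIZE)

    @staticmethod
    def fingerprint(action: dict, state_hash: str) -> str:
        # Sort the items so key order does not change the fingerprint.
        raw = repr(sorted(action.items())) + "|" + state_hash
        return hashlib.sha256(raw.encode()).hexdigest()

    def would_halt(self, action: dict, state_hash: str) -> bool:
        # Count earlier approved occurrences plus the one being proposed.
        fp = self.fingerprint(action, state_hash)
        return self._window.count(fp) + 1 >= MAX_REPEATS

    def commit(self, action: dict, state_hash: str, decision: str) -> None:
        # Only approved actions enter the window.
        if decision == "APPROVED":
            self._window.append(self.fingerprint(action, state_hash))


guard = DoomLoopWindow()
action = {"type": "execute_sql", "query": "UPDATE orders SET status = 'shipped'"}
state = "a" * 64  # placeholder digest for an unchanged world state
results = []
for _ in range(3):
    halted = guard.would_halt(action, state)
    results.append(halted)
    guard.commit(action, state, "DENIED" if halted else "APPROVED")
print(results)
```

Submitting the same action with a different state hash produces a new fingerprint, so an agent that is actually changing the system is not flagged.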

Trust levels

Level	Value	Description
UNTRUSTED	0	No autonomous actions
SUPERVISED	1	Low-risk autonomous
AUTONOMOUS	2	Most actions autonomous
TRUSTED	3	Full autonomy

Tool approval policy

Changed in v5.0.2

The ToolApprovalSystem classifies every tool call into one of three categories before execution:

Category	Behavior	Examples
Safe (allowlisted)	Auto-approved	`read_database`, `query_data`, `search_web`, `send_email`, `log_message`, `get_weather`
Dangerous (blocklisted)	Blocked — requires manual approval	`delete_database`, `drop_table`, `send_money`, `delete_files`, `shutdown_server`, `revoke_access`
Unknown	Blocked — requires explicit allowlisting	Any tool not in the safe or dangerous list

Unknown tools are denied by default, regardless of their heuristic risk score. Previously, unknown tools with a low risk score (below 0.3) were auto-approved. This fail-closed behavior ensures that new or unexpected tools cannot execute without being explicitly added to the allowlist. When an unknown tool is blocked, the response includes the tool name and the computed risk score for debugging:

{
  "approved": false,
  "blocked_reason": "Unknown tool 'my_custom_tool' requires explicit allowlisting (risk_score=0.2)"
}

To allow a custom tool, add it to the safe operations list in your ToolApprovalSystem configuration.
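
The three-way classification can be expressed as a small fail-closed lookup. This is a sketch of the policy described above, not the ToolApprovalSystem source. classify_tool is a hypothetical name, the tool sets copy the examples from the table, and the wording of the dangerous-tool message is invented for illustration.

```python
SAFE_TOOLS = {"read_database", "query_data", "search_web", "send_email", "log_message", "get_weather"}
DANGEROUS_TOOLS = {"delete_database", "drop_table", "send_money", "delete_files", "shutdown_server", "revoke_access"}


def classify_tool(tool_name: str, risk_score: float) -> dict:
    """Fail-closed classification: only allowlisted tools are auto-approved."""
    if tool_name in SAFE_TOOLS:
        return {"approved": True, "blocked_reason": None}
    if tool_name in DANGEROUS_TOOLS:
        # Message text is illustrative; the real wording is not documented here.
        return {"approved": False, "blocked_reason": f"Dangerous tool '{tool_name}' requires manual approval"}
    # The risk score is reported for debugging but never changes the outcome.
    return {
        "approved": False,
        "blocked_reason": f"Unknown tool '{tool_name}' requires explicit allowlisting (risk_score={risk_score})",
    }


print(classify_tool("search_web", 0.1))
print(classify_tool("my_custom_tool", 0.2))
```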

Risk assessment

Actions are assessed for risk:

Risk	Examples
LOW	read_file, database_read
MEDIUM	send_email, api_call
HIGH	file_write, database_write
CRITICAL	execute_code, file_delete, DROP

Decision matrix

Trust Level	LOW Risk	MEDIUM Risk	HIGH Risk	CRITICAL Risk
0 (Untrusted)	PENDING	DENIED	DENIED	DENIED
1 (Supervised)	APPROVED	PENDING	DENIED	DENIED
2 (Autonomous)	APPROVED	APPROVED	PENDING	DENIED
3 (Trusted)	APPROVED	APPROVED	APPROVED	APPROVED
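
For local testing or documentation checks, the matrix can be written as a lookup table. The sketch below only transcribes the table above; lookup_decision is a hypothetical helper, and the service may apply further checks (budget, loop detection, tool policy) before or after this step.

```python
RISK_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]

# Rows are trust levels 0-3; columns follow RISK_ORDER.
DECISION_MATRIX = {
    0: ["PENDING", "DENIED", "DENIED", "DENIED"],
    1: ["APPROVED", "PENDING", "DENIED", "DENIED"],
    2: ["APPROVED", "APPROVED", "PENDING", "DENIED"],
    3: ["APPROVED", "APPROVED", "APPROVED", "APPROVED"],
}


def lookup_decision(trust_level: int, risk: str) -> str:
    """Return the matrix cell for a trust level and a risk label."""
    return DECISION_MATRIX[trust_level][RISK_ORDER.index(risk)]


print(lookup_decision(1, "MEDIUM"))
print(lookup_decision(2, "CRITICAL"))
```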

Tool approval policy

Changed in v5.0.2

The tool approval system categorizes every tool call into one of three groups before execution:

Category	Behavior	Examples
Safe operations	Auto-approved	`read_database`, `query_data`, `search_web`
Dangerous operations	Blocked, requires manual approval	`delete_database`, `drop_table`, `send_money`
Unknown operations	Blocked (default-deny)	Any tool not in the safe or dangerous list

Unknown tools are always denied, regardless of their computed risk score. The blocked response includes the tool name and risk score for debugging:

Unknown tool 'my_custom_tool' requires explicit allowlisting (risk_score=0.2)

Before v5.0.2, unknown tools with a risk score below 0.3 were auto-approved. If your agents rely on custom tools that were previously approved through this heuristic, you must add them to the safe operations allowlist. See the changelog for migration details.

Adding tools to the allowlist

agent = client.register_agent(
    name="DataAnalyst",
    type="supervised",
    principal_id="user_123",
    permissions={
        "allowed_engines": ["math", "logic", "sql"],
        "allowed_tools": ["my_custom_tool", "fetch_report"],
        "blocked_tools": ["execute_code"],
    },
    budget={"max_daily_cost_usd": 100}
)

Budget enforcement

# Check remaining budget
budget = client.get_agent_budget("agent_abc123")
print(budget)
# {
#   "cost": {"max_daily_usd": 100, "current_daily_usd": 45.50},
#   "requests": {"max_per_hour": 500, "current_hour": 123}
# }
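
The remaining headroom can be derived from that response before an agent starts a batch of work. This is a minimal sketch, assuming the response has exactly the shape shown in the comment above; remaining_budget and its output keys are hypothetical.

```python
def remaining_budget(budget: dict) -> dict:
    """Compute what is left from a get_agent_budget-style response."""
    cost = budget["cost"]
    requests = budget["requests"]
    return {
        "usd_left_today": round(cost["max_daily_usd"] - cost["current_daily_usd"], 2),
        "requests_left_this_hour": requests["max_per_hour"] - requests["current_hour"],
    }


# Sample values copied from the response shown above.
sample = {
    "cost": {"max_daily_usd": 100, "current_daily_usd": 45.50},
    "requests": {"max_per_hour": 500, "current_hour": 123},
}
print(remaining_budget(sample))
```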

Activity logging

# Get agent activity
activity = client.get_agent_activity("agent_abc123", limit=10)
for entry in activity:
    print(f"{entry['timestamp']}: {entry['action_type']} -> {entry['decision']}")
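
Activity entries can also be aggregated, for example to see how often an agent is being denied. The sketch below assumes each entry is a dict with the decision key used above; summarize_decisions is a hypothetical helper and the sample entries are made-up illustration data.

```python
from collections import Counter


def summarize_decisions(activity: list) -> Counter:
    """Count how many log entries ended in each decision."""
    return Counter(entry["decision"] for entry in activity)


# Made-up entries in the shape printed by the loop above.
sample_activity = [
    {"timestamp": "t1", "action_type": "execute_sql", "decision": "APPROVED"},
    {"timestamp": "t2", "action_type": "execute_sql", "decision": "DENIED"},
    {"timestamp": "t3", "action_type": "calculate", "decision": "APPROVED"},
]
summary = summarize_decisions(sample_activity)
print(summary["APPROVED"], summary["DENIED"])
```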

Runtime hardening

New in v5.0.0

QWED enforces several runtime controls to prevent agent misuse, infinite loops, and replay attacks. These protections operate at the verification kernel level and cannot be bypassed by agents.

Action context enforcement

Every verify_action call requires an ActionContext with:

Field	Type	Required	Description
`conversation_id`	string	Yes	Unique identifier for the conversation/session
`step_number`	integer	Yes	Monotonically increasing step counter (must be >= 1)
`user_intent`	string	No	Human-readable description of the user’s goal
`pre_action_state_hash`	string	Conditional	SHA-256 hex digest of the world state before the action. Required when `state_source` is provided
`state_source`	string	Conditional	How the hash was derived: `file_tree`, `db_snapshot`, `conversation_digest`, `git_tree`, or `custom`. Required when `pre_action_state_hash` is provided

The step number must increase with each action in a conversation. Attempts to reuse or decrement step numbers are rejected.

Registered action enforcement

New in v5.1.1

Every verify_action call must reference an action_type that QWED has registered semantics for. Action types must be either bound to a verification engine or registered as a governed tool with a known risk level. Engine examples: verify_math → math, execute_sql → sql. Tool examples: read_database, send_email. Action types outside those two registries are denied with QWED-AGENT-ACTION-001 before any risk assessment runs.

result = client.verify_action(
    agent_id="agent_abc123",
    action={"type": "do_arbitrary_thing", "query": "..."},
    context={"conversation_id": "conv_1", "step_number": 1},
)
# {
#   "decision": "DENIED",
#   "error": {
#     "code": "QWED-AGENT-ACTION-001",
#     "message": "Unknown action_type 'do_arbitrary_thing' cannot be verified without explicit registered semantics"
#   }
# }

The denial fires before risk assessment, so unknown actions can never be auto-approved through a permissive default. The reserved conversation step is released, allowing the agent to retry the same step with a registered action type. To allow a custom action, either bind it to an engine in ACTION_ENGINES or add it to your tool registry with an explicit risk level. Tools registered this way are reported with engine: "tool_control" in the verification response.

Before v5.1.1, unknown action types defaulted to engine: "security" and proceeded through risk assessment as MEDIUM risk, which could result in APPROVED or PENDING decisions depending on the agent’s trust level. Audit any production agents that emit non-standard action_type values and register them explicitly before upgrading.

Replay and loop detection

The agent service detects and blocks four types of problematic patterns:

Pattern	Error code	Description
Step replay	`QWED-AGENT-LOOP-002`	Submitting an action with a `step_number` that was already used in the conversation
Repetitive loop	`QWED-AGENT-LOOP-003`	Submitting the same action (identical fingerprint) more than 2 consecutive times
No-progress doom loop	`QWED-AGENT-LOOP-004`	Repeating the same action on an unchanged world state 3 or more times (requires `pre_action_state_hash`)
Step limit exceeded	`QWED-AGENT-LOOP-001`	Exceeding the maximum of 50 steps per conversation

Actions are fingerprinted deterministically using action_type, query, code, target, and parameters. When pre_action_state_hash is provided, the fingerprint also incorporates the world state hash. If a loop is detected, the conversation state is not advanced — the agent can recover by submitting a different action at the same step number.

# Step 1: approved
client.verify_action(agent_id="agent_abc123", action={"type": "calculate", "query": "2+2"},
    context={"conversation_id": "conv_1", "step_number": 1})

# Step 1 again: DENIED (replay)
client.verify_action(agent_id="agent_abc123", action={"type": "calculate", "query": "2+2"},
    context={"conversation_id": "conv_1", "step_number": 1})
# -> {"decision": "DENIED", "error": {"code": "QWED-AGENT-LOOP-002"}}
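
Deterministic fingerprinting of this kind is commonly done by hashing a canonical serialization of the relevant fields. The sketch below illustrates that technique and is not QWED's actual fingerprint format. action_fingerprint is a hypothetical name, and using the action dict's type key for action_type is an assumption based on the request examples in this guide.

```python
import hashlib
import json
from typing import Optional

# Fields named in the text above; "type" stands in for action_type (assumption).
FINGERPRINT_FIELDS = ("type", "query", "code", "target", "parameters")


def action_fingerprint(action: dict, state_hash: Optional[str] = None) -> str:
    """Hash a canonical JSON form so equal actions always hash equally."""
    payload = {field: action.get(field) for field in FINGERPRINT_FIELDS}
    if state_hash is not None:
        payload["state_hash"] = state_hash
    # sort_keys and fixed separators make the serialization order-independent.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = action_fingerprint({"type": "calculate", "query": "2+2"})
b = action_fingerprint({"query": "2+2", "type": "calculate"})
c = action_fingerprint({"type": "calculate", "query": "2+2"}, state_hash="a" * 64)
print(a == b, a == c)
```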

Fail-closed for unknown action types

QWED denies any verify_action call whose action_type has no registered engine binding or tool risk level. Action verification is deterministic — actions without explicit semantics cannot be risk-assessed or routed to a verification engine, so the kernel returns a denial instead of falling back to a permissive default.

result = client.verify_action(
    agent_id="agent_abc123",
    action={"type": "transfer_funds_internal_v2", "query": "Move funds between ledgers"},
    context={"conversation_id": "conv_42", "step_number": 1},
)
# -> {"decision": "DENIED", "error": {"code": "QWED-AGENT-ACTION-001", "message": "..."}}

A denial under QWED-AGENT-ACTION-001 releases the in-flight step reservation, so the agent may retry the same step number with a registered action type. Registered action types include the entries in ACTION_ENGINES (such as calculate, verify, prove) and tools listed in TOOL_RISK_LEVELS.

In-flight reservation system

While QWED processes a verify_action call, it reserves the step number so concurrent requests cannot claim the same step. QWED releases the reservation if the action is denied, allowing the agent to retry with a different action at the same step.
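
The reserve-then-commit-or-release pattern can be sketched with a lock and two sets. This is a local, single-process model for illustration, not QWED's implementation; StepReservations and its method names are hypothetical.

```python
import threading


class StepReservations:
    """In-process model of step reservation (illustrative only)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._in_flight = set()  # (conversation_id, step_number) being processed
        self._committed = set()  # steps consumed by an approved action

    def reserve(self, conversation_id: str, step_number: int) -> bool:
        key = (conversation_id, step_number)
        with self._lock:
            if key in self._in_flight or key in self._committed:
                return False  # another request holds or already used this step
            self._in_flight.add(key)
            return True

    def finish(self, conversation_id: str, step_number: int, decision: str) -> None:
        key = (conversation_id, step_number)
        with self._lock:
            self._in_flight.discard(key)  # always release the reservation
            if decision == "APPROVED":
                self._committed.add(key)  # only approvals consume the step


reservations = StepReservations()
first_claim = reservations.reserve("conv_1", 1)
concurrent_claim = reservations.reserve("conv_1", 1)
reservations.finish("conv_1", 1, "DENIED")
retry_claim = reservations.reserve("conv_1", 1)
print(first_claim, concurrent_claim, retry_claim)
```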

Budget denial behavior

When a budget check fails, the conversation step is not consumed. This means the agent can retry the same step number after the budget resets without triggering a replay detection error.

Fail-closed rate limiting

The Redis-backed sliding window rate limiter fails closed when Redis is unavailable. If the Redis backend encounters an error, all requests are denied rather than allowed, preventing uncontrolled access during infrastructure failures. When Redis is entirely absent at startup, a local in-memory fallback limiter is used instead.
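
The fail-closed pattern amounts to treating any backend error as a denial. The sketch below shows the idea with an in-memory sliding window and a stand-in for a failing Redis connection. All names (InMemoryWindow, BrokenBackend, allow_request) are hypothetical and the code does not use the real Redis client.

```python
import time
from collections import defaultdict, deque


class InMemoryWindow:
    """Sliding-window counter kept in process memory."""

    def __init__(self) -> None:
        self._hits = defaultdict(deque)

    def hit(self, key: str, now: float, window_seconds: float) -> int:
        hits = self._hits[key]
        while hits and hits[0] <= now - window_seconds:
            hits.popleft()  # drop timestamps that have left the window
        hits.append(now)
        return len(hits)


class BrokenBackend:
    """Stands in for a Redis connection that raises on every call."""

    def hit(self, key: str, now: float, window_seconds: float) -> int:
        raise ConnectionError("backend unavailable")


def allow_request(backend, key: str, limit: int, window_seconds: float = 3600.0, now=None) -> bool:
    now = time.monotonic() if now is None else now
    try:
        return backend.hit(key, now, window_seconds) <= limit
    except Exception:
        # Fail closed: a backend error denies the request.
        return False


window = InMemoryWindow()
outcomes = [allow_request(window, "agent_abc123", limit=2, now=float(i)) for i in range(3)]
print(outcomes, allow_request(BrokenBackend(), "agent_abc123", limit=2))
```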

Environment integrity verification

On API server startup, QWED runs an environment integrity check (via StartupHookGuard) before initializing the database. If the environment is compromised, the server refuses to start. This prevents operation in tampered runtime environments.

Timing-safe token verification

Agent token verification uses hmac.compare_digest for constant-time comparison, preventing timing side-channel attacks against agent authentication.
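
A constant-time comparison with the standard library looks like the following. The tokens_match wrapper and the sample token strings are hypothetical; how QWED stores or derives the expected token is not described here.

```python
import hmac


def tokens_match(presented: str, expected: str) -> bool:
    """Compare two tokens without leaking where they first differ."""
    # Encoding to bytes lets compare_digest accept non-ASCII input as well.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


print(tokens_match("qwed_agent_example", "qwed_agent_example"))
print(tokens_match("qwed_agent_example", "qwed_agent_other"))
```

A plain == comparison can return as soon as the first differing character is found, which is the timing signal that compare_digest is designed to avoid.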

Fail-closed on unknown actions

QWED only verifies actions whose action_type has explicit, registered semantics. If you submit an action_type that is not bound to a verification engine or a known tool, the request is denied with QWED-AGENT-ACTION-001 before risk assessment runs. Registered actions fall into two categories:

Category	Examples	Engine
Verification engines	`execute_sql`, `execute_code`, `calculate`, `verify_logic`, `verify_fact`	`sql`, `code`, `math`, `logic`, `fact`
Tool calls	`database_read`, `database_write`, `send_email`, `file_read`, `file_write`, `file_delete`, `api_call`	`tool_control`

Anything else — including custom action names, typos, and forward-compatible names that the runtime does not yet recognize — is denied:

result = client.verify_action(
    agent_id="agent_abc123",
    action={"type": "transfer_funds_internal_v2", "query": "Move funds between ledgers"},
    context={"conversation_id": "conv_1", "step_number": 1},
)

# {
#   "decision": "DENIED",
#   "error": {
#     "code": "QWED-AGENT-ACTION-001",
#     "message": "Unknown action_type 'transfer_funds_internal_v2' cannot be verified without explicit registered semantics"
#   }
# }

Key behaviors to plan for:

No verification block is returned. Unknown actions never receive an engine label or a VERIFIED status — there is no generic "security" fallback.
The step reservation is released. Because the action was denied, the same step_number can be retried with a registered action (no QWED-AGENT-LOOP-002 replay error).
Map custom intents to registered actions. If your agent needs to perform a domain-specific operation, route it through one of the registered verification engines or tool calls rather than inventing a new action_type string.
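
The last point can be handled with a small translation layer in the agent. The sketch below is an assumption-heavy illustration. The registered names are copied from the examples in the table above (the real registries may differ), and INTENT_TO_ACTION, its entries, and to_registered_action are hypothetical.

```python
# Copied from the table above; the real registries may contain other entries.
REGISTERED_ACTIONS = {
    "execute_sql", "execute_code", "calculate", "verify_logic", "verify_fact",
    "database_read", "database_write", "send_email",
    "file_read", "file_write", "file_delete", "api_call",
}

# Hypothetical mapping from domain-specific intents to registered action types.
INTENT_TO_ACTION = {
    "transfer_funds_internal_v2": "database_write",
    "monthly_report": "database_read",
}


def to_registered_action(intent: str, query: str) -> dict:
    """Build an action dict whose type is a registered action."""
    action_type = intent if intent in REGISTERED_ACTIONS else INTENT_TO_ACTION.get(intent)
    if action_type is None:
        # Refuse locally instead of sending a request that will be denied.
        raise ValueError(f"no registered action for intent {intent!r}")
    return {"type": action_type, "query": query}


print(to_registered_action("transfer_funds_internal_v2", "Move funds between ledgers"))
```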

Framework integration

LangChain

from langchain.agents import initialize_agent
from qwed_sdk.langchain import QWEDVerificationCallback

agent = initialize_agent(
    tools=[...],
    callbacks=[QWEDVerificationCallback(agent_id="agent_abc123")]
)

CrewAI

from qwed_sdk.crewai import QWEDVerifiedAgent

analyst = QWEDVerifiedAgent(
    role="Analyst",
    goal="Analyze data",
    agent_id="agent_abc123"
)
