execute_python_code
Execute Python code in a sandboxed subprocess with access to all QWED SDK libraries.
As of v0.2.0, execute_python_code is the only verification tool exposed by QWED-MCP (its companion, verification_status, only polls background jobs). It replaces all previous verify_* tools to solve context bloat (RFC-9728 compatibility). See migration from v0.1.x below.
Description
The execute_python_code tool runs arbitrary Python code in a subprocess with restricted environment variables. The subprocess has access to all installed QWED SDK packages (qwed_new, qwed_legal, qwed_finance, qwed_ucp, etc.), so LLMs can write verification scripts that import and call any QWED engine directly.
The tool captures stdout and stderr from the subprocess and returns them as text.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| code | string | Yes | - | Python code to execute in a subprocess |
| background | boolean | No | false | When true, the job runs asynchronously in the background and returns a job_id immediately. Use the verification_status tool to poll for results. Recommended for heavy or long-running verification scripts. |
Risk gateway pre-validation
All tool calls pass through the RiskBasedExecutionGateway before dispatch. The gateway normalizes arguments, verifies code safety, and enforces server policy. If the gateway blocks a request, the tool returns a structured BLOCKED response with a verification_id and error code instead of executing the code. See governance error codes for the full list.
Environment
The subprocess runs with a restricted environment. Only PATH, PYTHONPATH, and SYSTEMROOT (Windows) are forwarded. Secrets, API keys, and other environment variables are stripped.
The server admin must set QWED_MCP_TRUSTED_CODE_EXECUTION=true to enable this tool. When disabled, the tool returns a BLOCKED_ADMIN_POLICY response even if the code passes safety verification.
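The environment restriction described above can be sketched with the standard library. This is a minimal illustration that assumes a simple allowlist filter; build_restricted_env and run_restricted are hypothetical names, not QWED-MCP functions, and the real server may differ in detail.

```python
import os
import subprocess
import sys

# Variables this page says are forwarded; everything else is dropped.
FORWARDED_VARS = ("PATH", "PYTHONPATH", "SYSTEMROOT")


def build_restricted_env(parent_env):
    """Return a copy of parent_env containing only the forwarded variables."""
    return {k: v for k, v in parent_env.items() if k in FORWARDED_VARS}


def run_restricted(code, timeout=30.0):
    """Run code in a fresh interpreter with the restricted environment.

    Hypothetical helper: raises subprocess.TimeoutExpired after `timeout` seconds.
    """
    return subprocess.run(
        [sys.executable, "-c", code],
        env=build_restricted_env(os.environ),
        capture_output=True,
        text=True,
        timeout=timeout,
    )


if __name__ == "__main__":
    os.environ["DEMO_API_KEY"] = "secret"  # made-up secret for the demo
    result = run_restricted(
        "import os; print(os.environ.get('DEMO_API_KEY', 'not visible'))"
    )
    print(result.stdout.strip())
```

Because the child process receives an explicit env mapping instead of inheriting the parent's, a script cannot read secrets such as API keys even if it inspects os.environ.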
Execution limits
| Limit | Synchronous (background=false) | Background (background=true) |
|---|---|---|
| Timeout | 30 seconds | No timeout (runs until completion) |
| Output cap | 1 MB (stdout and stderr each) | 1 MB (stdout and stderr each) |
| Process isolation | New subprocess per invocation | New subprocess per invocation |
| Concurrency | Sequential | Up to 5 concurrent background jobs |
When the 1 MB output cap is reached, the subprocess is terminated and the output is truncated with a warning message.
Examples
Verify a math calculation
{
"code": "from sympy import symbols, diff\nx = symbols('x')\nresult = diff(x**3, x)\nprint(f'derivative of x^3 = {result}')\nassert str(result) == '3*x**2', 'Mismatch!'\nprint('VERIFIED')"
}
Response:
STDOUT:
derivative of x^3 = 3*x**2
VERIFIED
Execution completed successfully.
Verify a financial calculation
{
"code": "from decimal import Decimal, ROUND_HALF_UP\n\nP = Decimal('10000')\nr = Decimal('0.075')\nn = Decimal('4')\nt = Decimal('5')\n\nA = P * (1 + r/n) ** (n*t)\nA = A.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)\nprint(f'Future value: ${A}')\nassert A == Decimal('14499.48'), f'Expected 14499.48, got {A}'"
}
Response:
STDOUT:
Future value: $14499.48
Execution completed successfully.
Check code for security vulnerabilities
{
"code": "from qwed_new.guards.code_guard import CodeGuard\n\nguard = CodeGuard()\nresult = guard.verify_safety(\"import os; os.system('rm -rf /')\")\nprint(f'Verified: {result[\"verified\"]}')\nif result.get('violations'):\n    for v in result['violations']:\n        print(f' - {v}')"
}
Response:
STDOUT:
Verified: False
- Dangerous pattern: os.system
Execution completed successfully.
Verify SQL safety
{
"code": "from qwed_new.guards.sql_guard import SQLGuard\n\nguard = SQLGuard()\nresult = guard.verify_query(\"SELECT * FROM users WHERE id = '1' OR '1'='1'\")\nprint(f'Verified: {result[\"verified\"]}')\nprint(f'Message: {result.get(\"message\", \"\")}')"
}
Verify a legal deadline
{
"code": "from qwed_legal import DeadlineGuard\n\nguard = DeadlineGuard(country='US')\nresult = guard.verify('2026-01-15', '30 business days', '2026-02-14')\nprint(f'Verified: {result.verified}')\nprint(f'Computed deadline: {result.computed_deadline}')"
}
Verify AI content provenance
{
"code": "import hashlib\nfrom qwed_legal import ProvenanceGuard\n\ncontent = 'This AI-generated memo reviews the indemnification terms.'\ncontent_hash = hashlib.sha256(content.encode()).hexdigest()\n\nguard = ProvenanceGuard(require_disclosure=True)\nresult = guard.verify_provenance(content, {\n 'content_hash': content_hash,\n 'model_id': 'claude-4.5-sonnet',\n 'generation_timestamp': '2026-03-24T12:00:00+00:00',\n})\nprint(f'Verified: {result[\"verified\"]}')\nprint(f'Checks passed: {result[\"checks_passed\"]}')"
}
Run a heavy verification in the background
{
"code": "from qwed_legal import LegalGuard\n\nguard = LegalGuard()\n# ... long-running multi-guard verification\nprint('All checks passed')",
"background": true
}
Response:
Verification order is being placed for the request 3f8a1b2c-... Check back using the 'verification_status' tool.
Then poll for results:
{
"name": "verification_status",
"arguments": {
"job_id": "3f8a1b2c-..."
}
}
Error responses
| Scenario | Response | Error code |
|---|---|---|
| Missing or empty code | BLOCKED: Missing required non-empty 'code' argument. (verification_id=...) | QWED-MCP-RISK-003 |
| Invalid background type | BLOCKED: 'background' must be a boolean when provided. (verification_id=...) | QWED-MCP-RISK-004 |
| Code fails safety check | BLOCKED: QWED blocked python execution: <issues> (verification_id=...) | QWED-MCP-RISK-005 |
| Tool disabled (admin policy) | BLOCKED_ADMIN_POLICY: Python execution was verified, but server policy keeps code execution disabled until QWED_MCP_TRUSTED_CODE_EXECUTION=true. (verification_id=...) | QWED-MCP-RISK-006 |
| Script raises exception | STDERR contains the traceback; return code is non-zero | - |
| Timeout exceeded | Execution timed out after 30.0 seconds. | - |
| Output cap exceeded | [WARNING: OUTPUT TRUNCATED DUE TO 1MB SIZE CAP. PROCESS TERMINATED.] is appended to the truncated output | - |
verification_status
Check the execution status and output of a background verification task dispatched via execute_python_code with background=true.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| job_id | string | Yes | The UUID returned by execute_python_code when background=true. Must be a valid canonical UUID format. |
The risk gateway validates job_id as a canonical UUID before dispatch. Non-UUID values are rejected with error code QWED-MCP-RISK-008.
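To avoid a QWED-MCP-RISK-008 rejection, a client can check a job_id locally before sending it. The sketch below uses Python's uuid module and assumes that "canonical" means the dashed 8-4-4-4-12 hex form; whether the gateway also accepts uppercase hex is not stated here, so the case-insensitive comparison is an assumption, and is_canonical_uuid is a hypothetical helper name.

```python
import uuid


def is_canonical_uuid(value):
    """Return True only for the dashed 8-4-4-4-12 hex form of a UUID."""
    if not isinstance(value, str):
        return False
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        return False
    # uuid.UUID also accepts braces, "urn:uuid:" prefixes and undashed hex,
    # so compare against the canonical rendering to reject those forms.
    return str(parsed) == value.lower()


print(is_canonical_uuid("3f8a1b2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c"))
print(is_canonical_uuid("3f8a1b2c4d5e6f7a8b9c0d1e2f3a4b5c"))
```

The first call prints True and the second prints False, because the undashed form parses as a UUID but is not its canonical rendering.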
The response text depends on the job state:
| Job state | Response |
|---|---|
| queued | Status: queued... |
| running | Status: running... |
| success | Status: success\n\nResult:\n<stdout/stderr output> |
| failed | Status: failed\n\nResult:\n<error details> |
| cancelled | Status: cancelled\n\nResult:\nJob was cancelled. |
| Not found / expired | Error: Job ID '<id>' not found or expired. |
Example
{
"job_id": "3f8a1b2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c"
}
Response (while running):
Status: running...
Response (completed):
Status: success
Result:
STDOUT:
All checks passed
Execution completed successfully.
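A client usually wraps this request in a polling loop. The helper below is a sketch, not part of QWED-MCP: check_status stands for whatever function your MCP client uses to call verification_status and return the response text, and the interval and max_wait defaults are arbitrary choices.

```python
import time


def poll_until_done(check_status, job_id, interval=2.0, max_wait=600.0):
    """Poll until the job leaves the queued/running states or max_wait elapses.

    check_status is a caller-supplied (hypothetical) function that sends
    job_id to the verification_status tool and returns the response text.
    """
    deadline = time.monotonic() + max_wait
    while True:
        text = check_status(job_id)
        first_line = text.splitlines()[0] if text else ""
        pending = first_line.startswith(("Status: queued", "Status: running"))
        if not pending:
            # success, failed, cancelled, or a "not found or expired" error
            return text
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still pending after {max_wait}s")
        time.sleep(interval)
```

The loop treats every response other than queued or running as final, so a "not found or expired" error is returned to the caller immediately instead of being retried.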
Job lifecycle
- Background jobs expire after 1 hour (3600 seconds). Expired jobs are pruned automatically.
- A maximum of 5 jobs can run concurrently. Additional jobs are queued until a slot opens.
- Once a job completes (success, failed, or cancelled), its result is available until the TTL expires.
Background job state is held in memory on the MCP server. If the server restarts, all pending and completed jobs are lost.
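The lifecycle rules above can be pictured as a small in-memory store with time-based pruning. This is an illustrative sketch only: InMemoryJobStore is a hypothetical class, and it assumes the one-hour TTL is counted from job creation, which this page does not specify.

```python
import time

JOB_TTL_SECONDS = 3600  # expiry stated above


class InMemoryJobStore:
    """Hypothetical store mapping job_id to (created_at, state, result)."""

    def __init__(self, ttl=JOB_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._jobs = {}  # lives in process memory, so a restart loses it

    def put(self, job_id, state, result=None):
        # Keep the original creation time when a job changes state.
        created_at = self._jobs[job_id][0] if job_id in self._jobs else self._clock()
        self._jobs[job_id] = (created_at, state, result)

    def prune(self):
        """Drop every job whose age has reached the TTL."""
        now = self._clock()
        expired = [j for j, (created, _, _) in self._jobs.items()
                   if now - created >= self._ttl]
        for job_id in expired:
            del self._jobs[job_id]

    def get(self, job_id):
        """Return (state, result), or None if the job is unknown or expired."""
        self.prune()
        entry = self._jobs.get(job_id)
        return None if entry is None else entry[1:]
```

Because the dictionary exists only inside the server process, this design also explains why pending and completed jobs disappear on restart.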
Migrating from v0.1.x
In v0.1.x, QWED-MCP exposed individual tools (verify_math, verify_logic, verify_code, verify_sql, and others). In v0.2.0, all of these were consolidated into execute_python_code to reduce context bloat — instead of loading 14 tool schemas, the LLM loads one.
Before (v0.1.x)
User: "Verify the derivative of x³ equals 3x² using verify_math"
Claude: Calls verify_math tool with expression="x^3", claimed_result="3*x^2", operation="derivative"
After (v0.2.0)
User: "Write a script to verify the derivative of x³ using execute_python_code"
Claude: Calls execute_python_code with a Python script that imports sympy and checks the result
Use these QWED SDK imports in your execute_python_code scripts to replicate the previous tool behavior:
| Deprecated tool | Replacement SDK import |
|---|---|
| verify_math | from sympy import ... or from qwed_new.engines.math_engine import verify_math_expression |
| verify_logic | from qwed_new.engines.logic_engine import verify_logic_statement |
| verify_code | from qwed_new.guards.code_guard import CodeGuard |
| verify_sql | from qwed_new.guards.sql_guard import SQLGuard |
| verify_banking_compliance | from qwed_finance import FinanceVerifier |
| verify_iso_20022 | from qwed_finance import ISOGuard |
| verify_commerce_transaction | from qwed_ucp import UCPVerifier |
| verify_legal_deadline | from qwed_legal import DeadlineGuard |
| verify_legal_citation | from qwed_legal import CitationGuard |
| verify_legal_liability | from qwed_legal import LiabilityGuard |
| verify_legal_jurisdiction | from qwed_legal import JurisdictionGuard |
| verify_legal_statute | from qwed_legal import StatuteOfLimitationsGuard |
| verify_system_command | from qwed_sdk.guards.system_guard import SystemGuard |
| verify_file_path | from qwed_sdk.guards.system_guard import SystemGuard |
| verify_config_secrets | from qwed_sdk.guards.config_guard import ConfigGuard |
If you see a BLOCKED response with error code QWED-MCP-RISK-001 in Claude Desktop, it means the LLM is trying to call a tool not in the governance policy table (likely a removed v0.1.x tool). Tell Claude: “The verify_* tools have been removed. Use execute_python_code to write and run a Python verification script.”
The following tools were available in v0.1.x and have been removed in v0.2.0. They are listed here for reference. Use execute_python_code with the corresponding SDK imports instead.
verify_math (deprecated)
Verified mathematical calculations using the SymPy symbolic mathematics engine.
| Parameter | Type | Required | Description |
|---|---|---|---|
| expression | string | Yes | Mathematical expression (e.g., x^2, sin(x)) |
| claimed_result | string | Yes | The result to verify |
| operation | enum | No | One of: derivative, integral, simplify, solve, evaluate |
verify_logic (deprecated)
Verified logical arguments using the Z3 SMT solver.
| Parameter | Type | Required | Description |
|---|---|---|---|
| premises | array[string] | Yes | List of premise statements |
| conclusion | string | Yes | The conclusion to verify |
verify_code (deprecated)
Checked code for security vulnerabilities using AST analysis.
| Parameter | Type | Required | Description |
|---|---|---|---|
| code | string | Yes | Code to analyze |
| language | enum | Yes | One of: python, javascript, sql |
verify_sql (deprecated)
Detected SQL injection vulnerabilities and validated queries.
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | SQL query to verify |
| allowed_tables | array[string] | No | Whitelist of allowed table names |
verify_banking_compliance (deprecated)
Verified banking logic using QWED Finance Guard.
| Parameter | Type | Required | Description |
|---|---|---|---|
| scenario | string | Yes | Banking scenario description |
| llm_output | string | Yes | The LLM's reasoning to verify |
verify_commerce_transaction (deprecated)
Verified e-commerce transactions using QWED UCP.
| Parameter | Type | Required | Description |
|---|---|---|---|
| cart_json | string | Yes | Cart/checkout state as JSON string |
verify_legal_deadline (deprecated)
Verified contract deadlines using LegalGuard.
| Parameter | Type | Required | Description |
|---|---|---|---|
| signing_date | string | Yes | Date of signing (YYYY-MM-DD) |
| term | string | Yes | Duration string |
| claimed_deadline | string | Yes | Deadline date to verify |
verify_legal_citation (deprecated)
Verified legal citation format and validity.
| Parameter | Type | Required | Description |
|---|---|---|---|
| citation | string | Yes | Legal citation string |
verify_legal_liability (deprecated)
Verified liability cap calculations.
| Parameter | Type | Required | Description |
|---|---|---|---|
| contract_value | number | Yes | Total contract value |
| cap_percentage | number | Yes | Cap percentage |
| claimed_cap | number | Yes | Calculated cap amount |
verify_system_command (deprecated)
Verified shell commands for security risks.
| Parameter | Type | Required | Description |
|---|---|---|---|
| command | string | Yes | Shell command to check |
verify_file_path (deprecated)
Verified file paths are within allowed sandbox directories.
| Parameter | Type | Required | Description |
|---|---|---|---|
| filepath | string | Yes | Path to verify |
| allowed_paths | array[string] | No | Whitelist of allowed directories |
verify_config_secrets (deprecated)
Scanned configuration JSON for exposed secrets.
| Parameter | Type | Required | Description |
|---|---|---|---|
| config_json | string | Yes | Configuration data as JSON string |
AIBOMGenerator (observability)
Generate an AI Bill of Materials (AI-BOM) manifest for visibility into your agent supply chain. This is useful for AI-SPM compliance auditing — tracking which models, verification engines, and MCP tools were used in a given session.
Description
The AIBOMGenerator produces a JSON manifest listing all components involved in an AI pipeline run. Each manifest includes a deterministic manifest_hash (SHA-256) so you can verify that two runs used the same component stack.
Usage
from qwed_mcp.observability.aibom import AIBOMGenerator
generator = AIBOMGenerator()
bom = generator.generate_manifest(
    llm_model="gpt-4o",
    qwed_engines_used=["qwed_tax.TaxVerifier", "qwed_legal.FairnessGuard"],
    mcp_tools_used=["execute_python_code"],
)
print(bom["compliance"]) # "QWED_AI_SPM_v1"
print(bom["manifest_hash"]) # deterministic SHA-256 hash
print(bom["components"])
# {
# "models": [{"name": "gpt-4o", "type": "generator"}],
# "verification_engines": [
# {"name": "qwed_tax.TaxVerifier", "type": "qwed_deterministic"},
# {"name": "qwed_legal.FairnessGuard", "type": "qwed_deterministic"}
# ],
# "mcp_tools": [
# {"name": "execute_python_code", "type": "action_execution"}
# ]
# }
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| llm_model | str | Yes | - | Name of the LLM model used (e.g., "gpt-4o", "claude-3-opus") |
| qwed_engines_used | list[str] | No | [] | QWED verification engines used in this pipeline |
| mcp_tools_used | list[str] | No | [] | MCP tools invoked during this session |
Manifest fields
| Field | Type | Description |
|---|---|---|
| timestamp | string | ISO 8601 UTC timestamp of generation |
| components.models | array | LLM models used (type: "generator") |
| components.verification_engines | array | QWED engines used (type: "qwed_deterministic") |
| components.mcp_tools | array | MCP tools used (type: "action_execution") |
| compliance | string | Always "QWED_AI_SPM_v1" |
| manifest_hash | string | SHA-256 hash of the manifest (excluding timestamp) |
The manifest_hash is deterministic for identical inputs — the timestamp is excluded from the hash computation so that two manifests with the same components always produce the same hash.
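One way to obtain a timestamp-independent hash is to serialize only the stable fields in a canonical form before hashing. The sketch below illustrates that idea; build_manifest is a hypothetical function, and the canonicalization (JSON with sorted keys and compact separators) is an assumption, so its hashes will not necessarily match those produced by AIBOMGenerator.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_manifest(llm_model, engines=(), tools=()):
    """Build a manifest whose hash ignores the generation timestamp."""
    components = {
        "models": [{"name": llm_model, "type": "generator"}],
        "verification_engines": [
            {"name": e, "type": "qwed_deterministic"} for e in engines
        ],
        "mcp_tools": [{"name": t, "type": "action_execution"} for t in tools],
    }
    # Only these fields are hashed; the timestamp is added afterwards.
    hashed_part = {"components": components, "compliance": "QWED_AI_SPM_v1"}
    canonical = json.dumps(hashed_part, sort_keys=True, separators=(",", ":"))
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **hashed_part,
        "manifest_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }


first = build_manifest("gpt-4o", ["qwed_tax.TaxVerifier"], ["execute_python_code"])
second = build_manifest("gpt-4o", ["qwed_tax.TaxVerifier"], ["execute_python_code"])
print(first["manifest_hash"] == second["manifest_hash"])
```

The comparison prints True even though the two manifests carry different timestamps, which is the property that makes the hash usable for comparing component stacks across runs.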
SkillProvenanceGuard (security)
Verify MCP skill manifests before allowing dynamic tool loading. Protects against skill marketplace poisoning attacks where malicious agents upload trojanized skills to registries and inflate download counts.
Description
SkillProvenanceGuard performs deterministic provenance verification on skill manifests. When the QWED_SKILL_MANIFEST environment variable points to a JSON manifest file, the MCP server validates it at startup and refuses to start if verification fails.
You can also use SkillProvenanceGuard directly in your own code to vet skills before loading them.
Usage
from qwed_mcp.security import SkillProvenanceGuard
guard = SkillProvenanceGuard()
result = guard.verify_skill(manifest={
    "name": "my-skill",
    "version": "1.0.0",
    "source_url": "https://github.com/org/skill",
    "registry": "github.com",
    "digest": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "download_count": 150,
})
print(result["verified"]) # True
print(result["status"]) # "TRUSTED"
print(result["risk_level"]) # "none"
Constructor parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| trusted_registries | set[str] or None | None | Strict allowlist of registry domains. When set, only skills from these registries are accepted. When None, the default blocklist is used instead. |
| trusted_domains | set[str] or None | None | Additional source URL domains to trust, merged with the built-in trusted list (github.com, gitlab.com, bitbucket.org, pypi.org, npmjs.com, qwedai.com). |

A further constructor option controls whether to enforce cryptographic digest presence in the manifest.
Manifest fields
| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Skill name |
| version | str | Yes | Skill version |
| source_url | str | Yes | Source repository URL (must be HTTPS and from a trusted domain) |
| registry | str | Yes | Registry domain the skill was loaded from |
| digest | str | Yes (configurable) | Cryptographic digest in algorithm:hex_digest format. Supported algorithms: sha256, sha384, sha512 |
| download_count | int | No | Number of downloads (used for anomaly detection) |
Verification checks
The guard runs five checks on every manifest:
| Check | Description |
|---|---|
| Required fields | name and version must be present and non-empty |
| Registry validation | Blocks known untrusted registries (clawdhub.com, moltbot.io, skillhub.ai, agentstore.dev, llm-tools.net). When trusted_registries is set, acts as an allowlist instead |
| Source URL validation | Domain must be in the trusted list; scheme must be HTTPS |
| Digest validation | Must follow algorithm:hex_digest format with a supported algorithm and correct hex length |
| Download anomaly detection | Flags counts below 10 (possibly planted) or counts divisible by 1000 (possible bot inflation) |
Additionally, manifest values (excluding metadata fields like name, description, source_url) are scanned for suspicious code patterns such as eval(), exec(), os.system(), and credential access attempts.
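Two of the checks above, digest validation and download anomaly detection, are simple enough to sketch directly. The functions below are illustrative and are not the guard's implementation: digest_findings and download_findings are hypothetical names, the finding messages are made up, and the hex lengths follow from the output sizes of the listed algorithms.

```python
import re

# Hex digest lengths for the supported algorithms listed on this page.
HEX_LENGTHS = {"sha256": 64, "sha384": 96, "sha512": 128}


def digest_findings(digest):
    """Return a list of problems with an algorithm:hex_digest string."""
    algorithm, sep, hex_part = digest.partition(":")
    if not sep:
        return ["digest is not in algorithm:hex_digest format"]
    expected = HEX_LENGTHS.get(algorithm)
    if expected is None:
        return [f"unsupported digest algorithm: {algorithm}"]
    if not re.fullmatch(r"[0-9a-fA-F]{%d}" % expected, hex_part):
        return [f"{algorithm} digest must be {expected} hex characters"]
    return []


def download_findings(count):
    """Flag the two download-count anomalies described in the table above."""
    if count < 10:
        return ["download count below 10 (possibly planted)"]
    if count % 1000 == 0:
        return ["download count divisible by 1000 (possible bot inflation)"]
    return []


manifest_digest = "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
print(digest_findings(manifest_digest))
print(download_findings(150))
print(download_findings(2000))
```

The digest and the count of 150 come from the usage example above and produce no findings, while a count of 2000 is flagged as possible inflation.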
verify_skill returns a result with the following fields:

| Field | Type | Description |
|---|---|---|
| verified | bool | True if all checks pass |
| status | str | "TRUSTED" or "BLOCKED" |
| skill_name | str | Name from the manifest |
| risk_level | str | "none", "medium", or "high" |
| findings | list[str] | All security findings |
| message | str | Human-readable summary |
Server-level validation
When the QWED_SKILL_MANIFEST environment variable is set, the MCP server validates the manifest at startup:
QWED_SKILL_MANIFEST=/path/to/skill-manifest.json qwed-mcp
If validation fails, the server logs the error and exits immediately. This prevents poisoned skills from being loaded into the MCP pipeline.
Skills from registries like clawdhub.com and moltbot.io are blocked by default due to insufficient vetting. If you need to load skills from a custom registry, use the trusted_registries parameter to set an explicit allowlist.
RiskBasedExecutionGateway (governance)
Verification-first governance gateway that validates all MCP tool calls before dispatch. Every call to execute_python_code or verification_status passes through this gateway automatically.
Description
RiskBasedExecutionGateway enforces deterministic policy checks on every tool invocation. It normalizes arguments, runs code safety analysis, enforces admin policy, and returns a structured governance decision. If the gateway blocks a request, the MCP server returns the decision directly without executing the tool.
The gateway is instantiated automatically when the MCP server starts — you do not need to configure it separately.
How it works
- Policy lookup: The gateway checks the tool name against its internal policy table. Unknown tools are blocked immediately (QWED-MCP-RISK-001).
- Argument validation: Required arguments are checked for type and presence. For execute_python_code, the code parameter must be a non-empty string and background must be a boolean.
- Code safety verification: For execute_python_code, the gateway runs AST-based analysis to detect dangerous patterns (e.g., eval, exec, compile, open, __import__, os.system, os.popen, subprocess, pickle.loads, marshal.loads). The analyzer resolves import aliases and from ... import renames to catch obfuscated calls such as import os as x; x.system(...) or from os import popen as op; op(...). Raw pattern-based checks are used as a fallback only when AST parsing fails.
- Admin policy enforcement: Even if code passes safety verification, the gateway checks QWED_MCP_TRUSTED_CODE_EXECUTION. If not enabled, the request is blocked with BLOCKED_ADMIN_POLICY.
- UUID validation: For verification_status, the job_id must be a valid canonical UUID.
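The alias resolution in the code safety step can be illustrated with Python's ast module. This is a simplified sketch under stated assumptions, not the gateway's analyzer: the function names are hypothetical, only a subset of the listed patterns is covered, and it ignores cases such as reassignment or user-defined functions that shadow a built-in.

```python
import ast

# Subset of the patterns named above, keyed by fully qualified name.
DANGEROUS_CALLS = {
    "eval", "exec", "compile", "open", "__import__",
    "os.system", "os.popen", "pickle.loads", "marshal.loads",
}
DANGEROUS_MODULES = {"subprocess"}


def dotted_name(node):
    """Turn a Name/Attribute chain such as x.system into 'x.system'."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def collect_aliases(tree):
    """Map each imported local name to the qualified name it refers to."""
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for item in node.names:
                if item.asname:
                    aliases[item.asname] = item.name  # import os as x
                else:
                    root = item.name.split(".")[0]  # import os.path binds "os"
                    aliases[root] = root
        elif isinstance(node, ast.ImportFrom) and node.module:
            for item in node.names:
                # from os import popen as op  ->  op refers to os.popen
                aliases[item.asname or item.name] = f"{node.module}.{item.name}"
    return aliases


def find_dangerous_calls(source):
    """Return the resolved names of dangerous calls found in source."""
    tree = ast.parse(source)
    aliases = collect_aliases(tree)
    findings = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        if name is None:
            continue
        head, _, rest = name.partition(".")
        full_head = aliases.get(head, head)
        resolved = f"{full_head}.{rest}" if rest else full_head
        if resolved in DANGEROUS_CALLS or resolved.split(".")[0] in DANGEROUS_MODULES:
            findings.append(resolved)
    return findings


print(find_dangerous_calls("import os as x\nx.system('ls')"))
print(find_dangerous_calls("from os import popen as op\nop('ls')"))
print(find_dangerous_calls("print('hello world')"))
```

The three calls print ['os.system'], ['os.popen'], and [] respectively. Collecting aliases in a first pass and resolving calls in a second keeps the result independent of where the import statement appears in the script.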
Usage
The gateway is used internally by the MCP server. You can also use it directly in custom integrations:
from qwed_mcp.security import RiskBasedExecutionGateway
gateway = RiskBasedExecutionGateway()
decision = gateway.evaluate_and_route("execute_python_code", {
    "code": "print('hello world')",
    "background": False,
})
print(decision["verified"]) # True or False
print(decision["status"]) # "ALLOW_VERIFIED", "BLOCKED", or "BLOCKED_ADMIN_POLICY"
print(decision["verification_id"]) # Deterministic SHA-256 fingerprint
print(decision["normalized_arguments"]) # Cleaned arguments
Decision response fields
| Field | Type | Description |
|---|---|---|
| verified | bool | True if the request is approved for execution |
| status | str | ALLOW_VERIFIED, BLOCKED, or BLOCKED_ADMIN_POLICY |
| risk_level | str | high or low depending on the tool |
| verification_id | str | Deterministic SHA-256 fingerprint of the tool name and normalized arguments |
| normalized_arguments | dict | Cleaned and validated arguments |
| message | str | Human-readable explanation of the decision |
| error_code | str | Present on blocked decisions. See governance error codes |
Governance error codes
| Error code | Tool | Trigger |
|---|---|---|
| QWED-MCP-RISK-001 | Any | Unknown tool name not in the policy table |
| QWED-MCP-RISK-002 | Any | Tool exists in policy but has no governance handler |
| QWED-MCP-RISK-003 | execute_python_code | Missing or empty code argument |
| QWED-MCP-RISK-004 | execute_python_code | background is not a boolean |
| QWED-MCP-RISK-005 | execute_python_code | Code safety verification failed or raised an error |
| QWED-MCP-RISK-006 | execute_python_code | Code passed verification but QWED_MCP_TRUSTED_CODE_EXECUTION is not true |
| QWED-MCP-RISK-007 | verification_status | Missing or empty job_id argument |
| QWED-MCP-RISK-008 | verification_status | job_id is not a valid canonical UUID |
The gateway defines built-in policies for each registered tool:
| Tool | Risk level | Requires verification |
|---|---|---|
| execute_python_code | high | Yes |
| verification_status | low | Yes |
Tools not in this table are blocked by default with QWED-MCP-RISK-001.
The verification_id is a deterministic SHA-256 hash of the tool name and normalized arguments. Two identical requests always produce the same verification_id, which you can use for auditing and deduplication.
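The way the policy table and the error codes fit together can be sketched as a small decision function. This is a simplified illustration, not the gateway's code: evaluate and blocked are hypothetical names, the messages are abbreviated or made up, and the code safety (QWED-MCP-RISK-005) and UUID (QWED-MCP-RISK-008) checks are left as comments.

```python
POLICY = {
    "execute_python_code": "high",
    "verification_status": "low",
}


def blocked(error_code, message, status="BLOCKED"):
    """Build a blocked decision in the shape of the fields listed above."""
    return {"verified": False, "status": status,
            "error_code": error_code, "message": message}


def evaluate(tool_name, arguments, trusted_execution=False):
    """Return a governance decision for one tool call (sketch)."""
    if tool_name not in POLICY:
        return blocked("QWED-MCP-RISK-001", "Unknown MCP tool")
    if tool_name == "execute_python_code":
        code = arguments.get("code")
        if not isinstance(code, str) or not code:
            return blocked("QWED-MCP-RISK-003",
                           "Missing required non-empty 'code' argument.")
        if not isinstance(arguments.get("background", False), bool):
            return blocked("QWED-MCP-RISK-004",
                           "'background' must be a boolean when provided.")
        # Code safety verification (QWED-MCP-RISK-005) would run here.
        if not trusted_execution:
            return blocked("QWED-MCP-RISK-006",
                           "Code execution is disabled by server policy.",
                           status="BLOCKED_ADMIN_POLICY")
    else:  # verification_status
        job_id = arguments.get("job_id")
        if not isinstance(job_id, str) or not job_id:
            return blocked("QWED-MCP-RISK-007",
                           "Missing required non-empty 'job_id' argument.")
        # Canonical UUID validation (QWED-MCP-RISK-008) would run here.
    return {"verified": True, "status": "ALLOW_VERIFIED",
            "risk_level": POLICY[tool_name]}


print(evaluate("verify_math", {})["error_code"])
print(evaluate("execute_python_code", {"code": "print(1)"})["status"])
```

The first call prints QWED-MCP-RISK-001 because verify_math is not in the policy table, and the second prints BLOCKED_ADMIN_POLICY because trusted execution defaults to off in this sketch.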
Error handling
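Every response from an executed script includes stdout, stderr, and a return code summary. A non-zero return code indicates the script raised an exception or exited with an error. Requests blocked by the gateway return a BLOCKED or BLOCKED_ADMIN_POLICY message instead.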
Common errors
| Error | Cause | Solution |
|---|---|---|
| BLOCKED: Unknown MCP tool | Tool name not in the governance policy table | Use execute_python_code or verification_status |
| BLOCKED: Missing required non-empty 'code' argument | Empty or missing code parameter | Provide Python code in the code field |
| BLOCKED: QWED blocked python execution | Code contains dangerous patterns | Remove eval, exec, compile, open, __import__, os.system, os.popen, subprocess, pickle.loads, marshal.loads, or similar calls. Aliased imports are also detected. |
| BLOCKED_ADMIN_POLICY | QWED_MCP_TRUSTED_CODE_EXECUTION not set | Ask the server admin to set QWED_MCP_TRUSTED_CODE_EXECUTION=true |
| BLOCKED: Invalid job_id format | job_id is not a canonical UUID | Use the UUID returned by the background job |
| Execution timed out after 30.0 seconds | Script exceeded synchronous time limit | Optimize the script, break it into smaller steps, or use background=true |
| ModuleNotFoundError in stderr | Missing QWED SDK package | Install the required package (e.g., pip install qwed-legal) |