Stats engine - QWED Documentation

The Stats Engine executes statistical queries on tabular data. All model-generated code runs inside a Docker sandbox; the in-process execution paths used by earlier versions (Wasm and restricted Python) are no longer available.

Features

Docker sandbox — Full container isolation for all statistical code execution
Fail-closed execution — If Docker is unavailable, verification is blocked rather than falling back to in-process execution
Pre-execution security validation — AST-based code analysis before Docker execution
Live Docker health checks — The executor verifies Docker availability on each request, not just at startup
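To make the pre-execution validation step more concrete, here is a minimal sketch of an AST-based check built on Python's standard `ast` module. The deny-lists `BLOCKED_MODULES` and `BLOCKED_CALLS` and the function `find_violations` are hypothetical names chosen for illustration; QWED's actual security policy is not described on this page and may differ substantially.

```python
import ast

# Hypothetical deny-lists for illustration only; not QWED's real policy.
BLOCKED_MODULES = {"os", "sys", "subprocess", "socket", "shutil"}
BLOCKED_CALLS = {"eval", "exec", "open", "__import__", "compile"}


def find_violations(source: str) -> list[str]:
    """Return policy violations found in `source` without executing it."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # Fail closed: code that cannot be parsed is rejected outright.
        return [f"syntax error: {exc.msg}"]

    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"import of '{alias.name}'")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                violations.append(f"import from '{node.module}'")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only direct calls by bare name are checked in this sketch.
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to '{node.func.id}'")
    return violations


safe = "import pandas as pd\nresult = df['sales'].mean()"
unsafe = "import os\nos.system('id')"
print(find_violations(safe))    # []
print(find_violations(unsafe))  # ["import of 'os'"]
```

A deny-list like this is easy to bypass on its own, which is presumably why the engine pairs static analysis with container isolation rather than relying on either one alone.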

Prerequisites

The Stats Engine requires a running Docker daemon. Without Docker, all statistical verification requests return HTTP 503. See the deployment guide for setup instructions.
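Before sending requests, it can help to confirm that the host running the engine can reach the Docker daemon. The sketch below probes the daemon with the standard `docker info` CLI command. The helper name `docker_available` is hypothetical, and this is not necessarily how the Stats Engine performs its own health check.

```python
import shutil
import subprocess


def docker_available(binary: str = "docker", timeout: float = 5.0) -> bool:
    """Return True only if the Docker CLI exists and the daemon answers `docker info`."""
    if shutil.which(binary) is None:
        return False  # CLI not installed or not on PATH
    try:
        completed = subprocess.run(
            [binary, "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # any failure to query the daemon counts as unavailable
    # `docker info` exits non-zero when the daemon cannot be reached.
    return completed.returncode == 0


if docker_available():
    print("Docker daemon reachable")
else:
    print("Docker daemon not reachable: statistical requests would return HTTP 503")
```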

Usage

import pandas as pd
from qwed_sdk import QWEDClient

client = QWEDClient(api_key="qwed_...")

# Create sample data
df = pd.DataFrame({
    "product": ["A", "B", "C"],
    "sales": [100, 200, 150]
})

# Ask a statistical question about the data
result = client.verify_stats(
    query="What is the average sales?",
    data=df
)
print(result.answer)  # 150.0

Execution model

All generated statistical code is executed inside a Docker container with enforced memory and CPU limits. The engine does not fall back to in-process execution under any circumstances.
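As an illustration of what enforced memory and CPU limits can look like, the sketch below assembles a `docker run` command line using standard Docker CLI flags (`--memory`, `--cpus`, `--network none`, `--read-only`). The image name, the limit values, the script path, and the helper `build_docker_command` are all assumptions made for this example; the limits QWED actually applies are not stated on this page.

```python
import shlex


def build_docker_command(
    script_path: str,
    image: str = "python:3.12-slim",  # assumed image, for illustration
    memory: str = "256m",             # assumed memory cap
    cpus: str = "0.5",                # assumed CPU quota
) -> list[str]:
    """Build (but do not run) a resource-limited `docker run` invocation."""
    return [
        "docker", "run",
        "--rm",                 # remove the container when it exits
        "--network", "none",    # no network access from inside the sandbox
        "--memory", memory,     # hard memory limit
        "--cpus", cpus,         # CPU limit
        "--read-only",          # read-only root filesystem
        "--volume", f"{script_path}:/sandbox/script.py:ro",
        image,
        "python", "/sandbox/script.py",
    ]


command = build_docker_command("/tmp/generated_stats.py")  # hypothetical path
print(shlex.join(command))
```

Building the argument list separately from executing it keeps the limits easy to inspect and test.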

Scenario	Behavior
Docker running	Code executes in an isolated container
Docker unavailable at startup	Requests return `503 Service Temporarily Unavailable`
Docker becomes unavailable mid-operation	Request is blocked and returns `503`
Code fails AST security check	Request returns `403 Verification Blocked by Security Policy`
Code generation fails	Request returns `Internal verification error`
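The table above can be read as a decision procedure with no fallback branch. The sketch below encodes it as a plain function. The name `decide`, the callables passed in, and the order in which the Docker and AST checks run are assumptions; only the outcomes themselves are taken from the table.

```python
from typing import Callable, Optional


def decide(
    generated_code: Optional[str],
    docker_is_up: Callable[[], bool],
    passes_ast_check: Callable[[str], bool],
) -> str:
    """Map a request to one of the documented outcomes; never run in-process."""
    if generated_code is None:
        # Code generation failed: the client sees only a generic message.
        return "Internal verification error"
    if not docker_is_up():
        # Checked per request, so a daemon that dies mid-operation is caught too.
        return "503 Service Temporarily Unavailable"
    if not passes_ast_check(generated_code):
        return "403 Verification Blocked by Security Policy"
    return "execute in isolated container"


print(decide("df['sales'].mean()", lambda: True, lambda code: True))
print(decide("df['sales'].mean()", lambda: False, lambda code: True))
```

Every path that does not reach the container ends in an error, which is what fail-closed means here.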

Previous versions of QWED offered Wasm and restricted Python fallbacks when Docker was unavailable. These fallback paths have been removed. You must have a running Docker daemon for statistical verification to work.

Error handling

When the Stats Engine encounters an internal failure — such as a code generation or translation error — it returns a generic "Internal verification error" message. Sensitive details like file paths, credentials, or stack traces are never included in the API response. If you receive this error, check the server-side logs for diagnostic details. The engine logs the exception type for debugging while keeping the client response opaque.
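The pattern described here, logging the exception type server-side while returning an opaque message, can be sketched as follows. The wrapper `run_opaque`, the logger name, and the response keys are hypothetical and are not taken from the QWED codebase.

```python
import logging

logger = logging.getLogger("stats_engine_sketch")  # hypothetical logger name
GENERIC_MESSAGE = "Internal verification error"


def run_opaque(operation):
    """Run `operation`; on failure, log the exception type and return a generic error."""
    try:
        return {"status": "SUCCESS", "result": operation()}
    except Exception as exc:
        # Log only the exception type, not the message, which may contain
        # file paths or other sensitive details.
        logger.error("verification failed: %s", type(exc).__name__)
        return {"status": "ERROR", "error": GENERIC_MESSAGE}


def leaky_operation():
    # Simulates an internal failure whose message contains a sensitive path.
    raise FileNotFoundError("/srv/secrets/db_password.txt")


response = run_opaque(leaky_operation)
print(response)  # {'status': 'ERROR', 'error': 'Internal verification error'}
```

The client-facing dictionary never contains the path from the exception message; only the type name `FileNotFoundError` reaches the log.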

Direct operations

For simple operations, call `compute_statistics` directly to bypass code generation:

result = client.compute_statistics(
    data=df,
    column="sales",
    operation="mean"  # mean, median, std, var, sum, count, min, max, mode
)

Operation	Description
`mean`	Arithmetic mean of the column
`median`	Median value
`std`	Standard deviation
`var`	Variance
`sum`	Sum of all values
`count`	Number of non-NaN values
`min`	Minimum value
`max`	Maximum value
`mode`	Most frequent value (returns ERROR if the data is multimodal)
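For reference, the sketch below shows what the pandas `Series` methods with the same names return on a small sample that includes one missing value. This page does not say whether QWED uses pandas defaults for every operation. In particular, pandas computes the sample standard deviation and variance (`ddof=1`) by default, so it is an assumption that `std` and `var` behave the same way in the engine.

```python
import pandas as pd

# None becomes NaN, so the column holds three real values and one missing value.
sales = pd.Series([100, 200, 150, None], name="sales")

print(sales.mean())    # 150.0 (NaN is skipped)
print(sales.median())  # 150.0
print(sales.count())   # 3 (non-NaN values only)
print(sales.sum())     # 450.0
print(sales.min())     # 100.0
print(sales.max())     # 200.0
print(sales.std())     # 50.0 (sample standard deviation, ddof=1)
print(sales.var())     # 2500.0 (sample variance, ddof=1)

# Every value occurs once, so pandas reports three modes.
modes = sales.mode()
print(len(modes))      # 3
```

Because all three values are equally frequent, a `mode` request on data like this would fall into the multimodal case.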

Fail-closed validation

`compute_statistics` returns `SUCCESS` only when the result is clearly defined and safely verifiable. It returns `ERROR` in the following cases:

Condition	Error
Column not found	`Column '{name}' not found`
Unknown operation	`Unknown operation '{name}'`
Multiple modes (multimodal data)	`mode is ambiguous because {n} equally frequent values exist`
Mode with no values	`mode produced an undefined result (NaN)`
Result is NaN (includes empty series or all-NaN columns)	`{operation} produced an undefined result (NaN)`

Empty series and all-NaN columns are caught by the NaN result check — if the underlying pandas operation returns NaN, the method returns an ERROR status rather than propagating the undefined value.

import pandas as pd

# Reuses the `client` created in the Usage example.

# Empty series — returns ERROR (NaN result)
df_empty = pd.DataFrame({"col": pd.Series([], dtype="float64")})
result = client.compute_statistics(data=df_empty, column="col", operation="mean")
print(result["status"])  # ERROR

# Multimodal data — returns ERROR for mode
df_multi = pd.DataFrame({"col": [1, 1, 2, 2]})
result = client.compute_statistics(data=df_multi, column="col", operation="mode")
print(result["status"])  # ERROR
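The checks in the table above can be approximated locally with plain pandas. The function `compute_statistics_sketch` below is a hypothetical re-implementation written for illustration, not the SDK's source. The error strings are copied from the table and the `status` key from the example above, while the `result` and `error` keys are assumptions.

```python
import pandas as pd

OPERATIONS = {"mean", "median", "std", "var", "sum", "count", "min", "max", "mode"}


def compute_statistics_sketch(data: pd.DataFrame, column: str, operation: str) -> dict:
    """Return SUCCESS only for a well-defined result; otherwise return ERROR."""
    if column not in data.columns:
        return {"status": "ERROR", "error": f"Column '{column}' not found"}
    if operation not in OPERATIONS:
        return {"status": "ERROR", "error": f"Unknown operation '{operation}'"}

    series = data[column]
    if operation == "mode":
        modes = series.mode()  # one entry per equally frequent value
        if len(modes) > 1:
            return {
                "status": "ERROR",
                "error": f"mode is ambiguous because {len(modes)} equally frequent values exist",
            }
        # An empty or all-NaN column has no mode at all.
        value = modes.iloc[0] if len(modes) == 1 else float("nan")
    else:
        value = getattr(series, operation)()

    if pd.isna(value):
        return {"status": "ERROR", "error": f"{operation} produced an undefined result (NaN)"}
    return {"status": "SUCCESS", "result": value}


df_multi = pd.DataFrame({"col": [1, 1, 2, 2]})
df_empty = pd.DataFrame({"col": pd.Series([], dtype="float64")})
print(compute_statistics_sketch(df_multi, "col", "mode")["status"])  # ERROR
print(compute_statistics_sketch(df_empty, "col", "mean")["status"])  # ERROR
print(compute_statistics_sketch(df_multi, "col", "mean"))
```

In this sketch, `sum` and `count` on an empty column succeed, because pandas returns 0 rather than NaN for them. Whether the engine treats those two cases the same way is not stated on this page.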