The Stats Engine executes statistical queries on tabular data. All model-generated code runs inside a secure Docker sandbox — in-process execution paths (Wasm and restricted Python) are disabled.
Features
- Docker sandbox — Full container isolation for all statistical code execution
- Fail-closed execution — If Docker is unavailable, verification is blocked rather than falling back to in-process execution
- Pre-execution security validation — AST-based code analysis before Docker execution
- Live Docker health checks — The executor verifies Docker availability on each request, not just at startup
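To illustrate what AST-based pre-execution validation can look like, here is a minimal sketch built on Python's standard `ast` module. The `validate_code` function and both denylists are hypothetical and chosen for illustration; they are not QWED's actual security policy, which this page does not describe.

```python
import ast

# Hypothetical denylists for illustration only; not QWED's actual policy.
BLOCKED_MODULES = {"os", "sys", "subprocess", "socket", "shutil"}
BLOCKED_CALLS = {"eval", "exec", "compile", "open", "__import__"}


def validate_code(source: str) -> list[str]:
    """Return a list of policy violations found in `source` (empty if none)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"import of '{alias.name}'")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                violations.append(f"import from '{node.module}'")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to '{node.func.id}'")
    return violations
```

A denylist like this is easy to evade on its own, so it is best read as a pre-filter in front of the container rather than a replacement for it.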
Prerequisites
The Stats Engine requires a running Docker daemon. Without Docker, all statistical verification requests return HTTP 503. See the deployment guide for setup instructions.
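Before deploying, it can help to confirm that the daemon is reachable from the host that will run the engine. The sketch below is one possible pre-flight check that shells out to the `docker info` CLI command; the `docker_available` helper is hypothetical and is not how the engine itself performs its health checks.

```python
import subprocess


def docker_available(timeout: float = 5.0, run=subprocess.run) -> bool:
    """Return True if `docker info` succeeds, False on any failure."""
    try:
        completed = run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # CLI missing, not executable, or the daemon did not answer in time.
        return False
    return completed.returncode == 0
```

The `run` parameter exists only so the check can be exercised without a real Docker installation; in normal use, call `docker_available()` with no arguments.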
Usage
import pandas as pd
from qwed_sdk import QWEDClient

client = QWEDClient(api_key="qwed_...")

# Create sample data
df = pd.DataFrame({
    "product": ["A", "B", "C"],
    "sales": [100, 200, 150]
})

# Verify statistical claim
result = client.verify_stats(
    query="What is the average sales?",
    data=df
)
print(result.answer)  # 150.0
Execution model
All generated statistical code is executed inside a Docker container with enforced memory and CPU limits. The engine does not fall back to in-process execution under any circumstances.
| Scenario | Behavior |
|---|---|
| Docker running | Code executes in an isolated container |
| Docker unavailable at startup | Requests return 503 Service Temporarily Unavailable |
| Docker becomes unavailable mid-operation | Request is blocked and returns 503 |
| Code fails AST security check | Request returns 403 Verification Blocked by Security Policy |
| Code generation fails | Request returns Internal verification error |
Previous versions of QWED offered Wasm and restricted Python fallbacks when Docker was unavailable. These fallback paths have been removed. You must have a running Docker daemon for statistical verification to work.
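The scenarios in the table can be read as a single fail-closed request flow. The following is a sketch under stated assumptions, not the engine's actual implementation: every callable passed in is hypothetical, the order of the checks is assumed, the 200 and 500 status codes are assumptions (the table gives no code for those cases), and `ConnectionError` merely stands in for losing the daemon mid-operation.

```python
from typing import Callable, Tuple


def handle_stats_request(
    query: str,
    docker_is_healthy: Callable[[], bool],
    generate_code: Callable[[str], str],
    passes_ast_check: Callable[[str], bool],
    run_in_container: Callable[[str], str],
) -> Tuple[int, str]:
    """Return (HTTP status, body) for one request; all callables are hypothetical."""
    # Health is checked per request, so a daemon that died after startup is caught.
    if not docker_is_healthy():
        return 503, "Service Temporarily Unavailable"

    try:
        code = generate_code(query)
    except Exception:
        # Opaque message; details belong in server-side logs only.
        return 500, "Internal verification error"

    if not passes_ast_check(code):
        return 403, "Verification Blocked by Security Policy"

    try:
        # The container is the only execution path; there is no in-process fallback.
        return 200, run_in_container(code)
    except ConnectionError:
        # Assumed signal for the daemon disappearing mid-operation.
        return 503, "Service Temporarily Unavailable"
```

Note that no branch executes the generated code in-process: every failure path ends in an error response.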
Error handling
When the Stats Engine encounters an internal failure — such as a code generation or translation error — it returns a generic "Internal verification error" message. Sensitive details like file paths, credentials, or stack traces are never included in the API response.
If you receive this error, check the server-side logs for diagnostic details. The engine logs the exception type for debugging while keeping the client response opaque.
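A common way to get this behavior is to log only the exception type on the server and hand the client a fixed message. The sketch below illustrates the pattern with the standard `logging` module; the logger name, the `safe_error_response` helper, and the shape of the returned payload are assumptions for illustration, not the engine's actual code.

```python
import logging

logger = logging.getLogger("stats_engine")  # hypothetical logger name

GENERIC_ERROR = "Internal verification error"


def safe_error_response(exc: Exception) -> dict:
    """Log the exception type server-side and return an opaque client payload."""
    # Only the type name is logged here; str(exc) may embed paths or secrets.
    logger.error("Stats verification failed: %s", type(exc).__name__)
    return {"status": "ERROR", "error": GENERIC_ERROR}
```

Because the response is built from a constant rather than from the exception, nothing in the exception message can leak to the caller.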
Direct operations
For simple operations, call compute_statistics directly, which bypasses code generation:
result = client.compute_statistics(
    data=df,
    column="sales",
    operation="mean"  # mean, median, std, var, sum, count, min, max, mode
)
| Operation | Description |
|---|---|
| mean | Arithmetic mean of the column |
| median | Median value |
| std | Standard deviation |
| var | Variance |
| sum | Sum of all values |
| count | Number of non-NaN values |
| min | Minimum value |
| max | Maximum value |
| mode | Most frequent value (fails if multimodal) |
Fail-closed validation
compute_statistics returns SUCCESS only when the result is clearly defined and safely verifiable. It returns ERROR in the following cases:
| Condition | Error |
|---|---|
| Column not found | Column '{name}' not found |
| Unknown operation | Unknown operation '{name}' |
| Multiple modes (multimodal data) | mode is ambiguous because {n} equally frequent values exist |
| Mode with no values | mode produced an undefined result (NaN) |
| Result is NaN (includes empty series or all-NaN columns) | {operation} produced an undefined result (NaN) |
Empty series and all-NaN columns are caught by the NaN result check — if the underlying pandas operation returns NaN, the method returns an ERROR status rather than propagating the undefined value.
import pandas as pd

# Empty series: returns ERROR (NaN result)
df_empty = pd.DataFrame({"col": pd.Series([], dtype="float64")})
result = client.compute_statistics(data=df_empty, column="col", operation="mean")
print(result["status"])  # ERROR

# Multimodal data: returns ERROR for mode
df_multi = pd.DataFrame({"col": [1, 1, 2, 2]})
result = client.compute_statistics(data=df_multi, column="col", operation="mode")
print(result["status"])  # ERROR
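The fail-closed rules above can be sketched without the SDK. The example below is a simplified, pandas-free illustration covering only `mean` and `mode` on a plain list; the `compute_statistic` function and the `result` and `error` keys are hypothetical, and only the `status` values and error messages are taken from the table above.

```python
import math
import statistics
from collections import Counter


def compute_statistic(values, operation):
    """Fail-closed sketch: return SUCCESS only for a well-defined result."""
    # Skip NaN values, mirroring the pandas default for these operations.
    clean = [v for v in values if not (isinstance(v, float) and math.isnan(v))]
    undefined = {
        "status": "ERROR",
        "error": f"{operation} produced an undefined result (NaN)",
    }

    if operation == "mode":
        if not clean:
            return undefined
        counts = Counter(clean)
        top = max(counts.values())
        modes = [v for v, c in counts.items() if c == top]
        if len(modes) > 1:
            # Ambiguous answer: refuse rather than pick one arbitrarily.
            return {
                "status": "ERROR",
                "error": f"mode is ambiguous because {len(modes)} equally frequent values exist",
            }
        return {"status": "SUCCESS", "result": modes[0]}

    if operation == "mean":
        if not clean:
            return undefined
        return {"status": "SUCCESS", "result": statistics.fmean(clean)}

    return {"status": "ERROR", "error": f"Unknown operation '{operation}'"}
```

The design choice worth noting is that ambiguity and undefined results are treated as errors, not as values, so a caller can never mistake a NaN or an arbitrary tie-break for a verified answer.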