# Stats engine

> QWED's Stats Engine executes statistical queries on tabular data using the secure Docker sandbox. In-process fallbacks are disabled.

The Stats Engine executes statistical queries on tabular data. All model-generated code runs inside a secure Docker sandbox — in-process execution paths (Wasm and restricted Python) are disabled.

## Features

* **Docker sandbox** — Full container isolation for all statistical code execution
* **Fail-closed execution** — If Docker is unavailable, verification is blocked rather than falling back to in-process execution
* **Pre-execution security validation** — AST-based code analysis before Docker execution
* **Live Docker health checks** — The executor verifies Docker availability on each request, not just at startup

## Prerequisites

The Stats Engine requires a running Docker daemon. Without Docker, all statistical verification requests return HTTP 503. See the [deployment guide](/advanced/deployment) for setup instructions.
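
Before sending requests, it can help to confirm that the daemon is actually reachable from the host that runs QWED. The following is a minimal sketch of such a probe, not the engine's own health check: the function name `docker_available` and the timeout value are assumptions, and it relies only on the standard `docker info` CLI command, which exits with a non-zero status when the daemon cannot be contacted.

```python
import shutil
import subprocess


def docker_available(timeout_seconds: float = 5.0) -> bool:
    """Return True only if the Docker CLI exists and the daemon answers.

    Hypothetical helper for illustration; not part of the QWED SDK.
    """
    # No CLI on PATH means there is nothing to probe.
    if shutil.which("docker") is None:
        return False
    try:
        # `docker info` exits non-zero when the daemon cannot be reached.
        completed = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Treat any failure to run the probe as "unavailable" (fail closed).
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    print("Docker reachable:", docker_available())
```

A `False` result here suggests that statistical verification requests would be rejected with HTTP 503 until the daemon is started.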

## Usage

```python
import pandas as pd
from qwed_sdk import QWEDClient

client = QWEDClient(api_key="qwed_...")

# Create sample data
df = pd.DataFrame({
    "product": ["A", "B", "C"],
    "sales": [100, 200, 150]
})

# Verify statistical claim
result = client.verify_stats(
    query="What is the average sales?",
    data=df
)
print(result.answer)  # 150.0
```

## Execution model

All generated statistical code is executed inside a Docker container with enforced memory and CPU limits. The engine does not fall back to in-process execution under any circumstances.

| Scenario                                 | Behavior                                                      |
| ---------------------------------------- | ------------------------------------------------------------- |
| Docker running                           | Code executes in an isolated container                        |
| Docker unavailable at startup            | Requests return `503 Service Temporarily Unavailable`         |
| Docker becomes unavailable mid-operation | Request is blocked and returns `503`                          |
| Code fails AST security check            | Request returns `403 Verification Blocked by Security Policy` |
| Code generation fails                    | Request returns `Internal verification error`                 |

<Warning>
  Previous versions of QWED offered Wasm and restricted Python fallbacks when Docker was unavailable. These fallback paths have been removed. You must have a running Docker daemon for statistical verification to work.
</Warning>
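
To make the "AST security check" row more concrete, here is a rough sketch of what an AST-based pre-execution check can look like. It is an illustration of the general technique only: the allow-list, the deny-list, and the function name `find_violations` are assumptions, and QWED's actual security policy is not described on this page.

```python
import ast

# Hypothetical policy for illustration; not QWED's actual rules.
BLOCKED_CALLS = {"eval", "exec", "open", "compile", "__import__"}
ALLOWED_IMPORTS = {"pandas", "numpy", "math", "statistics"}


def find_violations(source: str) -> list[str]:
    """Return policy violations found in the source; an empty list means it passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # Code that does not parse is rejected rather than executed.
        return [f"syntax error: {exc.msg}"]

    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    violations.append(f"import of '{alias.name}'")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] not in ALLOWED_IMPORTS:
                violations.append(f"import from '{node.module}'")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to '{node.func.id}'")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            # Dunder attribute access is a common sandbox-escape route.
            violations.append(f"dunder attribute '{node.attr}'")
    return violations


if __name__ == "__main__":
    print(find_violations("import pandas as pd\nresult = df['sales'].mean()"))
    print(find_violations("import os\nos.system('ls')"))
```

In a design like this, a non-empty violation list would map to the `403` response in the table above, and the code would never reach the container.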

## Error handling

When the Stats Engine encounters an internal failure — such as a code generation or translation error — it returns a generic `"Internal verification error"` message. Sensitive details like file paths, credentials, or stack traces are never included in the API response.

If you receive this error, check the server-side logs for diagnostic details. The engine logs the exception type for debugging while keeping the client response opaque.
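
The pattern described above (log the exception type on the server, return an opaque message to the client) can be sketched as follows. This is an illustration under assumptions, not the engine's code: the helper `run_opaque`, the logger name, and the response shape are hypothetical.

```python
import logging

logger = logging.getLogger("stats_engine_sketch")  # hypothetical logger name

GENERIC_ERROR = "Internal verification error"


def run_opaque(step):
    """Run a callable and keep failure details out of the returned payload."""
    try:
        return {"status": "SUCCESS", "result": step()}
    except Exception as exc:
        # Log only the exception type; the message may contain paths or secrets.
        logger.error("stats step failed: %s", type(exc).__name__)
        return {"status": "ERROR", "error": GENERIC_ERROR}


def failing_step():
    # Simulates an internal failure whose message contains a sensitive path.
    raise FileNotFoundError("/srv/secrets/token.txt")


if __name__ == "__main__":
    print(run_opaque(failing_step))
```

The returned dictionary carries only the generic message, while the exception type remains available in the server-side log.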

## Direct operations

For simple aggregate operations on a single column, call `compute_statistics` directly to bypass code generation:

```python
result = client.compute_statistics(
    data=df,
    column="sales",
    operation="mean"  # mean, median, std, var, sum, count, min, max, mode
)
```

| Operation | Description                               |
| --------- | ----------------------------------------- |
| `mean`    | Arithmetic mean of the column             |
| `median`  | Median value                              |
| `std`     | Standard deviation                        |
| `var`     | Variance                                  |
| `sum`     | Sum of all values                         |
| `count`   | Number of non-NaN values                  |
| `min`     | Minimum value                             |
| `max`     | Maximum value                             |
| `mode`    | Most frequent value (fails if multimodal) |

### Fail-closed validation

`compute_statistics` returns `SUCCESS` only when the result is clearly defined and safely verifiable. It returns `ERROR` in the following cases:

| Condition                                                | Error                                                         |
| -------------------------------------------------------- | ------------------------------------------------------------- |
| Column not found                                         | `Column '{name}' not found`                                   |
| Unknown operation                                        | `Unknown operation '{name}'`                                  |
| Multiple modes (multimodal data)                         | `mode is ambiguous because {n} equally frequent values exist` |
| Mode with no values                                      | `mode produced an undefined result (NaN)`                     |
| Result is NaN (includes empty series or all-NaN columns) | `{operation} produced an undefined result (NaN)`              |

Empty series and all-NaN columns are caught by the NaN result check — if the underlying pandas operation returns `NaN`, the method returns an `ERROR` status rather than propagating the undefined value.

```python
import pandas as pd
from qwed_sdk import QWEDClient

client = QWEDClient(api_key="qwed_...")

# Empty series — returns ERROR (NaN result)
df_empty = pd.DataFrame({"col": pd.Series([], dtype="float64")})
result = client.compute_statistics(data=df_empty, column="col", operation="mean")
print(result["status"])  # ERROR

# Multimodal data — returns ERROR for mode
df_multi = pd.DataFrame({"col": [1, 1, 2, 2]})
result = client.compute_statistics(data=df_multi, column="col", operation="mode")
print(result["status"])  # ERROR
```
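
For intuition, the checks in the table can be reproduced locally with plain pandas. The sketch below is an approximation written for this page, not the engine's implementation: the function `safe_statistic` and its return shape are assumptions, and only the error strings are taken from the table above.

```python
import pandas as pd

OPERATIONS = {"mean", "median", "std", "var", "sum", "count", "min", "max", "mode"}


def safe_statistic(df: pd.DataFrame, column: str, operation: str) -> dict:
    """Local illustration of fail-closed checks (hypothetical helper)."""
    if column not in df.columns:
        return {"status": "ERROR", "error": f"Column '{column}' not found"}
    if operation not in OPERATIONS:
        return {"status": "ERROR", "error": f"Unknown operation '{operation}'"}

    series = df[column]
    if operation == "mode":
        modes = series.mode()  # pandas returns every tied value
        if len(modes) > 1:
            return {
                "status": "ERROR",
                "error": f"mode is ambiguous because {len(modes)} "
                         "equally frequent values exist",
            }
        # An empty result means there were no values to count.
        value = modes.iloc[0] if len(modes) == 1 else float("nan")
    else:
        value = getattr(series, operation)()

    # Refuse to return an undefined value instead of propagating NaN.
    if pd.isna(value):
        return {
            "status": "ERROR",
            "error": f"{operation} produced an undefined result (NaN)",
        }
    return {"status": "SUCCESS", "result": value}


if __name__ == "__main__":
    print(safe_statistic(pd.DataFrame({"col": [1, 1, 2, 2]}), "col", "mode"))
```

The sketch also shows why the NaN check covers empty input: pandas returns `NaN` for the mean of an empty `float64` series, so no separate emptiness test is needed.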
