> ## Documentation Index
> Fetch the complete documentation index at: https://docs.qwedai.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Determinism guarantee

> How QWED classifies verification engines by determinism level: fully symbolic, hybrid with LLM fallback, and heuristic, and what to trust in production.

> **TL;DR:** QWED uses **deterministic solvers** (SymPy, Z3, AST, SQLGlot) wherever possible. When LLM fallback is required, the response is explicitly marked as `HEURISTIC`.

***

## Engine classification

Every QWED verification engine falls into one of three categories:

| Category          | Engines                               | Technology                      | Reproducible?  |
| ----------------- | ------------------------------------- | ------------------------------- | -------------- |
| **100% Symbolic** | Math, Logic, Code, SQL, Schema, Taint | SymPy, Z3, AST, SQLGlot         | ✅ Yes          |
| **Hybrid**        | Fact, Stats                           | TF-IDF / Sandbox → LLM fallback | ⚠️ Conditional |
| **Heuristic**     | Image, Consensus, Reasoning           | VLM / Multi-LLM voting          | ❌ No           |

### Detailed breakdown

| Engine        | Mode      | Primary Technology          | LLM Fallback?                         |
| ------------- | --------- | --------------------------- | ------------------------------------- |
| **Math**      | Symbolic  | SymPy (symbolic algebra)    | ❌ Never                               |
| **Logic**     | Symbolic  | Z3 Theorem Prover (SMT)     | ❌ Never                               |
| **Code**      | Symbolic  | Python AST + Bandit         | ❌ Never                               |
| **SQL**       | Symbolic  | SQLGlot AST parser          | ❌ Never                               |
| **Schema**    | Symbolic  | Pydantic + SymPy            | ❌ Never                               |
| **Taint**     | Symbolic  | Data Flow Analysis (AST)    | ❌ Never                               |
| **Fact**      | Hybrid    | TF-IDF similarity           | ✅ When TF-IDF confidence \< threshold |
| **Stats**     | Hybrid    | Wasm/Docker sandbox         | ✅ When sandbox execution fails        |
| **Image**     | Heuristic | Vision LLM (GPT-4V, Claude) | ✅ Always                              |
| **Consensus** | Heuristic | Multi-LLM voting            | ✅ Always                              |
| **Reasoning** | Heuristic | Chain-of-thought LLM        | ✅ Always                              |

***

## How to know which mode was used

Every API response includes a `verification_mode` field:

```json theme={null}
{
  "status": "VERIFIED",
  "verified": true,
  "engine": "math",
  "verification_mode": "SYMBOLIC",
  "result": { ... }
}
```

### `verification_mode` values

| Value       | Meaning                                              | Trust Level                        |
| ----------- | ---------------------------------------------------- | ---------------------------------- |
| `SYMBOLIC`  | Result proven by deterministic solver (SymPy/Z3/AST) | 🟢 **100% reproducible**           |
| `HEURISTIC` | Result from LLM fallback or multi-model voting       | 🟡 **Best-effort, not guaranteed** |

***

## Why this matters

### For regulated industries

Banks, healthcare, and legal systems require **audit trails** and **reproducibility**:

* ✅ `SYMBOLIC` results can be independently verified
* ⚠️ `HEURISTIC` results should be flagged for human review

### For production systems

```python theme={null}
result = client.verify_math("2+2=4")

if result.verification_mode == "SYMBOLIC":
    # Safe to use without human review
    process_result(result)
elif result.verification_mode == "HEURISTIC":
    # Flag for human review or additional validation
    queue_for_review(result)
```

***

## Trust boundaries in translated queries

When a natural language query is verified through the math pipeline, the response status is `INCONCLUSIVE` — even when the expression evaluation itself is fully deterministic. This is because the LLM translation step (natural language → expression) is not formally verified.

Each response includes a `trust_boundary` object that makes this explicit:

```python theme={null}
result = client.verify("What is 15% of 200?")

print(result.status)  # "INCONCLUSIVE"
print(result.trust_boundary["deterministic_expression_evaluation"])  # True
print(result.trust_boundary["query_semantics_verified"])  # False
```

Use the `trust_boundary` fields to decide how to handle the result in your application. For example, you might accept the computed answer while logging that the translation was not formally proven. See [Math Engine — Trust boundary](/engines/math#trust-boundary) for the full field reference.

***

## Design philosophy

> **"Probabilistic systems should not be trusted with deterministic tasks."**

QWED treats LLMs as **untrusted translators**:

1. LLM converts natural language → formal specification
2. **Deterministic solver** verifies the formal specification
3. If solver cannot verify, result is marked `HEURISTIC`
4. If the translation itself cannot be formally verified, result is marked `INCONCLUSIVE` with a `trust_boundary`

```
┌─────────────────────────────────────────────────────────┐
│                    QWED Protocol                        │
├─────────────────────────────────────────────────────────┤
│  User Query → LLM (Translator) → DSL → Solver → Result │
│                                                         │
│  Solver = SymPy/Z3/AST  →  SYMBOLIC                     │
│  Solver = LLM Fallback  →  HEURISTIC                    │
│  Translation unverified  →  INCONCLUSIVE + trust_boundary│
└─────────────────────────────────────────────────────────┘
```

***

## Trust boundaries in the natural language pipeline

When a query enters through `POST /verify/natural_language`, an LLM first translates the user's natural language into a formal expression. Even when the downstream solver (e.g., SymPy) evaluates that expression deterministically, the overall result depends on whether the LLM correctly interpreted the user's intent — and that step is **not** deterministic.

To make this distinction explicit, the natural language math pipeline now:

1. Returns `INCONCLUSIVE` as the top-level status instead of `VERIFIED`, even when the expression evaluation succeeds.
2. Includes a `trust_boundary` object in the response that describes exactly what was and was not proven.

```json theme={null}
{
  "trust_boundary": {
    "query_interpretation_source": "llm_translation",
    "query_semantics_verified": false,
    "verification_scope": "translated_expression_only",
    "deterministic_expression_evaluation": true,
    "formal_proof": false,
    "translation_claim_self_consistent": true,
    "provider_used": "openai",
    "overall_status": "INCONCLUSIVE"
  }
}
```

This means a `VERIFIED` result from the Math engine's direct endpoint (`POST /verify/math`) carries stronger guarantees than the same calculation routed through the natural language pipeline. See the [Math engine](/engines/math#trust-boundary) documentation for the full field reference.

### Numerical sampling fallback

The identity verification fallback (numerical sampling at fixed test points) now **fails closed** — returning `BLOCKED` with `is_equivalent: false` and `confidence: 0.0`. Previously this returned `UNKNOWN` with `is_equivalent: null` (and before that, `LIKELY_EQUIVALENT` with `confidence: 0.99`). Matching at a handful of sample points does not constitute a formal proof, so the engine now rejects the result outright rather than leaving the outcome ambiguous.

***

## Frequently asked questions

### Q: Can I filter responses by verification\_mode?

**A:** Yes! Use the `require_symbolic` option in your request:

```python theme={null}
result = client.verify(
    query="Calculate NPV",
    options={"require_symbolic": True}
)
# Fails if solver cannot verify deterministically
```

### Q: Why is Fact verification sometimes HEURISTIC?

**A:** Fact verification uses TF-IDF (deterministic) to find evidence. If TF-IDF confidence is below threshold, it falls back to NLI (LLM-based), marking the result as `HEURISTIC`.

### Q: Is Consensus useful if it's HEURISTIC?

**A:** Yes! Consensus detects **disagreement** between multiple LLMs. While not deterministic, it catches cases where models are uncertain — useful as a safety net, not a formal verifier.

***

## See also

* [Engines overview](/engines/overview) — Full engine documentation
* [API endpoints](/api/endpoints) — Response schema reference
* [Whitepaper](/whitepaper) — Academic justification for neurosymbolic approach