> ## Documentation Index
> Fetch the complete documentation index at: https://docs.qwedai.com/llms.txt
> Use this file to discover all available pages before exploring further.

# QWED verification: specification guide

> How QWED verifies LLM outputs against developer-written specifications, and the golden rule for reliable verification.

> **TL;DR:** QWED verifies LLM outputs against **developer-written code specifications**, not natural language. If you let the LLM generate both the answer AND the spec, you've just verified a hallucination.

## How QWED actually works

```
┌─────────────────────────────────────────────────────────────┐
│                    CORRECT USAGE                            │
├─────────────────────────────────────────────────────────────┤
│  Developer provides:  expected_value = "100000 * 1.05**10"  │
│  LLM generates:       answer = "150000"                     │
│  QWED verifies:       150000 ≠ 162889.46 → REJECTED         │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                    INCORRECT USAGE                          │
├─────────────────────────────────────────────────────────────┤
│  LLM generates:       expected_value = "100000 * 1.05**5"   │
│  LLM generates:       answer = "127628"                     │
│  QWED verifies:       127628 ≈ 127628.16 → VERIFIED ❌      │
│  (Hallucination verified because spec was also hallucinated)│
└─────────────────────────────────────────────────────────────┘
```

## The golden rule

| Component         | Who Provides It | Example                       |
| ----------------- | --------------- | ----------------------------- |
| **Specification** | Developer (you) | `expected = "P * (1 + r)**n"` |
| **LLM Output**    | LLM             | `"$150,000"`                  |
| **Ground Truth**  | Developer (you) | `P=100000, r=0.05, n=10`      |

**QWED does NOT solve:** Natural language → Formal specification translation.

**QWED DOES solve:** "Is this LLM output correct given my known ground truth?"

***

## Examples by engine

### Math engine

```python theme={null}
from qwed_new.core.verifier import VerificationEngine

engine = VerificationEngine()

# ✅ CORRECT: Developer provides formula
result = engine.verify_compound_interest(
    principal=100000,      # Developer knows this
    rate=0.05,             # Developer knows this  
    time=10,               # Developer knows this
    n=1,                   # Developer knows this
    expected=150000        # LLM claimed this
)
# Result: is_correct=False, calculated=162889.46

# ❌ WRONG: Letting LLM provide the formula
# If LLM says "use rate=0.03" and you trust it, QWED can't help
```

### SQL engine

```python theme={null}
from qwed_new.core.sql_verifier import SQLVerifier

verifier = SQLVerifier()

# ✅ CORRECT: Developer defines allowed tables
result = verifier.verify_query(
    query="SELECT * FROM users",           # LLM generated
    allowed_tables=["users", "orders"],    # Developer defines
    allowed_columns=["id", "name", "email"] # Developer defines
)

# ❌ WRONG: Letting LLM define what tables are allowed
```

### Code engine

```python theme={null}
from qwed_new.core.code_verifier import CodeSecurityVerifier

verifier = CodeSecurityVerifier()

# ✅ CORRECT: Developer defines forbidden patterns
result = verifier.analyze_code(
    code="eval(user_input)",  # LLM generated
    # Verifier has built-in dangerous pattern detection
)
# Result: UNSAFE - eval detected

# The patterns (eval, exec, __import__) are defined by QWED, not LLM
```

### Logic engine

```python theme={null}
from qwed_new.core.logic_verifier import LogicVerifier

verifier = LogicVerifier()

# ✅ CORRECT: Developer provides premises
result = verifier.verify_conclusion(
    premises=["All humans are mortal", "Socrates is human"],  # Developer
    conclusion="Socrates is mortal"  # LLM claimed
)

# ❌ WRONG: Letting LLM provide premises
# If LLM says "All humans are immortal", QWED will verify wrong logic
```

***

## When QWED does not work

| Use Case                                 | Why It Fails                       |
| ---------------------------------------- | ---------------------------------- |
| "Verify this essay is factually correct" | No ground truth to compare against |
| "Check if this code does what I want"    | "What you want" is ambiguous       |
| "Validate this creative writing"         | No deterministic correctness       |
| "Verify this translation is accurate"    | Requires semantic understanding    |

***

## When QWED works best

| Use Case               | Why It Works                               |
| ---------------------- | ------------------------------------------ |
| Financial calculations | Formula is known, answer must match        |
| SQL query validation   | Schema is known, query must follow rules   |
| Code security scanning | Dangerous patterns are predefined          |
| Logic proofs           | Premises are given, conclusion must follow |
| Statistical claims     | Data is known, statistics must be correct  |

***

## Summary

| Question                           | Answer                                            |
| ---------------------------------- | ------------------------------------------------- |
| Does QWED use LLMs to verify LLMs? | **No.** Uses SymPy, Z3, AST, SQLGlot.             |
| Can QWED verify any LLM output?    | **No.** Only structured, domain-specific outputs. |
| Who provides the ground truth?     | **You, the developer.**                           |
| What if the spec is wrong?         | **Output will be wrong.** Same as any software.   |

***

*"Math verifies the proof, not the premise. If you hallucinate the constraints, you've verified a hallucination."*

— Valid criticism we agree with. That's why specs come from you, not the LLM.
