Skip to main content

Overview

Introduced in v5.2.0. The DiagnosticResult model is an additive contract — no existing engine return types are changed. Engines migrate incrementally.
QWED verification engines historically returned ad-hoc Dict[str, Any] results with no consistent structure. Three incompatible VerificationResult dataclasses existed. Some engines returned (bool, str) tuples. Verification diagnostics were not separable by audience — agents saw internal detection patterns, developers couldn’t reliably find expected vs actual values, and auditors had no proof artifact references. v5.2.0 introduces a unified DiagnosticResult type with three disclosure layers, each targeted at a specific audience.

The 3-layer model

Layer 1 — Agent-Safe Diagnostics

Field: agent_message: str Agent/model-facing summary. Allows agents to correct failures without exposing verification internals. Allowed:
  • “Missing required field: customer_id”
  • “Verification failed — claim not supported”
  • “Could not deterministically verify”
Forbidden:
  • Detection signatures
  • Rule IDs
  • Internal regex patterns
  • Prompt injection indicators
  • Security bypass guidance
  • Verification implementation details

Layer 2 — Developer Diagnostics

Field: developer_fields: dict Application-developer-facing structured evidence. Includes constraint_id, expected/actual values, advisory_checks, methods_used, and engine-specific evidence.
developer_fields = {
    "constraint_id": "math_verifier.irr_non_convergent",
    "iterations": 100,
    "final_npv": "0.7",
    "converged": False,
    "advisory_checks": [
        {
            "name": "llm_fallback",
            "advisory_only": True,
            "constraint_id": "fact_verifier.llm_advisory_only",
            "details": {"llm_verdict": "SUPPORTED", "llm_confidence": 0.65},
        },
    ],
}

Layer 3 — Proof Diagnostics

Field: proof_ref: Optional[str] Cryptographic hash (sha256:...) of retained proof artifact. Present only when status == VERIFIED and proof was established. None for UNVERIFIABLE / BLOCKED. This is the authority bit. Downstream gates use a mechanical rule:
if result.proof_ref is not None:
    # Authoritative — admissible for control flow
    admit()
else:
    # Non-authoritative — reject for control flow
    block()

Status taxonomy

Three states only — no proliferation.
StatusMeaningproof_refControl flow
VERIFIEDClaim deterministically provenRequired (non-empty)May admit
UNVERIFIABLEClaim could not be provenNoneMust reject
BLOCKEDVerification could not be attemptedNoneMust reject
Richer distinctions (ambiguity, insufficient evidence, non-convergence, provider drift) live in developer_fields.constraint_id, not in status values. This keeps the taxonomy small while preserving diagnostic richness.
No HEURISTIC, AMBIGUOUS, or CORRECTION_NEEDED statuses. Ambiguity IS unverifiability — the distinction is structured in constraint_id, not in the status string.

Key invariants

VERIFIED requires proof

__post_init__ raises ValueError if status == VERIFIED and proof_ref is None or empty. “VERIFIED without proof” is impossible to construct — not a caller convention, a type-level invariant.
# This raises ValueError
DiagnosticResult(
    status=DiagnosticStatus.VERIFIED,
    agent_message="ok",
    developer_fields={},
    proof_ref=None,  # ← impossible
)

Non-VERIFIED rejects proof

The inverse is also enforced — UNVERIFIABLE and BLOCKED must have proof_ref = None. A non-pass state with a proof hash is a contract violation.

Frozen dataclasses

Both DiagnosticResult and AdvisoryCheck are frozen=True. Post-construction mutation of proof_ref or status is blocked — prevents bypassing the authority contract.
r = DiagnosticResult.unverifiable("no", {})
r.proof_ref = "sha256:fake"  # ← raises FrozenInstanceError

Advisory checks never influence verdicts

AdvisoryCheck represents non-proof-bearing analysis (LLM fallback, NLI entailment, VLM interpretation, heuristic consistency checks). It populates developer_fields.advisory_checks with advisory_only=True enforced via __post_init__. Advisory checks never set status or proof_ref. This structurally enforces the constraint: diagnostics must never originate from model reasoning, confidence, or self-assessment.

Usage

Constructing results

from src.qwed_new.core.diagnostics import DiagnosticResult, DiagnosticStatus, AdvisoryCheck

# VERIFIED with proof
result = DiagnosticResult.verified(
    agent_message="Claim verified — unique mode match",
    developer_fields={
        "statistic": "mode",
        "calculated_value": 1,
        "claimed_value": 1,
        "modes": [1],
        "modes_count": 1,
        "constraint_id": "math_verifier.mode_unique_match",
    },
    evidence={"calculated": 1, "claimed": 1, "modes": [1]},
)
# result.proof_ref = "sha256:abc123..."
# result.is_authoritative = True

# UNVERIFIABLE — cannot prove
result = DiagnosticResult.unverifiable(
    agent_message="Statistical claim inconclusive — dataset has multiple modes",
    developer_fields={
        "modes": [1, 2],
        "modes_count": 2,
        "tie_detected": True,
        "constraint_id": "math_verifier.mode_ambiguous_tie",
    },
)
# result.proof_ref = None
# result.is_authoritative = False

# BLOCKED — cannot attempt
result = DiagnosticResult.blocked(
    agent_message="Logic verification blocked — variable declarations missing",
    developer_fields={
        "missing_declarations": ["x"],
        "constraint_id": "logic_verifier.explicit_declarations_required",
    },
)
# result.proof_ref = None

Advisory checks

# LLM fallback as advisory — never verdict-deciding
result = DiagnosticResult.unverifiable(
    agent_message="Claim could not be deterministically verified",
    developer_fields={
        "deterministic_verdict": "INSUFFICIENT_EVIDENCE",
        "advisory_checks": [
            AdvisoryCheck(
                name="llm_fallback",
                advisory_only=True,
                constraint_id="fact_verifier.llm_advisory_only",
                details={"llm_verdict": "SUPPORTED", "llm_confidence": 0.65},
            ),
        ],
    },
)

Downstream gate

def release_gate(result: DiagnosticResult) -> bool:
    """Mechanical authority check — proof_ref presence decides."""
    if not result.is_authoritative:
        log_block(result.constraint_id, result.agent_message)
        return False
    return True

Serialization

# to_dict — JSON-safe, AdvisoryCheck instances serialized
d = result.to_dict()
# {"status": "VERIFIED", "agent_message": "...", "developer_fields": {...}, "proof_ref": "sha256:...", "is_authoritative": True}

# from_dict — deserialize with validation
result = DiagnosticResult.from_dict(d)

Migrating legacy engines

from_legacy_dict() converts ad-hoc engine dicts to DiagnosticResult for fail-closed states:
# Legacy engine returns dict
legacy = {"is_correct": False, "status": "CORRECTION_NEEDED", "calculated_value": 3}

# Migrate
result = DiagnosticResult.from_legacy_dict(legacy, engine="math")
# DiagnosticResult(status=UNVERIFIABLE, proof_ref=None, ...)
from_legacy_dict raises for legacy VERIFIED results — proof artifacts were discarded by pre-v5.2.0 engines, so backfilling is impossible. Use DiagnosticResult.verified() with explicit evidence for true verified results.

API reference

DiagnosticStatus

class DiagnosticStatus(str, Enum):
    VERIFIED = "VERIFIED"
    UNVERIFIABLE = "UNVERIFIABLE"
    BLOCKED = "BLOCKED"

DiagnosticResult

FieldTypeDescription
statusDiagnosticStatusTri-state verdict
agent_messagestrLayer 1 — agent-safe summary
developer_fieldsdictLayer 2 — structured evidence
proof_refOptional[str]Layer 3 — sha256 hash of proof artifact
PropertyTypeDescription
is_verifiedboolTrue only when status is VERIFIED
is_authoritativeboolTrue when proof_ref is not None (authority bit)
is_fail_closedboolTrue when status is UNVERIFIABLE or BLOCKED
constraint_idOptional[str]Primary constraint identifier from developer_fields
advisory_checksList[AdvisoryCheck]Deserialized advisory checks
MethodDescription
to_dict()Serialize to JSON-safe dict
from_dict(data)Deserialize from dict
verified(agent_message, developer_fields, evidence)Construct VERIFIED with computed proof_ref
unverifiable(agent_message, developer_fields)Construct UNVERIFIABLE
blocked(agent_message, developer_fields)Construct BLOCKED
from_legacy_dict(data, engine)Migrate ad-hoc engine dict

AdvisoryCheck

FieldTypeDescription
namestrCheck name (e.g. “llm_fallback”)
advisory_onlyboolAlways True — structurally enforced
constraint_idOptional[str]Constraint identifier
detailsdictCheck-specific details

compute_proof_ref(evidence)

def compute_proof_ref(evidence: Dict[str, Any]) -> str:
    """Deterministic sha256 hash of JSON-serialized evidence."""
    # Returns "sha256:abcdef..."
Evidence must be JSON-serializable. Non-serializable values raise ValueError (fail-closed) — callers must pre-convert.

Constraints

Non-negotiable constraints (per #204):
  1. Diagnostics are NOT explainability — no confidence scores, no chain-of-thought, no model reasoning
  2. All diagnostic fields must originate from verification results, constraints, rule evaluation, schema validation, or proof systems
  3. Agent-safe diagnostics must never expose detection logic, rule IDs, regex patterns, or security bypass guidance
  4. Existing fail-closed behavior must not be weakened

Migration path

The DiagnosticResult contract is additive. Existing engines continue to work with their ad-hoc return types. Migration is incremental:
  1. v5.2.0 (this release) — contract established, 83 tests
  2. Engine conformance — each engine adopts DiagnosticResult in a separate PR
  3. Full migration — ad-hoc dicts and VerificationResult dataclasses replaced

Engines being migrated

EngineIssueDiagnostic focus
MathVerifier (mode)#129Layer 2 + 3
MathVerifier (eigenvalues)#130Layer 2 + 3
MathVerifier (IRR)#131Layer 3
FactVerifier (LLM fallback)#133Layer 1 + 2 + 3
ImageVerifier (VLM fallback)#134Layer 1 + 3
LogicVerifier (symbol inference)#162Layer 2 + 3
GraphFactVerifier (NLI)#163Layer 2 + 3
ReasoningVerifier (no proof)#164Layer 3
FactVerifier (provider drift)#190Layer 3
SecureCodeExecutor#205Layer 1 + 2

Attestations

Cryptographic proof artifacts and JWT signing for verification results.

Architecture overview

High-level QWED architecture and verification lifecycle.

Determinism guarantee

How QWED enforces deterministic verification outcomes.

Compliance

Audit trails, SOC 2, and GDPR compliance documentation.