The Reasoning Engine validates LLM reasoning traces using chain-of-thought parsing and multi-provider verification.

Features

  • Chain-of-Thought Validation - Parse and verify reasoning steps
  • Result Caching - LRU + Redis for repeated queries
  • Multi-Provider Support - Anthropic, Azure OpenAI, Google Gemini, OpenAI
  • Semantic Fact Extraction - Identify verifiable claims

Usage

from qwed_sdk import QWEDClient

client = QWEDClient(api_key="qwed_...")

result = client.verify_reasoning(
    query="If all cats are mammals, and all mammals are animals, are all cats animals?",
    enable_caching=True
)

print(result.is_valid)     # True
print(result.confidence)   # 0.95
print(result.reasoning_trace)  # Step-by-step logic
print(result.cached)       # True/False

Multi-provider verification

result = client.verify_reasoning(
    query="Complex reasoning task...",
    providers=["anthropic", "openai", "gemini", "azure"],
    enable_cross_validation=True
)

print(result.provider_agreement)  # 4/4
print(result.per_provider_results)

Caching

The engine caches results to avoid redundant LLM calls:
query = "If all cats are mammals, and all mammals are animals, are all cats animals?"

# First call - hits LLM
result1 = client.verify_reasoning(query, enable_caching=True)
print(result1.cached)  # False

# Second call - from cache
result2 = client.verify_reasoning(query, enable_caching=True)
print(result2.cached)  # True (instant!)
Each ReasoningVerifier instance owns its own cache, so verifiers configured with different providers or modes do not share entries. Cache keys incorporate the query, primary formula, the provider list (in configured order), and the cross-validation flag, and cached results are returned as defensive copies so callers cannot mutate stored entries.

Cache key composition

A cache entry is only reused when every component below matches the current verification call. Changing any of these forces a fresh computation:
  • query (string) - The verification query text, byte-for-byte.
  • primary formula (string) - The expression of the primary task derived from the query.
  • providers (list[string]) - The configured provider list, in the order it was supplied. Provider precedence is part of cache identity: reordering providers (for example, swapping the primary) invalidates the cache, as do additions or removals.
  • enable_cross_validation (bool) - Whether cross-validation is requested. Toggling this flag invalidates cache reuse, because cross-validated and single-provider results are not interchangeable.
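
The key composition described above can be illustrated with a small sketch. It is not the engine's actual implementation: the function name make_cache_key, the JSON serialization, and the SHA-256 digest are all assumptions chosen for illustration. The only behavior taken from this page is that the four components, including provider order, determine cache identity.

```python
import hashlib
import json


def make_cache_key(query, primary_formula, providers, enable_cross_validation):
    """Hypothetical helper: derive a deterministic key from the four components."""
    payload = json.dumps(
        {
            "query": query,
            "primary_formula": primary_formula,
            # The list is serialized as given, so provider order affects the key.
            "providers": list(providers),
            "enable_cross_validation": bool(enable_cross_validation),
        },
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


base = make_cache_key("Is 2 + 2 equal to 4?", "2 + 2", ["anthropic", "openai"], True)
same = make_cache_key("Is 2 + 2 equal to 4?", "2 + 2", ["anthropic", "openai"], True)
reordered = make_cache_key("Is 2 + 2 equal to 4?", "2 + 2", ["openai", "anthropic"], True)
toggled = make_cache_key("Is 2 + 2 equal to 4?", "2 + 2", ["anthropic", "openai"], False)

print(base == same)       # True: identical components reuse the entry
print(base == reordered)  # False: swapping the primary changes the key
print(base == toggled)    # False: toggling cross-validation changes the key
```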

TTL enforcement

Cached entries are bound to their creation time and are evaluated against cache_ttl_seconds on every lookup. When an entry’s age exceeds the TTL, it is evicted and the verifier recomputes the result. Stale entries are never returned, even if they remain within the LRU window.
import time

from qwed_sdk import QWEDClient

# 5-second TTL for demonstration
client = QWEDClient(api_key="qwed_...", cache_ttl_seconds=5)

result1 = client.verify_reasoning(query="...", enable_caching=True)
print(result1.cached)  # False

# Within TTL - cached
result2 = client.verify_reasoning(query="...", enable_caching=True)
print(result2.cached)  # True

# After TTL expiry - recomputed
time.sleep(6)
result3 = client.verify_reasoning(query="...", enable_caching=True)
print(result3.cached)  # False
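
To make the lookup behavior concrete, here is a minimal sketch of an LRU cache that checks entry age on every read and returns defensive copies. The class name TTLCache, its parameters, and the injectable clock are hypothetical and are not part of qwed_sdk. The sketch only mirrors the behavior described on this page: expired entries are evicted rather than returned, and callers cannot mutate stored results.

```python
import copy
import time
from collections import OrderedDict


class TTLCache:
    """Hypothetical LRU cache with per-entry TTL and defensive copies."""

    def __init__(self, max_size=128, ttl_seconds=300.0, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (created_at, value)
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be shown without sleeping

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        created_at, value = entry
        if self._clock() - created_at > self._ttl:
            # Stale entry: evict it instead of returning it.
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return copy.deepcopy(value)     # caller gets a copy, not the stored object

    def put(self, key, value):
        self._entries[key] = (self._clock(), copy.deepcopy(value))
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # drop the least recently used entry


# A fake clock stands in for real time in this demonstration.
now = [0.0]
cache = TTLCache(max_size=2, ttl_seconds=5, clock=lambda: now[0])
cache.put("k", {"is_valid": True})

hit = cache.get("k")
hit["is_valid"] = False   # mutating the returned copy
fresh = cache.get("k")
print(fresh)              # {'is_valid': True}

now[0] = 6.0              # advance past the 5-second TTL
expired = cache.get("k")
print(expired)            # None
```

Because each TTLCache object holds its own dictionary, two instances never see each other's entries, which matches the per-instance isolation described for verifiers.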

Instance isolation

The cache is per-instance state, not shared across ReasoningVerifier objects. Two verifiers — even with identical configuration — start with independent caches and never read or write each other’s entries. This prevents stale results from one verification context replaying under a different one.

Fail-closed prerequisites

The engine fails closed when required reasoning evidence is missing. A verification result is only marked valid when both prerequisites below are satisfied.
reasoning trace (required)
A substantive reasoning trace must be produced by the primary provider. The verifier scans each numbered or bulleted trace entry and rejects entries that contain any of the following non-substantive markers (case-insensitive):
  • no llm provider
  • could not generate reasoning trace
  • no structured reasoning trace generated
  • failed to generate reasoning trace
  • n/a
  • unavailable
  • no reasoning
  • rate limit exceeded
Indented trace entries are still accepted as long as they begin with a digit or - after stripping leading whitespace. If no substantive entries remain, the result includes the issue Reasoning trace unavailable or non-substantive and is_valid is False.
distinct secondary provider (required when enable_cross_validation=True)
Cross-validation requires a secondary provider that is different from the primary. The primary provider is no longer reused as a fallback. If enable_cross_validation=True and no distinct secondary provider is configured, the result includes the issue Cross-validation requested but no distinct secondary provider is available and is_valid is False.
# Fails closed: only one provider configured but cross-validation requested
result = client.verify_reasoning(
    query="Complex reasoning task...",
    providers=["anthropic"],
    enable_cross_validation=True,
)

print(result.is_valid)  # False
print(result.issues)
# ["Cross-validation requested but no distinct secondary provider is available"]

Formula equivalence

When verifying whether two arithmetic formulas produce the same result, the engine uses a safe AST-based evaluator instead of Python’s eval(). The evaluator only allows basic arithmetic operators (+, -, *, /, **) and numeric literals — any other expression is rejected. This prevents code-injection risks while still supporting numeric fallback checks during reasoning validation.
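
A restricted evaluator of this kind can be sketched with Python's standard ast module. The code below is an illustration, not the engine's actual evaluator: the names safe_eval and formulas_equivalent, the tolerance value, and the decision to accept unary plus and minus (so that literals like -3 parse) are assumptions.

```python
import ast
import math
import operator

# Only the five arithmetic operators named above are allowed.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
# Assumption: unary +/- are accepted so that negative literals work.
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expression):
    """Evaluate a purely arithmetic expression; raise ValueError on anything else."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # type() check excludes bool and complex, which are not plain numbers here.
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


def formulas_equivalent(left, right, rel_tol=1e-9):
    """Return True only if both formulas evaluate safely to (nearly) the same number."""
    try:
        return math.isclose(safe_eval(left), safe_eval(right), rel_tol=rel_tol)
    except (ValueError, SyntaxError, ZeroDivisionError, OverflowError, TypeError):
        return False  # treat any rejected or failing formula as not equivalent


print(formulas_equivalent("10 + 5", "3 * 5"))                  # True
print(formulas_equivalent("2 ** 3", "16 / 2"))                 # True
print(formulas_equivalent("__import__('os').getcwd()", "1"))   # False
```

Names, calls, attribute access, and every other node type fall through to the ValueError branch, so nothing outside arithmetic is ever executed. A production evaluator would probably also bound exponent size, since ** on large integers can be very expensive.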

Fail-closed prerequisites in detail

The Reasoning Engine refuses to mark a result as valid when the evidence required to verify it is missing. An empty issues list is not proof of validity: if the engine cannot produce a substantive reasoning trace, or cannot actually run cross-validation when requested, the result is reported as invalid with an explanatory issue.

Required reasoning trace

Verification fails closed when no usable reasoning trace can be produced. This occurs when:
  • No LLM provider is available to generate the trace.
  • The provider returns an empty or placeholder response.
  • Trace generation raises an error or hits a provider rate limit.
  • The trace contains only non-substantive markers such as N/A, unavailable, no reasoning, rate limit exceeded, or messages indicating the trace could not be generated.
The parser accepts both flush-left and indented trace lines — any line whose stripped content begins with a digit (for example, 1. ...) or - is treated as a reasoning step. Lines that match a non-substantive marker are not counted as substantive reasoning, even if they are formatted as a numbered list. In any of these cases, result.is_valid is False and result.issues contains either Reasoning trace missing or Reasoning trace unavailable or non-substantive.
result = client.verify_reasoning(
    query="Alice has 10 apples and gets 5 more. How many apples does she have?",
    providers=["anthropic"],
)

if not result.is_valid:
    print(result.issues)
    # ["Reasoning trace unavailable or non-substantive"]
To resolve this, configure at least one reachable provider with valid credentials before calling verify_reasoning.
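
The trace screening described in this section can be sketched as follows. This is an illustration under stated assumptions, not the engine's parser: the helper names substantive_steps and check_trace are hypothetical, and the sketch reports only the Reasoning trace unavailable or non-substantive issue, because this page does not specify when Reasoning trace missing is used instead.

```python
# Markers listed on this page, matched case-insensitively as substrings.
NON_SUBSTANTIVE_MARKERS = (
    "no llm provider",
    "could not generate reasoning trace",
    "no structured reasoning trace generated",
    "failed to generate reasoning trace",
    "n/a",
    "unavailable",
    "no reasoning",
    "rate limit exceeded",
)


def substantive_steps(trace):
    """Hypothetical helper: return trace lines that count as real reasoning steps."""
    steps = []
    for line in trace.splitlines():
        stripped = line.strip()  # indented entries are accepted
        if not stripped:
            continue
        # A step must start with a digit ("1. ...") or a dash ("- ...").
        if not (stripped[0].isdigit() or stripped.startswith("-")):
            continue
        lowered = stripped.lower()
        if any(marker in lowered for marker in NON_SUBSTANTIVE_MARKERS):
            continue  # formatted like a step, but carries no reasoning
        steps.append(stripped)
    return steps


def check_trace(trace):
    """Return a list of issues; an empty list means the trace prerequisite holds."""
    if substantive_steps(trace):
        return []
    return ["Reasoning trace unavailable or non-substantive"]


good_trace = "1. Alice starts with 10 apples\n   2. She gets 5 more\n- Total is 15"
bad_trace = "1. N/A\n2. Rate limit exceeded"

print(check_trace(good_trace))  # []
print(check_trace(bad_trace))   # ['Reasoning trace unavailable or non-substantive']
```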

Distinct secondary provider for cross-validation

When enable_cross_validation=True, the engine requires a secondary provider that is distinct from the primary. The primary provider is no longer reused as a fallback secondary path, because verifying a model against itself does not constitute cross-validation. If only one provider is configured, cross-validation fails closed with the issue Cross-validation requested but no distinct secondary provider is available.
# Fails closed - only one provider configured
result = client.verify_reasoning(
    query="Complex reasoning task...",
    providers=["anthropic"],
    enable_cross_validation=True,
)
print(result.is_valid)  # False
print(result.issues)
# ["Cross-validation requested but no distinct secondary provider is available"]

# Passes prerequisite - two distinct providers
result = client.verify_reasoning(
    query="Complex reasoning task...",
    providers=["anthropic", "openai"],
    enable_cross_validation=True,
)
To run cross-validation, configure at least two distinct providers, or set enable_cross_validation=False to skip the secondary check.
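
The prerequisite check can be pictured with a short sketch. The function names pick_secondary and cross_validation_issues are hypothetical, and treating the first list entry as the primary provider is an assumption. The issue text is the one quoted on this page.

```python
def pick_secondary(providers):
    """Hypothetical helper: first provider that differs from the primary, else None."""
    if not providers:
        return None
    primary = providers[0]  # assumption: the first configured provider is the primary
    for candidate in providers[1:]:
        if candidate != primary:
            return candidate
    return None  # the primary is never reused as its own secondary


def cross_validation_issues(providers, enable_cross_validation):
    """Return the fail-closed issue when cross-validation cannot actually run."""
    if enable_cross_validation and pick_secondary(providers) is None:
        return ["Cross-validation requested but no distinct secondary provider is available"]
    return []


print(cross_validation_issues(["anthropic"], True))
# ['Cross-validation requested but no distinct secondary provider is available']
print(cross_validation_issues(["anthropic", "openai"], True))   # []
print(cross_validation_issues(["anthropic"], False))            # []
```

Note that listing the same provider twice does not satisfy the check in this sketch, since the comparison is against the primary rather than against list length.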

Chain-of-thought validation

cot_trace = """
Step 1: All cats are mammals (given)
Step 2: All mammals are animals (given)
Step 3: Therefore, all cats are animals (transitivity)
"""

result = client.validate_cot(
    trace=cot_trace,
    conclusion="All cats are animals"
)

print(result.valid_steps)    # [1, 2, 3]
print(result.invalid_steps)  # []
print(result.conclusion_valid)  # True