The Math Engine is QWED’s core verification engine. It uses SymPy for symbolic computation to provide 100% accurate verification of mathematical claims.
Capabilities
| Category | Examples | Accuracy |
|---|---|---|
| Arithmetic | 2+2=4, 15*3=45 | 100% |
| Algebra | x^2 - 1 = (x-1)(x+1) | 100% |
| Calculus | Derivatives, Integrals, Limits | 100% |
| Trigonometry | sin(π/2) = 1, cos(0) = 1 | 100% |
| Logarithms | log(e) = 1, ln(e^x) = x | 100% |
| Financial | Compound interest, NPV, IRR | 100% |
| Statistics | Mean, std dev, percentiles | 100% |
Quick Start
from qwed_sdk import QWEDClient
client = QWEDClient(api_key="your_key")
# Verify a math claim
result = client.verify_math("15% of 200 is 30")
print(result.verified) # True
print(result.status) # "VERIFIED"
Core Operations
1. Expression Evaluation
Verify that an expression equals a value:
# Simple arithmetic
result = client.verify_math("2 * (5 + 10) = 30")
# ✓ Verified
# Complex expression
result = client.verify_math("sqrt(16) + 3^2 = 13")
# ✓ Verified
# Percentage
result = client.verify_math("15% of 200 = 30")
# ✓ Verified
2. Identity Verification
Check if two expressions are mathematically equivalent:
# Algebraic identity - TRUE
result = client.verify_math("(a+b)^2 = a^2 + 2*a*b + b^2")
# ✓ Verified: Algebraic identity proven
# Algebraic identity - FALSE
result = client.verify_math("(a+b)^2 = a^2 + b^2")
# ✗ Not Verified: Missing 2ab term
# Trig identity
result = client.verify_math("sin(x)^2 + cos(x)^2 = 1")
# ✓ Verified: Pythagorean identity
Identity verification uses symbolic simplification first. If SymPy proves the identity algebraically, the result is VERIFIED. When symbolic simplification is inconclusive, the engine samples five test points as a fallback. Only points that can be successfully evaluated count toward agreement — domain-restricted expressions (e.g., log(x) at x = -1) are skipped rather than causing a false negative.
Numerical sampling cannot prove equivalence. If all sample points agree but no formal proof was established, the engine now fails closed — returning BLOCKED with is_equivalent: false, method: "numerical_sampling_rejected", and confidence: 0.0. Two expressions can match at fixed points without being algebraically identical, so sampling-only agreement is rejected outright. Treat BLOCKED results as unverified.
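The fail-closed decision described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not QWED's implementation: the symbolic step is represented by a boolean flag, the sample points are arbitrary, and the `NOT_VERIFIED` status and the `"symbolic"` and `"numerical_counterexample"` method labels are hypothetical names. Only the BLOCKED payload mirrors the fields named above.

```python
import math

# Hypothetical sample points; the engine's actual five points are not documented here.
SAMPLE_POINTS = (-1.0, 0.5, 1.0, 2.0, 3.0)


def sample_agreement(lhs, rhs, points=SAMPLE_POINTS):
    """Return (evaluated_count, all_agree) over points where both sides evaluate."""
    evaluated = 0
    for x in points:
        try:
            left, right = lhs(x), rhs(x)
        except (ValueError, ZeroDivisionError, OverflowError):
            continue  # outside the domain: skip instead of counting a mismatch
        evaluated += 1
        if not math.isclose(left, right, rel_tol=1e-9, abs_tol=1e-12):
            return evaluated, False
    return evaluated, evaluated > 0


def decide(symbolically_proven, lhs, rhs):
    """Combine a symbolic verdict (assumed given) with the sampling fallback."""
    if symbolically_proven:
        return {"status": "VERIFIED", "is_equivalent": True, "method": "symbolic"}
    _, agree = sample_agreement(lhs, rhs)
    if agree:
        # Agreement at sample points is not a proof, so fail closed.
        return {
            "status": "BLOCKED",
            "is_equivalent": False,
            "method": "numerical_sampling_rejected",
            "confidence": 0.0,
        }
    return {
        "status": "NOT_VERIFIED",
        "is_equivalent": False,
        "method": "numerical_counterexample",
    }


# log(x^2) and 2*log(x) agree wherever both evaluate, but 2*log(x) is undefined at x = -1.
result = decide(False, lambda x: math.log(x ** 2), lambda x: 2 * math.log(x))
print(result["status"])
```

The example pair shows why sampling-only agreement is rejected: the point x = -1 is skipped because `2*log(x)` cannot be evaluated there, the remaining points agree, and yet the two expressions are not equivalent over the reals.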
3. Derivatives
Verify calculus derivatives:
result = client.verify_derivative(
    expression="x^3 + 2*x^2",
    variable="x",
    expected="3*x^2 + 4*x"
)
# ✓ Verified
# Higher-order derivatives
result = client.verify_derivative(
    expression="x^4",
    variable="x",
    expected="12*x^2",
    order=2  # Second derivative
)
# ✓ Verified
4. Integrals
Verify indefinite and definite integrals:
# Indefinite integral
result = client.verify_integral(
    expression="2*x",
    variable="x",
    expected="x^2"  # + C implied
)
# ✓ Verified
# Definite integral
result = client.verify_integral(
    expression="x^2",
    variable="x",
    lower=0,
    upper=1,
    expected="1/3"
)
# ✓ Verified
5. Limits
result = client.verify_limit(
    expression="sin(x)/x",
    variable="x",
    point=0,
    expected=1
)
# ✓ Verified: lim(x→0) sin(x)/x = 1
Financial Calculations
Compound Interest
result = client.verify_compound_interest(
    principal=1000,
    rate=0.05,  # 5% annual
    time=10,  # years
    n=12,  # monthly compounding
    expected=1647.01
)
# ✓ Verified
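For reference, the expected value above follows from the standard compound interest formula A = P(1 + r/n)^(n·t). The sketch below reproduces the arithmetic with Python's `decimal` module; the function name `compound_amount` is a hypothetical helper for illustration, not part of the QWED SDK.

```python
from decimal import Decimal, ROUND_HALF_UP


def compound_amount(principal, rate, years, periods_per_year):
    """Future value A = P * (1 + r/n) ** (n * t), rounded to cents."""
    p = Decimal(str(principal))
    r = Decimal(str(rate))
    growth = (Decimal(1) + r / periods_per_year) ** (periods_per_year * years)
    return (p * growth).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


amount = compound_amount(1000, 0.05, 10, 12)
print(amount)  # 1647.01
```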
Net Present Value (NPV)
result = client.verify_npv(
    rate=0.10,
    cash_flows=[-1000, 300, 400, 500, 600],
    expected=388.77
)
# ✓ Verified
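As a cross-check, NPV can be computed by hand as the sum of each cash flow discounted by (1 + rate)^t. The sketch below assumes the first cash flow occurs at t = 0 (the initial outlay is not discounted); `npv` is a hypothetical helper, not an SDK function.

```python
def npv(rate, cash_flows):
    """Net present value with the first cash flow at t = 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


value = npv(0.10, [-1000, 300, 400, 500, 600])
print(round(value, 2))  # 388.77
```

Under this convention the four inflows discount to roughly 272.73, 330.58, 375.66, and 409.81, which sum to 1388.77 before subtracting the 1000 outlay.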
Internal Rate of Return (IRR)
result = client.verify_irr(
    cash_flows=[-1000, 400, 400, 400],
    expected=0.0970  # ~9.70%
)
# ✓ Verified
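The IRR is the rate at which the NPV of the cash flows equals zero. One simple way to approximate it independently is bisection; the sketch below is an illustration of that technique, not the method QWED uses, and the helper names and search bracket are assumptions.

```python
def npv(rate, cash_flows):
    """Net present value with the first cash flow at t = 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=1.0, tol=1e-10):
    """Find the rate where NPV crosses zero by bisection on [low, high]."""
    f_low = npv(low, cash_flows)
    if f_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the search bracket")
    while high - low > tol:
        mid = (low + high) / 2
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid  # root lies in the lower half
        else:
            low, f_low = mid, f_mid  # root lies in the upper half
    return (low + high) / 2


flows = [-1000, 400, 400, 400]
rate = irr(flows)
print(f"{rate:.4f}")  # 0.0970
```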
Trust boundary
When you verify a natural language math query through the /verify/natural_language endpoint, the response includes a trust_boundary object that describes exactly what the pipeline proved and what it did not.
{
  "status": "INCONCLUSIVE",
  "final_answer": 30.0,
  "trust_boundary": {
    "query_interpretation_source": "llm_translation",
    "query_semantics_verified": false,
    "verification_scope": "translated_expression_only",
    "deterministic_expression_evaluation": true,
    "formal_proof": false,
    "translation_claim_self_consistent": true,
    "provider_used": "openai_compat",
    "overall_status": "INCONCLUSIVE"
  }
}
| Field | Meaning |
|---|---|
| query_interpretation_source | How the user query was converted to an expression (always llm_translation) |
| query_semantics_verified | Whether the translation accurately represents the user’s intent (false; this is not formally provable) |
| verification_scope | What was actually verified (translated_expression_only) |
| deterministic_expression_evaluation | Whether the expression itself was evaluated deterministically |
| formal_proof | Whether a formal proof was established |
| translation_claim_self_consistent | Whether the LLM’s claimed answer matches the computed answer |
| overall_status | The response-level status reflecting the trust boundary |
The overall status for natural language math queries is INCONCLUSIVE because, while the expression evaluation is deterministic, QWED cannot verify that the LLM correctly interpreted the user’s intent. The trust_boundary gives you the information to decide whether the result is sufficient for your use case.
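One way to act on these fields is to separate "the expression was evaluated reliably" from "the user's intent was confirmed". The sketch below parses a response shaped like the example above and derives both flags. The function name `summarize_trust` and the review policy are hypothetical; adapt them to your own risk tolerance.

```python
import json

# Response body shaped like the documented example.
RESPONSE_JSON = """
{
  "status": "INCONCLUSIVE",
  "final_answer": 30.0,
  "trust_boundary": {
    "query_interpretation_source": "llm_translation",
    "query_semantics_verified": false,
    "verification_scope": "translated_expression_only",
    "deterministic_expression_evaluation": true,
    "formal_proof": false,
    "translation_claim_self_consistent": true,
    "provider_used": "openai_compat",
    "overall_status": "INCONCLUSIVE"
  }
}
"""


def summarize_trust(response):
    """Split the trust boundary into evaluation quality and intent coverage."""
    boundary = response.get("trust_boundary", {})
    evaluation_ok = bool(
        boundary.get("deterministic_expression_evaluation")
        and boundary.get("translation_claim_self_consistent")
    )
    intent_verified = bool(boundary.get("query_semantics_verified"))
    return {
        "evaluation_ok": evaluation_ok,
        "intent_verified": intent_verified,
        # Hypothetical policy: anything short of both flags needs a human look.
        "needs_review": not (evaluation_ok and intent_verified),
    }


summary = summarize_trust(json.loads(RESPONSE_JSON))
print(summary)
```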
Error handling
When verification fails, QWED provides detailed error information:
result = client.verify_math("15% of 200 = 40")
if not result.verified:
    print(result.error)
    # "Calculation incorrect: 15% of 200 = 30, not 40"
    print(result.expected)
    # 30
    print(result.actual)
    # 40
Exact SymPy arithmetic
When SymPy is available, the math engine evaluates expressions using SymPy-native types (sympy.Integer, sympy.Float) instead of Python built-in int and float. This prevents floating-point drift during intermediate computation and ensures that comparisons between LLM answers and verified results use symbolic simplification rather than string matching alone.
Decimal precision
The math engine accepts Decimal values for exact arithmetic, which is especially useful for financial calculations:
from decimal import Decimal
result = client.verify_math(
    expression="0.1 + 0.2",
    expected_value=Decimal("0.3")  # Exact comparison, no float drift
)
# ✓ Verified
When use_decimal=True (the default), the engine uses Decimal internally regardless of whether you pass a float or Decimal.
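The drift that Decimal avoids is easy to reproduce with plain Python, independent of QWED. The snippet below is a standalone illustration using only the standard library.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_equal = (0.1 + 0.2 == 0.3)

# Decimals built from strings hold the exact decimal values.
decimal_equal = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Building a Decimal directly from a float carries the binary error along;
# going through str() recovers the short decimal form.
from_float_equal = (Decimal(0.1) == Decimal("0.1"))
from_str_equal = (Decimal(str(0.1)) == Decimal("0.1"))

print(float_equal, decimal_equal, from_float_equal, from_str_equal)
# False True False True
```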
Tolerance settings
For floating-point comparisons, you can specify a tolerance:
result = client.verify_math(
    "sqrt(2) = 1.41421",
    tolerance=0.00001  # 5 decimal places
)
# ✓ Verified within tolerance
Tolerance bounding
To prevent inflated tolerances from masking incorrect results, the math engine enforces a deterministic upper bound on the tolerance parameter. The maximum allowed tolerance is computed as a function of the result’s magnitude:
max_tolerance = max(0.01, abs(calculated_value) * 0.01)
If the requested tolerance exceeds this bound, the verification is rejected with a BLOCKED status instead of returning a potentially misleading VERIFIED result. This applies to both decimal and float precision modes.
# Blocked — tolerance far exceeds the computed bound
result = client.verify_math("1 + 1", expected_value=999, tolerance=1000)
# result["status"] == "BLOCKED"
# result["error"] == "Tolerance exceeds deterministic verification bound"
# result["max_allowed_tolerance"] == "0.02000000"
# Allowed — tolerance is within bound for a large result
result = client.verify_math(
    "10000 * (1 + 5/100)",
    expected_value=10540,
    tolerance=50,
)
# result["status"] == "VERIFIED"
The engine also rejects invalid tolerance values (negative numbers, NaN, Infinity, or non-numeric strings) with a BLOCKED status and an "Invalid tolerance" error message.
| Tolerance input | Behavior |
|---|---|
| Within computed bound | Normal verification proceeds |
| Exceeds computed bound | BLOCKED with max_allowed_tolerance in response |
| Negative, NaN, or Infinity | BLOCKED with "Invalid tolerance" error |
| Non-numeric string | BLOCKED with "Invalid tolerance" error |
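The bounding rule and the validation cases in the table can be sketched as follows. This is an illustrative reimplementation of the documented formula, not the engine's code: the helper names and the `"OK"` status for an accepted tolerance are hypothetical, while the error strings and the eight-decimal formatting follow the examples above.

```python
from decimal import Decimal, InvalidOperation


def max_allowed_tolerance(calculated_value):
    """Documented bound: max(0.01, abs(calculated_value) * 0.01)."""
    value = Decimal(str(calculated_value))
    return max(Decimal("0.01"), abs(value) * Decimal("0.01"))


def check_tolerance(calculated_value, tolerance):
    """Validate a requested tolerance before any comparison takes place."""
    try:
        tol = Decimal(str(tolerance))
    except InvalidOperation:  # non-numeric strings
        return {"status": "BLOCKED", "error": "Invalid tolerance"}
    if not tol.is_finite() or tol < 0:  # NaN, Infinity, negatives
        return {"status": "BLOCKED", "error": "Invalid tolerance"}
    bound = max_allowed_tolerance(calculated_value)
    if tol > bound:
        return {
            "status": "BLOCKED",
            "error": "Tolerance exceeds deterministic verification bound",
            "max_allowed_tolerance": format(bound, ".8f"),
        }
    return {"status": "OK", "tolerance": tol}  # "OK" is a placeholder status


print(check_tolerance(2, 1000)["max_allowed_tolerance"])  # 0.02000000
print(check_tolerance(10500, 50)["status"])  # OK
```

For the second call the computed value is 10500, so the bound is 105 and a tolerance of 50 is accepted.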
Trust boundary
When math verification runs through the natural language pipeline (POST /verify/natural_language), the response now includes a trust_boundary object. This object describes exactly what the pipeline proved and what it did not, separating deterministic expression evaluation from the non-deterministic LLM translation step.
{
  "trust_boundary": {
    "query_interpretation_source": "llm_translation",
    "query_semantics_verified": false,
    "verification_scope": "translated_expression_only",
    "deterministic_expression_evaluation": true,
    "formal_proof": false,
    "translation_claim_self_consistent": true,
    "provider_used": "openai",
    "overall_status": "INCONCLUSIVE"
  }
}
| Field | Type | Description |
|---|---|---|
| query_interpretation_source | string | Always "llm_translation": the query was interpreted by an LLM |
| query_semantics_verified | boolean | Always false: QWED cannot verify that the LLM correctly interpreted the user’s intent |
| verification_scope | string | Always "translated_expression_only": only the translated expression was verified |
| deterministic_expression_evaluation | boolean | true when the inner engine status was VERIFIED or CORRECTION_NEEDED |
| formal_proof | boolean | Always false: SymPy evaluation is deterministic but not a formal proof of the original query |
| translation_claim_self_consistent | boolean | Whether the translated expression matched its own claimed answer |
| provider_used | string | The LLM provider used for translation |
| overall_status | string | The top-level response status after trust boundary alignment |
Because the LLM translation step is non-deterministic, the natural language pipeline now returns INCONCLUSIVE instead of VERIFIED even when the underlying expression evaluation succeeds. This prevents over-representing a translated-query evaluation as a proven user-query verdict. Use the direct POST /verify/math endpoint if you need a fully deterministic result without the LLM translation layer.
Ambiguous expressions
Expressions with implicit multiplication after division are ambiguous — for example, 1/2(3+1) could mean (1/2)*(3+1) or 1/(2*(3+1)). Rather than guessing, the math engine fails closed and returns BLOCKED:
result = client.verify_math("1/2(3+1)")
# result["is_valid"] == False
# result["status"] == "BLOCKED"
# result["warning"] == "ambiguous"
To resolve this, rewrite the expression with explicit parentheses or a * operator:
# Explicit grouping — no ambiguity
result = client.verify_math("(1/2)*(3+1)")
# ✓ Verified: 2.0
result = client.verify_math("1/(2*(3+1))")
# ✓ Verified: 0.125
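If you want to catch this pattern before sending an expression, a small pre-check can flag a division whose operand is immediately followed by an opening parenthesis. The sketch below is a heuristic under stated assumptions, not the engine's detection logic: the regular expression, the function name, and the list of known function names are all hypothetical.

```python
import re

# Names treated as function calls rather than implicit multiplication (assumed list).
KNOWN_FUNCTIONS = {"sqrt", "sin", "cos", "tan", "log", "ln", "exp"}

# A "/" followed by a number or identifier that runs straight into "(".
_DIVISION_THEN_PAREN = re.compile(r"/\s*([A-Za-z_]\w*|\d+(?:\.\d+)?)\s*\(")


def has_ambiguous_implicit_multiplication(expression):
    """Return True for patterns like 1/2(3+1), where grouping is unclear."""
    for match in _DIVISION_THEN_PAREN.finditer(expression):
        if match.group(1) not in KNOWN_FUNCTIONS:
            return True
    return False


print(has_ambiguous_implicit_multiplication("1/2(3+1)"))     # True
print(has_ambiguous_implicit_multiplication("(1/2)*(3+1)"))  # False
print(has_ambiguous_implicit_multiplication("1/(2*(3+1))"))  # False
```

A check like this only covers the simplest form of the ambiguity; the engine's BLOCKED response remains the authoritative signal.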
Edge cases
| Scenario | Behavior |
|---|---|
| Division by zero | Returns error, not verified |
| Undefined expressions | Returns “UNDEFINED” status |
| Complex numbers | Fully supported |
| Very large numbers | Uses arbitrary precision |
| Symbolic variables | Verified algebraically |
| Oversized tolerance | Returns “BLOCKED” status with details |
| Invalid tolerance | Returns “BLOCKED” with error message |
| Sampling-only identity match | Returns “BLOCKED” — no formal proof established |
| Ambiguous implicit multiplication | Returns “BLOCKED” — rewrite with explicit operators |
Performance
| Operation | Avg Latency | Throughput |
|---|---|---|
| Simple arithmetic | 1.5ms | 690/sec |
| Complex expression | 5ms | 200/sec |
| Identity proof | 10ms | 100/sec |