Skip to main content

Symbolic Execution Limits in QWED

TL;DR: CrossHair symbolic execution has real-world limitations. QWED addresses these with bounded model checking, depth limits, and graceful fallbacks.

The Path Explosion Problem

Symbolic execution explores all possible execution paths. This works great for small code, but real-world code has:

ChallengeExampleImpact
Deep loopsfor i in range(1000)1000+ paths to explore
Recursionfibonacci(n)Exponential path growth
Complex conditionalsNested if/else chains2^n paths
Data structuresDict/list operationsUnbounded state space

When CrossHair Works

Code TypeWorks Well?Example
Pure functions✅ Yesdef add(a, b): return a + b
Simple validation✅ Yesdef is_positive(x): return x > 0
Bounded loops✅ Yesfor i in range(10)
LLM-generated utilities✅ YesMost generated code is simple

When CrossHair Fails

Code TypeWorks?Why
Deep recursion (n > 50)❌ NoPath explosion
Unbounded loops❌ NoInfinite paths
External I/O❌ NoNon-deterministic
Complex frameworks❌ NoToo much state

QWED's Bounded Model Checking Solution

We implemented depth limits to prevent path explosion:

from qwed_new.core.symbolic_verifier import SymbolicVerifier

verifier = SymbolicVerifier(
max_loop_iterations=100, # Bound loops
max_recursion_depth=50, # Bound recursion
timeout_seconds=30 # Hard timeout
)

result = verifier.verify_code(code)

if result["status"] == "TIMEOUT":
# Fallback to static analysis
print("Symbolic execution timed out, using AST analysis")

Configuration Guide

Conservative (Fast, Less Coverage)

config = {
"max_loop_iterations": 10,
"max_recursion_depth": 10,
"timeout_seconds": 5
}

Balanced (Default)

config = {
"max_loop_iterations": 100,
"max_recursion_depth": 50,
"timeout_seconds": 30
}

Thorough (Slow, More Coverage)

config = {
"max_loop_iterations": 1000,
"max_recursion_depth": 100,
"timeout_seconds": 300
}

Fallback Strategy

When symbolic execution fails, QWED falls back to:

1. Symbolic Execution (CrossHair)
↓ timeout/failure
2. Static Analysis (AST)
↓ insufficient
3. Type Checking (mypy patterns)
↓ still need more
4. Manual Review Flag

Honest Benchmarks

We tested CrossHair on different code types:

Code TypeLinesLoopsResultTime
Simple math50✅ Verified0.2s
String utils151✅ Verified1.2s
Data validation303✅ Verified5.8s
Complex parser10010⏱️ Timeout30s
Framework code500+Many❌ Not suitable-

Best Practices

DO Use Symbolic Execution For:

  • ✅ LLM-generated utility functions
  • ✅ Mathematical calculations
  • ✅ Validation logic
  • ✅ Simple transformations
  • ✅ Code with clear contracts

DON'T Use Symbolic Execution For:

  • ❌ Entire applications
  • ❌ Code with external dependencies
  • ❌ Deep recursion algorithms
  • ❌ Real-time systems
  • ❌ Code with I/O operations

API Reference

from qwed_new.core.symbolic_verifier import SymbolicVerifier

# Create verifier with custom limits
verifier = SymbolicVerifier(
max_loop_iterations=100,
max_recursion_depth=50,
timeout_seconds=30
)

# Check if code is verifiable before executing
budget = verifier.get_verification_budget(code)
if budget["estimated_time_seconds"] > 60:
print("Code too complex for symbolic execution")

# Verify with fallback
result = verifier.verify_code(
code=code,
fallback_to_ast=True # Use AST if symbolic fails
)

Summary

QuestionAnswer
Does symbolic execution work on all code?No. Limited by path explosion.
How does QWED handle this?Bounded model checking + fallbacks.
What's the typical code size limit?~50-100 lines of simple code.
When should I use it?LLM-generated utilities and validation.

"My guess would be that technique falls apart at depths required in real world coding environments."

— Reddit criticism. We agree. That's why we have bounds and fallbacks.