> ## Documentation Index
> Fetch the complete documentation index at: https://docs.qwedai.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Why cloud LLMs for QWED verification?

> Why cloud LLMs like GPT-4 and Claude outperform local models for QWED verification, with accuracy comparisons and production guidance.

**TL;DR:** Verification is a critical task requiring maximum accuracy. Cloud LLMs (GPT-4, Claude) deliver 90-95% accuracy vs 70-80% for local models, making them ideal for production verification.

***

## 🎯 The core problem

**Verification requires two things:**

1. **LLM translates the query** (natural language → structured reasoning)
2. **Symbolic verifier proves the answer** (SymPy, Z3, AST)

If the LLM gets step 1 wrong, verification fails even with perfect symbolic math.

***

## 📊 LLM accuracy comparison

### Math verification example

**Query:** "What is the integral of 2x?"

| Model              | Type  | Accuracy | Typical Response                              |
| ------------------ | ----- | -------- | --------------------------------------------- |
| **GPT-4o-mini**    | Cloud | \~95%    | "x² + C" ✅                                    |
| **Claude 3 Haiku** | Cloud | \~93%    | "x² + C" ✅                                    |
| **Llama 3 8B**     | Local | \~75%    | Sometimes "x² + C" ✅, sometimes "2x²/2" ❌     |
| **Mistral 7B**     | Local | \~70%    | Inconsistent, may confuse derivative/integral |

### Why this matters

**When QWED verifies:**

```
1. LLM says: "x² + C"
2. SymPy computes: integrate(2*x, x) = x**2
3. QWED compares: ✅ MATCH!
```

**If LLM is wrong:**

```
1. LLM says: "2x²" (incorrect)
2. SymPy computes: x**2
3. QWED: ❌ NO MATCH → Verification fails
```

**Result:** User sees failure, even though QWED's symbolic engine is correct!

***

## 🤔 When to use each

| Use Case                   | Local LLM (Ollama)         | Cloud LLM (OpenAI/Anthropic) |
| -------------------------- | -------------------------- | ---------------------------- |
| **Development/Testing**    | ✅ Free, fast iteration     | ⚠️ Costs add up              |
| **Production (Critical)**  | ❌ Lower accuracy           | ✅ **Recommended**            |
| **Privacy-Sensitive Data** | ✅ 100% local + PII masking | ⚠️ Use with PII masking      |
| **Cost-Sensitive**         | ✅ \$0/month                | ⚠️ \~\$5-50/month            |
| **High-Stakes Decisions**  | ❌ Risk of errors           | ✅ **Recommended**            |

***

## 💡 QWED's hybrid approach

**Best Practice: Use both strategically**

### Development setup (free)

```python theme={null}
from qwed_sdk import QWEDLocal

# Local LLM for development
client_dev = QWEDLocal(
    base_url="http://localhost:11434/v1",  # Ollama
    model="llama3",
    cache=True  # Cache responses
)

# Test your queries
result = client_dev.verify("What is 2+2?")
```

**Cost:** \$0/month\
**Use for:** Prototyping, experimentation, learning

### Production setup (reliable)

```python theme={null}
import os
from qwed_sdk import QWEDLocal

# Cloud LLM for production
client_prod = QWEDLocal(
    provider="openai",
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4o-mini",
    mask_pii=True,   # Privacy protection
    cache=True        # 50-80% cost savings!
)

# Critical verification
result = client_prod.verify("Verify calculation: ...")
```

**Cost:** \~\$5-10/month (with caching!)\
**Use for:** Production, high-stakes decisions

***

## 💰 Cost analysis

### Local LLM (Ollama)

* **Setup:** 10 minutes (download model)
* **Monthly Cost:** \$0
* **Accuracy:** 70-80% on math/logic
* **Privacy:** 100% local
* **Best for:** Development, testing, learning

### Cloud LLM (OpenAI GPT-4o-mini)

* **Setup:** 2 minutes (API key)
* **Monthly Cost:** \$5-10 (with caching)
* **Accuracy:** 90-95% on math/logic
* **Privacy:** Use PII masking
* **Best for:** Production, critical tasks

### With QWED caching (cost savings)

```python theme={null}
# First query: Hits LLM (costs $$)
result1 = client.verify("What is 2+2?")

# Same query within 24 hours: Cache hit (FREE!)
result2 = client.verify("What is 2+2?")  # $0 cost!
```

**Savings:** repeated queries hit the cache and skip the LLM call.

***

## 🔒 Privacy considerations

### Local LLM advantages

✅ **Private** — data stays on your machine\
✅ **No API keys** - no third-party access\
✅ **Compliance** - easier GDPR/HIPAA compliance

### Cloud LLM with PII masking

```python theme={null}
client = QWEDLocal(
    provider="openai",
    mask_pii=True,  # Auto-mask emails, SSNs, etc.
    pii_entities=["EMAIL_ADDRESS", "CREDIT_CARD", "US_SSN"]
)

# Sensitive data protected!
result = client.verify("User email: john@example.com, calculate 2+2")
# OpenAI sees: "User email: <EMAIL_ADDRESS>, calculate 2+2"
```

**Result:** Cloud accuracy + local privacy! 🔒

***

## 🎯 Recommendation by use case

### Healthcare (HIPAA)

```python theme={null}
# Option 1: Local LLM (most private)
client = QWEDLocal(
    base_url="http://localhost:11434/v1",
    model="llama3"
)

# Option 2: Cloud + PII masking (more accurate)
client = QWEDLocal(
    provider="openai",
    mask_pii=True,
    pii_entities=["PERSON", "US_SSN", "MEDICAL_LICENSE"]
)
```

**Recommendation:** Cloud + PII masking for critical diagnoses

### Finance (PCI-DSS)

```python theme={null}
client = QWEDLocal(
    provider="openai",
    mask_pii=True,
    pii_entities=["CREDIT_CARD", "IBAN_CODE"]
)
```

**Recommendation:** Cloud + PII masking (accuracy matters for money!)

### Enterprise (general)

```python theme={null}
# Development
dev_client = QWEDLocal(base_url="http://localhost:11434/v1", model="llama3")

# Production
prod_client = QWEDLocal(provider="openai", mask_pii=True, cache=True)
```

**Recommendation:** Hybrid approach

***

## 🚀 The QWED advantage

**Even with local LLMs, QWED catches errors!**

### Scenario: local LLM makes mistake

```python theme={null}
client = QWEDLocal(base_url="http://localhost:11434/v1", model="llama3")

# Llama 3 might say: "Derivative of x² is x" (WRONG!)
result = client.verify("What is the derivative of x²?")

# QWED's symbolic verification:
# SymPy: diff(x**2, x) = 2*x
# LLM said: "x"
# QWED: ❌ NO MATCH! 
# result.verified = False
```

**User sees:** "Verification failed - LLM answer doesn't match symbolic proof"

**But:**

* More failures = worse UX
* Cloud LLMs = fewer verification failures = better UX

***

## 📈 Accuracy in practice

**From QWED internal testing:**

| Domain        | Local LLM (Llama 3 8B) | Cloud LLM (GPT-4o-mini) |
| ------------- | ---------------------- | ----------------------- |
| Basic Math    | 85%                    | 98%                     |
| Calculus      | 75%                    | 95%                     |
| Logic (SAT)   | 70%                    | 93%                     |
| Code Security | 80%                    | 96%                     |

**Takeaway:** Cloud LLMs reduce verification failures by 15-25%!

***

## 🎓 Bottom line

### Start with local LLM

```bash theme={null}
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Download model
ollama pull llama3

# Use with QWED
python -c "from qwed_sdk import QWEDLocal; \
  client = QWEDLocal(base_url='http://localhost:11434/v1', model='llama3'); \
  print(client.verify('2+2'))"
```

**Perfect for:** Learning, prototyping, hobby projects

### Scale to cloud LLM

```bash theme={null}
# Get API key from OpenAI
export OPENAI_API_KEY="sk-..."

# Use with QWED
python -c "from qwed_sdk import QWEDLocal; \
  client = QWEDLocal(provider='openai', mask_pii=True, cache=True); \
  print(client.verify('2+2'))"
```

**Perfect for:** Production, enterprise, critical decisions

***

## 🔗 Related documentation

* **[LLM configuration guide](/getting-started/llm-configuration)** — complete LLM setup
* **[PII masking guide](/advanced/pii-masking)** — privacy protection
* **[Caching guide](/advanced/qwed-local)** — cost savings

***

## ❓ FAQ

**Q: Can I use Llama 3 70B instead of GPT-4?**\
A: Yes! Larger local models (70B+) approach cloud accuracy but require significant hardware (40GB+ VRAM).

**Q: Is Ollama really free?**\
A: Yes! Fully open source. You just need hardware to run it.

**Q: What about Google Gemini?**\
A: QWED supports Gemini! Similar accuracy to GPT-4/Claude.

**Q: Can I switch between local and cloud?**\
A: Absolutely! Change the `provider` parameter anytime.

**Q: Do I need PII masking with local LLMs?**\
A: Not necessarily, but it's still good practice for audit trails.

***

**The choice is yours - QWED works with both!** 🚀

**Recommendation:** Start local (free), scale to cloud (reliable) when it matters.
