# QWED + Ollama Integration Guide

**Use QWED for FREE with local LLMs!**

QWED supports any OpenAI-compatible API, including Ollama for running models locally.
## Why Ollama + QWED?

- ✅ **$0 cost** - No API fees, just electricity
- ✅ **100% private** - Data never leaves your machine
- ✅ **Full control** - Choose any model (Llama 3, Mistral, Phi, etc.)
- ✅ **Fast** - Local inference, no network latency

Perfect for: students, hobbyists, and privacy-focused developers.
## Quick Start (5 Minutes)
### Step 1: Install Ollama

**macOS/Linux:**

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

**Windows:** Download the installer from https://ollama.com/download
### Step 2: Pull a Model

```bash
# Recommended: Llama 3 (8B)
ollama pull llama3

# Or other models:
ollama pull mistral
ollama pull phi3
ollama pull codellama
```
### Step 3: Start the Ollama Server

```bash
ollama serve
# The server listens on http://localhost:11434
```
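Before installing QWED, you can confirm the server is reachable. A minimal sketch using the `requests` package against Ollama's `/api/tags` endpoint (the same endpoint the Troubleshooting section below checks with `curl`); it assumes the default port 11434:

```python
import requests

# Sanity check: ask the local Ollama server which models are pulled.
# Assumes the default endpoint http://localhost:11434 and `pip install requests`.
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()

models = [m["name"] for m in resp.json().get("models", [])]
print("Ollama is up. Available models:", models or "none yet (run `ollama pull llama3`)")
```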
### Step 4: Install QWED

```bash
pip install qwed
```
### Step 5: Use QWED with Ollama!

**Option A: Backend Server (Recommended)**

```bash
# Terminal 1: configure the backend to use Ollama
cp .env.example .env

# Edit .env (or append the settings):
echo "ACTIVE_PROVIDER=openai" >> .env
echo "OPENAI_BASE_URL=http://localhost:11434/v1" >> .env
echo "OPENAI_API_KEY=ollama" >> .env
echo "OPENAI_MODEL=llama3" >> .env

# Start the backend
python -m qwed_api
```

```python
# Terminal 2: use the QWED SDK
from qwed import QWEDClient

client = QWEDClient(
    api_key="qwed_local",
    base_url="http://localhost:8000"
)

result = client.verify("What is 2+2?")
print(result.verified)  # True
print(result.value)     # 4
```
**Option B: QWEDLocal (Coming in v2.1.0)**

```python
from qwed import QWEDLocal

client = QWEDLocal(
    base_url="http://localhost:11434/v1",
    model="llama3",
    api_key="ollama"  # Dummy key; the local endpoint does not check it
)

result = client.verify("Calculate factorial of 5")
print(result.verified)  # True
print(result.value)     # 120
```
## Supported Models

QWED works with any Ollama model! Tested with:

| Model | Size | Best For | Speed |
|---|---|---|---|
| llama3 | 8B | General use, best accuracy | ⚡⚡⚡ |
| mistral | 7B | Fast, good quality | ⚡⚡⚡⚡ |
| phi3 | 3.8B | Low memory, decent accuracy | ⚡⚡⚡⚡⚡ |
| codellama | 7B | Code verification | ⚡⚡⚡ |
| gemma | 7B | Google's model | ⚡⚡⚡ |
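Switching models is just a matter of changing the model name. A minimal sketch for comparing a few of the models above on the same query, assuming the QWEDLocal client from Option B (planned for v2.1.0) and that each model has already been pulled:

```python
from qwed import QWEDLocal

query = "What is 15% of 240?"

# Compare local models on the same verification task.
# Assumes the QWEDLocal client shown in Option B (planned for v2.1.0)
# and that each model has been pulled with `ollama pull <model>`.
for model in ["llama3", "mistral", "phi3"]:
    client = QWEDLocal(
        base_url="http://localhost:11434/v1",
        model=model,
        api_key="ollama",  # Dummy key; the local endpoint does not check it
    )
    result = client.verify(query)
    print(f"{model}: verified={result.verified}, value={result.value}")
```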
## Complete Example

```python
from qwed import QWEDClient

client = QWEDClient(
    api_key="qwed_local",
    base_url="http://localhost:8000"  # Your QWED backend
)

# Math verification
result = client.verify_math(
    query="What is the derivative of x^2?",
    llm_output="2x"
)
print(f"Verified: {result.verified}")
print(f"Evidence: {result.evidence}")

# Logic verification
result = client.verify_logic(
    query="If A implies B, and B implies C, does A imply C?",
    llm_output="Yes"
)
print(f"Valid: {result.verified}")

# Code security
result = client.verify_code(
    code='user_input = request.GET["q"]; eval(user_input)',
    language="python"
)
print(f"Blocked: {result.blocked}")
print(f"Vulnerabilities: {result.vulnerabilities}")
```
## Cost Comparison

| Setup | Monthly Cost | Best For |
|---|---|---|
| Ollama (local) | $0 | Students, hobbyists, privacy |
| OpenAI GPT-4o-mini | ~$5-10 | Startups, quick prototypes |
| Anthropic Claude | ~$20-50 | Production, best accuracy |
| OpenAI GPT-4 | ~$50-100 | Enterprises, critical systems |

With Ollama, 1 million verifications cost $0 in API fees (just electricity!).
## Hardware Requirements

**Minimum (Phi3, small models):**
- 8GB RAM
- No GPU required (CPU only)
- Works on: M1 Mac, modern laptops

**Recommended (Llama 3, Mistral):**
- 16GB RAM
- GPU with 6GB+ VRAM (optional, speeds up inference)
- Works on: M1/M2 Mac, NVIDIA RTX 3060+

**Ideal (large models):**
- 32GB+ RAM
- NVIDIA RTX 4090 / Apple M2 Ultra
- Can run: Llama 3 70B, CodeLlama 34B
## Troubleshooting

### Ollama not responding

```bash
# Check that Ollama is running (lists installed models)
ollama list

# Restart the server
ollama serve
```

### Connection refused

```bash
# Verify the Ollama endpoint
curl http://localhost:11434/api/tags
# Should return a JSON list of models
```
### Slow inference

```bash
# Use a smaller model
ollama pull phi3
# Then point QWED at it: set OPENAI_MODEL=phi3 in .env
```

Ollama uses a supported GPU automatically when it detects one, so no extra flag is needed; if inference is still slow, try a smaller model or check that your GPU drivers are installed.
## Alternative Local LLM Tools

QWED also works with:

- **LM Studio** - GUI for local models
- **LocalAI** - Drop-in OpenAI replacement
- **text-generation-webui** - Advanced UI
- **vLLM** - High-performance inference

All of these expose OpenAI-compatible APIs, so they work with QWED; the sketch below shows the same pattern against a different endpoint.
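A minimal sketch, again assuming the QWEDLocal client from Option B (planned for v2.1.0), LM Studio's default local server port (1234), and a hypothetical model name; adjust the URL and model for whichever tool you run:

```python
from qwed import QWEDLocal

# Same pattern as Ollama, just a different OpenAI-compatible endpoint.
# Assumes LM Studio's default local server port (1234); other tools such as
# LocalAI or vLLM use their own default ports - adjust the URL accordingly.
client = QWEDLocal(
    base_url="http://localhost:1234/v1",  # LM Studio default (assumption)
    model="llama-3-8b-instruct",          # Hypothetical name; use whatever model you loaded
    api_key="not-needed"                  # Dummy key
)

result = client.verify("What is 2+2?")
print(result.verified)  # True
print(result.value)     # 4
```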
## Privacy Benefits

Data that **never** leaves your machine:

- ✅ Prompts & queries
- ✅ LLM responses
- ✅ Verification results
- ✅ User information

Perfect for:

- Healthcare (HIPAA compliance)
- Finance (sensitive data)
- Government (classified info)
- Research (confidential experiments)
## Next Steps

**Expand your setup:**

- Try different models: `ollama pull <model>`
- Fine-tune a model for your domain
- Deploy to production (Docker + Ollama)

**Upgrade when needed:**

- Start free with Ollama
- Switch to cloud APIs for scale
- QWED works with both seamlessly (see the sketch below)!
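Because the SDK talks to the same backend either way, moving between local and cloud is a configuration change rather than a code change. A minimal sketch, assuming the Option A backend setup and hypothetical environment variable names (`QWED_BASE_URL`, `QWED_API_KEY`) that you define yourself:

```python
import os

from qwed import QWEDClient

# Pick the backend via environment variables instead of code changes.
# QWED_API_KEY and QWED_BASE_URL are hypothetical names you define yourself;
# the defaults fall back to the local Option A setup (Ollama behind the backend).
client = QWEDClient(
    api_key=os.environ.get("QWED_API_KEY", "qwed_local"),
    base_url=os.environ.get("QWED_BASE_URL", "http://localhost:8000"),
)

result = client.verify("What is 2+2?")
print(result.verified, result.value)
```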
## Community

Questions?

- Discussions: https://github.com/QWED-AI/qwed-verification/discussions
- Issues: https://github.com/QWED-AI/qwed-verification/issues
- Docs: https://docs.qwedai.com

Show your setup!

- Tweet with #QWED #Ollama
- Share your use case
- Help others get started

**Remember:** QWED is model-agnostic. Start free with Ollama, scale to the cloud when you're ready!