Why Ollama + QWED?
- No per-call cost — no API fees, just local compute.
- Private — data stays on your machine.
- Model choice — Llama 3, Mistral, Phi, and others.
- Local inference — no network latency.
- Use cases — local development, prototyping, privacy-sensitive workloads.
Quick start (5 minutes)
Step 1: install Ollama
macOS/Linux:
Step 2: pull a model
Step 3: start the Ollama server
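Once the server is running, one way to confirm it is reachable is to ask it which models have been pulled. The sketch below is an illustration, not part of QWED: it assumes Ollama's default address `http://localhost:11434` and its `/api/tags` endpoint, and the helper names `parse_model_names` and `list_local_models` are made up for this example.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default address.
OLLAMA_URL = "http://localhost:11434"


def parse_model_names(payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by /api/tags."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> list[str]:
    """Return the names of locally pulled models; raises OSError if unreachable."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(json.load(resp))


if __name__ == "__main__":
    try:
        print("Models available locally:", list_local_models())
    except OSError as exc:
        # URLError and socket timeouts are both OSError subclasses.
        print(f"Ollama is not reachable at {OLLAMA_URL}: {exc}")
```

An empty list means the server is up but no model has been pulled yet.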
Step 4: install QWED
Step 5: use QWED with Ollama
Option A: Backend Server (Recommended)
Supported models
QWED works with any Ollama model! Tested with:
| Model | Size | Best For | Speed |
|---|---|---|---|
| llama3 | 8B | General use, best accuracy | ⚡⚡⚡ |
| mistral | 7B | Fast, good quality | ⚡⚡⚡⚡ |
| phi3 | 3.8B | Low memory, decent accuracy | ⚡⚡⚡⚡⚡ |
| codellama | 7B | Code verification | ⚡⚡⚡ |
| gemma | 7B | Google’s model | ⚡⚡⚡ |
Complete example
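As a rough end-to-end illustration, the sketch below sends a prompt to a local model through Ollama's `/api/generate` endpoint and then checks the reply with ordinary code. It does not use QWED's API, which is not shown in this section: `last_integer_matches` is a hypothetical stand-in for a verification step, and the address, model name, and prompt are assumptions.

```python
import json
import re
import urllib.request

# Assumption: Ollama is listening on its default address.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False asks for a single JSON object instead of streamed chunks.
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send a prompt to the local model and return its text response."""
    body = json.dumps(build_generate_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["response"]


def last_integer_matches(answer: str, expected: int) -> bool:
    """Stand-in check (not QWED): is the last integer in the answer the expected one?"""
    numbers = re.findall(r"-?\d+", answer)
    return bool(numbers) and int(numbers[-1]) == expected


if __name__ == "__main__":
    try:
        answer = ask_ollama("llama3", "What is 15 * 12? Reply with the number only.")
        print("Model said:", answer.strip())
        print("Check passed:", last_integer_matches(answer, 15 * 12))
    except OSError as exc:
        print(f"Could not reach Ollama at {OLLAMA_URL}: {exc}")
```

The point of the pattern is that the model's answer is never trusted on its own; the expected value is computed independently of the model.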
Cost comparison
| Setup | Monthly Cost | Best For |
|---|---|---|
| Ollama (Local) | $0 💚 | Students, hobbyists, privacy |
| OpenAI GPT-4o-mini | ~$5-10 | Startups, quick prototypes |
| Anthropic Claude | ~$20-50 | Production, best accuracy |
| OpenAI GPT-4 | ~$50-100 | Enterprises, critical systems |
Hardware requirements
Minimum (Phi3, small models):
- 8GB RAM
- No GPU required (CPU only)
- Works on: M1 Mac, modern laptops
Recommended:
- 16GB RAM
- GPU with 6GB+ VRAM (optional, speeds up inference)
- Works on: M1/M2 Mac, NVIDIA RTX 3060+
High-end (large models):
- 32GB+ RAM
- NVIDIA RTX 4090 / Apple M2 Ultra
- Can run: Llama 3 70B, CodeLlama 34B
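A rough way to sanity-check these figures is to estimate how much memory a model's weights need: parameter count times bytes per weight. The sketch below assumes 4-bit quantization by default, which is common for local models but is an assumption here, and it counts weights only; the context cache and runtime overhead add more on top. The function name is made up for this example.

```python
def estimate_weight_memory_gb(params_billion: float, bits_per_weight: int = 4) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    # Billions of parameters times bytes per parameter gives gigabytes directly.
    return params_billion * bytes_per_weight


if __name__ == "__main__":
    # Parameter counts taken from the supported-models table and the list above.
    for name, size_b in [("phi3", 3.8), ("llama3", 8), ("Llama 3 70B", 70)]:
        at_4bit = estimate_weight_memory_gb(size_b)
        at_16bit = estimate_weight_memory_gb(size_b, bits_per_weight=16)
        print(f"{name}: ~{at_4bit:.1f} GB at 4-bit, ~{at_16bit:.1f} GB at 16-bit")
```

Treat the result as a lower bound when deciding whether a model fits in RAM or VRAM.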
Troubleshooting
Ollama not responding
Connection refused
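A "connection refused" error usually means nothing is listening on the port the client is trying to reach. The sketch below checks this directly with a TCP connection attempt. It assumes Ollama's default port, 11434; the helper name `is_port_open` is made up for this example.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 11434, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on port 11434.")
    else:
        print("Nothing is listening on port 11434. Is the Ollama server running?")
```

If nothing is listening, starting the server (for example with `ollama serve`) is the usual first thing to try.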
Slow inference
Alternative local LLM tools
QWED also works with:
- LM Studio - GUI for local models
- LocalAI - Drop-in OpenAI replacement
- text-generation-webui - web UI for local models
- vLLM - High-performance inference
Privacy benefits
Data that NEVER leaves your machine:
- ✅ Prompts & queries
- ✅ LLM responses
- ✅ Verification results
- ✅ User information
Sectors where this matters:
- 🏥 Healthcare (HIPAA compliance)
- 🏦 Finance (sensitive data)
- 🏛️ Government (classified info)
- 🔬 Research (confidential experiments)
Next steps
Expand your setup:
- Try different models: ollama pull <model>
- Fine-tune for your domain
- Deploy to production (Docker + Ollama)
- Start free with Ollama
- Switch to cloud APIs for scale
- QWED works with both Ollama and cloud APIs.
Community
Questions?
- 💬 Discussions: https://github.com/QWED-AI/qwed-verification/discussions
- 🐛 Issues: https://github.com/QWED-AI/qwed-verification/issues
- 📖 Docs: https://docs.qwedai.com
- Tweet with #QWED #Ollama
- Share your use case
- Help others get started
QWED is model-agnostic: it works with Ollama and with hosted providers.