# Rate limits

> QWED API rate limiting by plan tier. Learn about Free, Pro, and Enterprise quotas, rate limit headers, and how to handle 429 Too Many Requests responses.

QWED applies per-minute and per-day request quotas, plus a maximum batch size, based on your plan.

## Default limits

| Plan           | Requests/min | Requests/day | Batch size |
| -------------- | ------------ | ------------ | ---------- |
| **Free**       | 60           | 1,000        | 10         |
| **Pro**        | 600          | 50,000       | 50         |
| **Enterprise** | Unlimited    | Unlimited    | 100        |

## Rate limit headers

Every response includes rate limit headers:

```
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1703073600
```

| Header                  | Description                                     |
| ----------------------- | ----------------------------------------------- |
| `X-RateLimit-Limit`     | Maximum requests allowed per window             |
| `X-RateLimit-Remaining` | Requests remaining in the current window        |
| `X-RateLimit-Reset`     | Unix timestamp (seconds) when the window resets |
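
As a minimal sketch of how a client might read these headers, the helper below computes how long to wait before the window resets. The function name `seconds_until_reset` is hypothetical, and it assumes the headers arrive as a plain dict keyed exactly as shown above (most HTTP libraries expose a case-insensitive mapping instead).

```python
import time

def seconds_until_reset(headers, now=None):
    """Return (remaining, seconds_until_reset) from rate limit headers.

    Hypothetical helper: assumes a plain dict with the header names
    spelled exactly as in the table above.
    """
    now = time.time() if now is None else now
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])
    # Clamp at zero in case the reset time has already passed.
    return remaining, max(0.0, float(reset_at - now))

headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "45",
    "X-RateLimit-Reset": "1703073600",
}
# Passing `now` explicitly keeps the example deterministic.
remaining, wait = seconds_until_reset(headers, now=1703073570)
```

A client could pause for `wait` seconds once `remaining` reaches zero instead of sending a request that is likely to be rejected.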

## Rate limit response

When rate limited, you'll receive:

```json
{
  "error": {
    "code": "QWED-005",
    "message": "Rate limit exceeded",
    "details": {
      "limit": 60,
      "reset_at": "2024-12-20T12:01:00Z",
      "retry_after": 45
    }
  }
}
```

HTTP Status: `429 Too Many Requests`
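
The error body above can be turned into a wait time with a few lines of parsing. This is a sketch only: `retry_delay` is a hypothetical helper, and it assumes `retry_after` is expressed in seconds, which the response format does not state explicitly.

```python
import json

def retry_delay(status_code, body_text, default=1.0):
    """Return seconds to wait for a rate limit response, or None otherwise.

    Hypothetical helper: assumes `retry_after` is a number of seconds.
    Falls back to `default` if the body cannot be parsed.
    """
    if status_code != 429:
        return None
    try:
        details = json.loads(body_text)["error"]["details"]
        return float(details["retry_after"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing fields, or an unexpected body shape.
        return default

body = """
{"error": {"code": "QWED-005", "message": "Rate limit exceeded",
           "details": {"limit": 60, "reset_at": "2024-12-20T12:01:00Z",
                       "retry_after": 45}}}
"""
delay = retry_delay(429, body)
```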

## Best practices

### 1. Implement exponential backoff

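```python
import time

# RateLimitError is assumed to be the exception your QWED client raises on a
# 429 response; import it from the client library you use.

def verify_with_retry(client, query, max_retries=3):
    """Call client.verify, backing off exponentially on rate limit errors."""
    for attempt in range(max_retries):
        try:
            return client.verify(query)
        except RateLimitError:
            if attempt == max_retries - 1:
                break  # no point sleeping after the final attempt
            # Wait 1s, 2s, 4s, ... capped at 60s.
            time.sleep(min(2 ** attempt, 60))
    raise RuntimeError("Max retries exceeded")
```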

### 2. Use batch endpoints

Instead of individual requests:

```python
# ❌ 10 requests
for item in items:
    client.verify(item)

# ✅ 1 request
client.verify_batch(items)
```
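
When there are more items than your plan's batch size allows, the list has to be split before calling the batch endpoint. The sketch below shows one way to do that. `chunked` and `verify_all` are hypothetical helpers, `client.verify_batch` is taken from the example above, and no assumption is made about what it returns, so the responses are collected one per batch.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def verify_all(client, items, batch_size=10):
    """Send `items` in batches and return one response per batch.

    Hypothetical helper: the default of 10 matches the Free plan's
    batch size in the table above; adjust it for your plan.
    """
    responses = []
    for batch in chunked(items, batch_size):
        responses.append(client.verify_batch(batch))
    return responses
```

With 25 items and a batch size of 10, this issues three requests instead of 25.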

### 3. Cache results

```python
import hashlib

cache = {}

def cached_verify(client, query):
    key = hashlib.sha256(query.encode()).hexdigest()
    if key in cache:
        return cache[key]
    result = client.verify(query)
    cache[key] = result
    return result
```

## Per-endpoint limits

Some endpoints have specific limits:

| Endpoint              | Limit                                               |
| --------------------- | --------------------------------------------------- |
| `/verify/batch`       | 100 items/request                                   |
| `/verify/consensus`   | Per-tenant rate limit (same as default plan limits) |
| `/agent/register`     | 10 requests/hour                                    |
| `/attestation/verify` | 1,000 requests/hour                                 |

## Thread-safe in-memory limiter

The default in-memory rate limiter is thread-safe. All check-and-record operations (per-key and global) are protected by a lock, so concurrent requests handled on different threads of the same process cannot race. The `get_reset_time` method also runs under the lock to prevent stale reads. Because the state lives in process memory, separate worker processes (for example, Uvicorn started with multiple workers) each keep their own counters.
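
To illustrate the locking pattern described above, here is a minimal sliding-window limiter. It is a sketch under stated assumptions, not QWED's implementation: the class name, constructor parameters, and the return value of `get_reset_time` are all hypothetical, and only per-key limits are shown.

```python
import threading
import time
from collections import defaultdict, deque

class InMemoryRateLimiter:
    """Per-key sliding-window limiter (illustrative sketch only)."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def _prune(self, hits, now):
        # Drop timestamps that have left the window. Caller holds the lock.
        while hits and now - hits[0] >= self.window:
            hits.popleft()

    def allow(self, key):
        """Check and record in one critical section so threads cannot race."""
        with self._lock:
            now = self._clock()
            hits = self._hits[key]
            self._prune(hits, now)
            if len(hits) >= self.limit:
                return False
            hits.append(now)
            return True

    def get_reset_time(self, key):
        """Seconds until the oldest recorded request leaves the window."""
        with self._lock:
            now = self._clock()
            hits = self._hits.get(key)
            if not hits:
                return 0.0
            self._prune(hits, now)
            return max(0.0, self.window - (now - hits[0])) if hits else 0.0
```

Holding the lock across both the check and the append is what matters here: if the two steps were locked separately, two threads could both observe a free slot and both record a request.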

## Fail-closed enforcement

When using Redis-backed rate limiting, the rate limiter operates with a **fail-closed** policy. If the Redis backend encounters an error at runtime, requests are denied rather than allowed through. This ensures that a temporary Redis outage does not silently bypass rate limits.

If Redis is unavailable at startup, an in-memory fallback is used until the service is restarted with a healthy Redis connection.
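
The two behaviors in this section can be sketched as follows. Everything here is hypothetical and for illustration only: `BackendError` stands in for whatever exception the Redis client raises, and `choose_backend` and `is_allowed` are not QWED functions.

```python
class BackendError(Exception):
    """Hypothetical error raised when the rate limit backend is unreachable."""

def choose_backend(connect_redis, make_in_memory):
    """At startup, fall back to an in-memory limiter if Redis is unavailable."""
    try:
        return connect_redis()
    except BackendError:
        return make_in_memory()

def is_allowed(backend_check, key):
    """At runtime, fail closed: deny the request if the backend check raises."""
    try:
        return bool(backend_check(key))
    except BackendError:
        # Denying here means an outage cannot silently bypass the limit.
        return False
```

The opposite policy, fail-open, would return `True` in the `except` branch, trading protection for availability.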

## Enterprise options

To raise your limits, contact us about:

* Custom rate limits
* Dedicated infrastructure
* SLA guarantees
