Rate limits
Limits are applied per token, not per IP address, because AI agents tend to burst and people often share a network.
The limits
| Surface | Per minute | Per hour |
|---|---|---|
| Most endpoints | 120 | 3,000 |
| Search (/v1/search) | 30 | 600 |
These are generous for interactive use. They exist to stop a runaway loop, not to throttle normal work.
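A client that wants to stay under these budgets, rather than wait for a 429, can keep its own count. The following is a minimal sketch of a sliding-window limiter; `SlidingWindowLimiter`, `tryAcquire`, and `limitersFor` are hypothetical names for illustration and are not part of the API. It assumes the server counts requests in rolling windows, which this page does not specify.

```javascript
// Hypothetical client-side limiter: remembers request timestamps in a sliding window.
class SlidingWindowLimiter {
  constructor(limit, windowMs, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, useful for testing
    this.timestamps = [];
  }

  // Milliseconds until another request fits in the window (0 if it fits now).
  delayMs() {
    const t = this.now();
    // Drop timestamps that have aged out of the window.
    while (this.timestamps.length > 0 && t - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.limit) return 0;
    return this.timestamps[0] + this.windowMs - t;
  }

  // Count one request against this window.
  record() {
    this.timestamps.push(this.now());
  }
}

// Records a request only if every limiter has room; returns whether it may be sent.
function tryAcquire(limiters) {
  if (limiters.some((l) => l.delayMs() > 0)) return false;
  limiters.forEach((l) => l.record());
  return true;
}

const MINUTE = 60 * 1000;
const HOUR = 60 * MINUTE;

// Budgets copied from the table: one pair for search, one for everything else.
const searchLimiters = [new SlidingWindowLimiter(30, MINUTE), new SlidingWindowLimiter(600, HOUR)];
const defaultLimiters = [new SlidingWindowLimiter(120, MINUTE), new SlidingWindowLimiter(3000, HOUR)];

// Assumption: the search surface is identified by its path prefix.
function limitersFor(path) {
  return path.startsWith('/v1/search') ? searchLimiters : defaultLimiters;
}
```

Before each call, `tryAcquire(limitersFor(path))` tells the caller whether to send now; when it returns false, the largest `delayMs()` among the limiters is how long to pause. Because limits are per token, one set of limiters per token is the natural scope.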
Handling a 429
When you exceed a limit, the API responds with HTTP 429 Too Many Requests and
a Retry-After header telling you how many seconds to wait. Respect it:
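    // makeRequest is a function that performs one request and resolves to a fetch-style Response.
    async function withRetry(makeRequest, maxRetries = 5) {
      for (let attempt = 0; attempt <= maxRetries; attempt++) {
        const res = await makeRequest();
        if (res.status !== 429) return res;
        // Retry-After is in seconds; fall back to 5 if it is missing or not a number.
        const seconds = Number(res.headers.get('Retry-After') ?? 5);
        const wait = Number.isFinite(seconds) ? seconds : 5;
        await new Promise((r) => setTimeout(r, wait * 1000));
      }
      throw new Error('Still rate-limited after the maximum number of retries');
    }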
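This page documents Retry-After as a number of seconds. HTTP itself also allows the header to carry a date, so a defensive client may prefer to accept both forms. The sketch below is one way to do that; `retryAfterMs` and its `fallbackMs` parameter are hypothetical names, and the 5-second default simply mirrors the fallback in the snippet above.

```javascript
// Hypothetical helper: converts a Retry-After header value to milliseconds.
// Accepts delay-seconds ("30") or an HTTP-date; returns fallbackMs otherwise.
function retryAfterMs(headerValue, fallbackMs = 5000, now = Date.now()) {
  if (headerValue == null || headerValue.trim() === '') return fallbackMs;
  const value = headerValue.trim();
  // Form 1: a non-negative whole number of seconds.
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  // Form 2: an HTTP-date such as "Wed, 21 Oct 2015 07:28:00 GMT".
  const date = Date.parse(value);
  if (Number.isNaN(date)) return fallbackMs;
  // A date already in the past means no further wait is needed.
  return Math.max(0, date - now);
}
```

Inside a retry loop, the result can be passed straight to `setTimeout`, for example `retryAfterMs(res.headers.get('Retry-After'))`.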