Rate limits

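Limits are applied per token, not per IP address. AI agents tend to send requests in bursts, and many people often share one network address, so a per-IP limit would let one bursty client use up the allowance of everyone behind that address.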

The limits

Surface	Requests per minute	Requests per hour
Most endpoints	120	3,000
Search (`/v1/search`)	30	600

These are generous for interactive use. They exist to stop a runaway loop, not to throttle normal work.
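
Note that the two columns interact: 3,000 per hour averages out to 50 requests per minute for most endpoints, and 600 per hour to 10 per minute for search, so a client that runs at the per-minute limit continuously will reach the hourly limit first. For a long-running script, one option is to pace requests on the client side. The following is a minimal sketch, not part of the API or any SDK: `createMinuteLimiter` and `tryAcquire` are hypothetical names, and the sliding 60-second window is an assumption, since this page does not say how the server counts its windows.

```javascript
// Hypothetical client-side helper (an assumption, not an official SDK function).
// Keeps the timestamps of recent requests in a sliding 60-second window.
// `now` is injectable so the limiter can be tested with a fake clock.
function createMinuteLimiter(maxPerMinute, now = () => Date.now()) {
  const windowMs = 60000;
  let sent = []; // timestamps (ms) of requests still inside the window

  return {
    // Returns 0 and records the request if one may be sent now.
    // Otherwise returns how many milliseconds to wait before trying again.
    tryAcquire() {
      const t = now();
      // Drop timestamps that have aged out of the window.
      sent = sent.filter((ts) => t - ts < windowMs);
      if (sent.length < maxPerMinute) {
        sent.push(t);
        return 0;
      }
      // The oldest request leaves the window at sent[0] + windowMs.
      return sent[0] + windowMs - t;
    },
  };
}
```

A caller would check `limiter.tryAcquire()` before each request and sleep for the returned number of milliseconds when it is not 0. Passing 50 instead of 120 for most endpoints keeps a sustained workload under the hourly limit as well.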

Handling a 429

When you exceed a limit, the API responds with HTTP 429 Too Many Requests and a Retry-After header telling you how many seconds to wait. Respect it:

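// Calls makeRequest (a function returning a promise of a fetch-style Response)
// and retries on 429, waiting the number of seconds given in Retry-After.
async function withRetry(makeRequest, maxRetries = 5) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await makeRequest();
    if (res.status !== 429) return res;
    if (attempt === maxRetries) break; // out of retries: fail without another wait
    // Fall back to 5 seconds if the header is missing or not a number;
    // a NaN delay would make setTimeout fire immediately and hammer the API.
    const seconds = Number.parseFloat(res.headers.get('Retry-After'));
    const wait = Number.isFinite(seconds) && seconds >= 0 ? seconds : 5;
    await new Promise((r) => setTimeout(r, wait * 1000));
  }
  throw new Error('Still rate-limited after the maximum number of retries');
}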
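
The loop above relies on Retry-After being a number of seconds, which is what this API sends. The HTTP specification also allows the header to carry a date instead, so a client that sits behind a proxy or wants to be defensive may prefer to accept both forms. The following is a minimal sketch under that assumption: `retryAfterMs` is a hypothetical helper, not something the API provides, and the 5-second fallback simply mirrors the default used above.

```javascript
// Hypothetical helper (an assumption, not part of the API):
// converts a Retry-After header value into a delay in milliseconds.
// `nowMs` is injectable so the date branch can be tested deterministically.
function retryAfterMs(value, nowMs = Date.now(), fallbackMs = 5000) {
  if (value == null) return fallbackMs;
  const text = value.trim();
  if (text === '') return fallbackMs;

  // Seconds form, e.g. "30".
  if (/^\d+$/.test(text)) return Number(text) * 1000;

  // Date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(text);
  if (Number.isNaN(dateMs)) return fallbackMs;

  // A date in the past means there is no need to wait.
  return Math.max(0, dateMs - nowMs);
}
```

Inside a retry loop, this would replace the header parsing: `await new Promise((r) => setTimeout(r, retryAfterMs(res.headers.get('Retry-After'))));`.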