The limit is a generous safety ceiling set well above normal integration traffic, and it may be tuned over time. It is not published as a fixed number; read your current allowance from the X-RateLimit-Limit response header rather than hard-coding a value.

How the limit works
The limiter is a token bucket: it holds up to one minute of requests and refills continuously, so short bursts above your average rate succeed as long as the bucket has tokens. You only approach the limit under sustained, high-volume traffic, such as a runaway loop, an unthrottled backfill, or a leaked key.

REST API (/v1)
Every REST response reports your current budget in headers:
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | Requests allowed per minute for your organization. |
| X-RateLimit-Remaining | Tokens left in the bucket right now. |
| X-RateLimit-Reset | Unix epoch seconds when the bucket is fully refilled. |
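
To make the relationship between the token bucket and these three headers concrete, here is a minimal client-side model in Python. It is a sketch, not the server's implementation: the `TokenBucket` class and its method names are hypothetical, and it assumes the bucket capacity equals the per-minute limit and refills at limit/60 tokens per second, which is one reading of "holds up to one minute of requests and refills continuously".

```python
import math
import time
from typing import Dict, Optional


class TokenBucket:
    """Illustrative model of a per-minute token bucket (not the real limiter)."""

    def __init__(self, limit_per_minute: int, now: Optional[float] = None) -> None:
        self.capacity = float(limit_per_minute)        # one minute of requests
        self.refill_per_sec = limit_per_minute / 60.0  # continuous refill rate
        self.tokens = self.capacity                    # start with a full bucket
        self.updated = time.time() if now is None else now

    def _refill(self, now: float) -> None:
        # Add tokens for the time elapsed, never exceeding capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Spend one token if available; False models a 429 response."""
        now = time.time() if now is None else now
        self._refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def headers(self, now: Optional[float] = None) -> Dict[str, int]:
        """Values analogous to the three X-RateLimit-* headers."""
        now = time.time() if now is None else now
        self._refill(now)
        seconds_to_full = (self.capacity - self.tokens) / self.refill_per_sec
        return {
            "X-RateLimit-Limit": int(self.capacity),
            "X-RateLimit-Remaining": int(self.tokens),
            "X-RateLimit-Reset": math.ceil(now + seconds_to_full),
        }
```

With a limit of 60, a burst of 60 calls in the same instant succeeds, the 61st is refused, and one more call succeeds a second later once a token has refilled. That is why short bursts pass while sustained overload does not.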
When you exceed the limit, the API responds with 429 Too Many Requests, detail: "rate_limit_exceeded", and a Retry-After header giving the integer number of seconds to wait.
The SDKs surface a 429 as an OctogenAPIError whose status_code (Python) / statusCode (TypeScript) equals 429. Catch it, wait, and retry with backoff.
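The catch, wait, and retry pattern might look like the following Python sketch. To keep it runnable on its own, it defines a stand-in `OctogenAPIError` rather than importing the SDK's class. Only `status_code` comes from the text above; the `retry_after` attribute, the `call_with_retry` helper, and its parameters are assumptions made for illustration, so check how your SDK version exposes the Retry-After value.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class OctogenAPIError(Exception):
    """Stand-in for the SDK error type so this sketch runs without the SDK."""

    def __init__(self, status_code: int, retry_after: Optional[int] = None) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code
        self.retry_after = retry_after  # hypothetical attribute, not confirmed


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on 429 with Retry-After plus jittered backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except OctogenAPIError as err:
            # Re-raise anything that is not a rate limit, or the final failure.
            if err.status_code != 429 or attempt == max_attempts - 1:
                raise
            # Exponential backoff (1s, 2s, 4s, ...) capped at max_delay.
            backoff = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads retries from many clients apart.
            jitter = random.uniform(0, backoff)
            # Always wait at least the server-provided hint.
            server_wait = float(err.retry_after or 0)
            sleep(server_wait + jitter)
    raise AssertionError("unreachable")
```

Wrap a single request in a zero-argument callable, for example `call_with_retry(lambda: do_request())`, where `do_request` is whatever SDK call you are making. The injectable `sleep` parameter exists only so the helper can be tested without real delays.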
MCP
MCP tool calls draw on the same per-organization budget as the REST API. Only tool calls count toward it; the initialize and tools/list handshakes do not.
When the budget is exhausted, a tool call fails with an McpError (JSON-RPC error code -32029) instead of returning a tool-level value. The error includes a retry_after field.
retry_after is the number of seconds to wait before the next tool call. Compliant MCP clients surface this to the agent. Because the budget is shared, heavy API-key traffic can throttle interactive agents and vice versa.
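If you handle MCP errors yourself rather than relying on your client, the check could be sketched as below. It works on a plain JSON-RPC error object (`code`, `message`, optional `data`) instead of a specific MCP library type. The location of `retry_after` inside the `data` member, the one-second fallback, and the function name are assumptions, not documented behavior.

```python
from typing import Any, Mapping, Optional

RATE_LIMIT_CODE = -32029  # JSON-RPC error code named in the text above


def retry_after_seconds(error: Mapping[str, Any]) -> Optional[float]:
    """Return how long to wait for a rate-limit error, or None for other errors."""
    if error.get("code") != RATE_LIMIT_CODE:
        return None  # not a rate-limit error; handle it elsewhere
    data = error.get("data")
    # Assumption: the hint is carried as data["retry_after"] in seconds.
    if isinstance(data, dict) and isinstance(data.get("retry_after"), (int, float)):
        return max(0.0, float(data["retry_after"]))
    return 1.0  # hypothetical fallback when no hint is present
```

An agent loop can sleep for the returned number of seconds before issuing its next tool call and treat `None` as a signal to fall through to its normal error handling.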
Best practices
- Honor Retry-After. Wait at least that long before retrying, then add exponential backoff with jitter for repeated 429s.
- Throttle bulk work. For backfills or batch jobs, cap your concurrency and add a small delay between requests instead of firing them all at once.
- Share the budget deliberately. Because API keys and MCP sessions share one per-org pool, a heavy backend job can starve your interactive agents. Schedule large jobs accordingly.
- Ask for more if you need it. If your steady-state load approaches the limit, contact Octogen support to raise your organization’s allowance rather than retrying through 429s.
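
As one way to apply the "throttle bulk work" advice, the Python sketch below caps concurrency with a thread pool and paces submissions with a small delay. The `run_throttled` helper, its parameter names, and the default values are hypothetical; tune them against the allowance you read from X-RateLimit-Limit.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_throttled(
    items: Iterable[T],
    worker: Callable[[T], R],
    max_concurrency: int = 4,
    delay_between: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Process items with bounded concurrency and paced submission."""
    # max_workers bounds how many requests are in flight at once.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        futures = []
        for item in items:
            futures.append(pool.submit(worker, item))
            sleep(delay_between)  # pace submissions instead of firing all at once
        # Collect results in input order; a worker exception is re-raised here.
        return [f.result() for f in futures]
```

With `delay_between=0.1` the job submits at most about ten requests per second regardless of how fast each one completes, which leaves headroom in the shared pool for interactive agents.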