Why do we have rate limits?
Rate limits safeguard the stability, reliability, and performance of our API ecosystem. Without limits, a single user or application could overwhelm backend services, degrade the experience for others, or even risk outages. Here are the main reasons for enforcing rate limits:
- Fair resource allocation: Ensuring that no single user or application monopolizes system resources.
- System protection & stability: Preventing traffic spikes that might overload infrastructure or degrade latency.
- Security & abuse prevention: Mitigating denial-of-service (DoS) attempts, excessive scraping, or malicious behavior.
- Predictability: Giving all users a clear expectation of throughput, so clients can build appropriate retry/backoff logic.
Our Subscription Plans & Rate Limits
Each subscription plan we offer has a defined Requests Per Minute (RPM) limit. If you exceed the allowed RPM, the system enforces a 60-second cooldown during which no further requests are accepted. After that cooldown period, new requests may resume (subject to rate limits again).

| Plan Name | Max Requests Per Minute (RPM) | Behavior on Limit Breach |
|---|---|---|
| Solo | 60 | 60-second cooldown |
| Trio | 120 | 60-second cooldown |
| Unlimited | 360 | 60-second cooldown |
| Advanced-1 | 600 | 60-second cooldown |
| Advanced-2 | 900 | 60-second cooldown |
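As a quick reference, the plan table above can be encoded as a simple lookup. This is an illustrative sketch, not part of the API itself; `min_request_interval` is a hypothetical helper that derives the evenly-spaced delay between requests for a plan:

```python
# RPM ceilings taken from the plan table above.
PLAN_RPM = {
    "Solo": 60,
    "Trio": 120,
    "Unlimited": 360,
    "Advanced-1": 600,
    "Advanced-2": 900,
}

def min_request_interval(plan: str) -> float:
    """Seconds to wait between requests so a minute never
    exceeds the plan's RPM ceiling."""
    return 60.0 / PLAN_RPM[plan]
```

For example, a Solo key should space requests about one second apart, while an Advanced-2 key can send one roughly every 67 milliseconds.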
Cooldown Behavior & Retry Window
- Once you exceed the RPM threshold for your plan, all further requests within the next 60 seconds will be rejected with a rate-limit error.
- After the 60-second cooldown, your request count resets (for the new minute window), and you may resume using the API.
- It’s good practice for your client implementation to pause for at least 60 seconds after receiving a rate-limit response before retrying.
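The cooldown behavior above can be sketched in a few lines of Python. `send_request` here is a hypothetical callable standing in for your actual API call (assumed to return an object with a `status_code` attribute); the point is simply to pause for the full 60-second window before retrying:

```python
import time

COOLDOWN_SECONDS = 60  # full cooldown window described above

def call_with_cooldown(send_request):
    """Make one API call; if it is rejected with HTTP 429,
    wait out the entire cooldown, then retry once."""
    response = send_request()
    if response.status_code == 429:
        time.sleep(COOLDOWN_SECONDS)
        response = send_request()
    return response
```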
Error Response for Rate Limit Violation
If you exceed the allowed rate for your plan, the API responds with HTTP status 429 Too Many Requests and a JSON error object.

Best Practices for Managing Rate Limits
- Throttle your requests: Instead of firing all requests at once, insert small delays between them. For example, for Solo (60 RPM), spread requests evenly (e.g. one request roughly every 1 second).
- Implement exponential backoff with jitter: If you get a 429 error, wait a bit, then retry. If you hit 429 again, increase the wait time (e.g. 1s, then 2s, then 4s), and optionally add randomness to avoid synchronized retries.
- Batch or consolidate requests where possible: If your API supports multi-resource queries or bulk endpoints, use them to reduce total request count.
- Cache frequent or repeated data: If multiple calls would yield identical results, cache responses on your side (for a suitable TTL) to reduce redundant hits.
- Monitor your request usage: Log the number of requests per minute and watch for high usage patterns. Consider alerting if you approach your limit.
- Graceful fallback / queueing: If your app is likely to burst over the limit, queue requests internally and pace them so they stay under the RPM ceiling.
- User-level quotas within your app: If your app makes API calls on behalf of your users, consider enforcing per-user quotas so one internal “power user” doesn’t hog the shared rate budget.
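The backoff-with-jitter advice above can be sketched as follows. `send_request` is again a hypothetical stand-in for your API call (assumed to return an object with a `status_code` attribute); the delays follow the 1s, 2s, 4s progression from the list, plus up to one second of random jitter:

```python
import random
import time

def request_with_backoff(send_request, max_retries=5, base_delay=1.0):
    """Retry on HTTP 429 with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        response = send_request()
        if response.status_code != 429:
            return response
        # 1s, 2s, 4s, ... doubled each attempt, plus random jitter
        # so many clients do not all retry at the same instant.
        delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
        time.sleep(delay)
    return response  # still rate-limited after exhausting retries
```

Note that because failed requests count toward your RPM, a capped retry budget (here, `max_retries`) matters as much as the growing delays.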
Guidelines & Considerations
- Rate limits are enforced per API key + Bearer token combination. Different keys or tokens are counted separately.
- Failed requests still count toward your RPM limit. Repeated retries contribute to usage.
- The cooldown is absolute: when triggered, all further calls from that key / token will be rejected until 60 seconds elapse.
- You should design your application to expect and gracefully handle 429 responses (don’t assume they won’t happen).
- Consider splitting heavy tasks into smaller chunks spaced over time.
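The internal-queueing and chunking suggestions above can be combined into a small pacer that keeps one key/token under its RPM ceiling. This is an illustrative sketch under the assumption that each queued item is a zero-argument callable wrapping one API request, not an official client:

```python
import time
from collections import deque

class RequestPacer:
    """Queue outgoing calls and release them no faster than
    `rpm` requests per minute."""

    def __init__(self, rpm: int):
        self.interval = 60.0 / rpm   # minimum spacing between calls
        self.queue = deque()
        self._next_slot = 0.0        # earliest time the next call may run

    def submit(self, call):
        """Enqueue a zero-argument callable for later execution."""
        self.queue.append(call)

    def drain(self):
        """Run all queued calls, sleeping as needed between them."""
        results = []
        while self.queue:
            now = time.monotonic()
            if now < self._next_slot:
                time.sleep(self._next_slot - now)
            self._next_slot = time.monotonic() + self.interval
            results.append(self.queue.popleft()())
        return results
```

Pacing work through a queue like this turns a bursty workload into an even stream, so heavy tasks split into smaller chunks never trip the cooldown in the first place.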