Rate Limits

Understand and manage your API request limits

About Rate Limits

Rate limits protect our infrastructure and ensure fair usage across all customers. Limits are applied per API key and reset at regular intervals based on your plan tier.

Plan Tiers

  • Free: 1,000 requests/month, 10 requests/minute
  • Starter: 50,000 requests/month, 100 requests/minute
  • Professional: 500,000 requests/month, 500 requests/minute
  • Enterprise: custom monthly and per-minute limits

Rate Limit Headers

Every API response includes headers that help you track your current rate limit status:

  • X-RateLimit-Limit: the maximum number of requests allowed per time window
  • X-RateLimit-Remaining: the number of requests remaining in the current time window
  • X-RateLimit-Reset: Unix timestamp at which the current window resets
  • Retry-After: seconds to wait before making another request (present only when you have been rate limited)

Example Response Headers

Terminal
HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1706313600
Content-Type: application/json
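
These headers are available on every response, whether you use an SDK or call the REST API directly. The sketch below reads them with the requests library and computes how long until the window resets; the endpoint URL and Bearer authorization scheme are assumptions for illustration, so check the API reference for the exact values.

check_rate_limit.py
python
import time
import requests

# NOTE: the endpoint URL and Bearer auth header below are assumptions
# for illustration; consult the API reference for the real values.
API_URL = "https://api.scrapehub.com/v1/scrape"
API_KEY = "your_api_key"

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"url": "https://example.com"},
)

limit = int(response.headers["X-RateLimit-Limit"])
remaining = int(response.headers["X-RateLimit-Remaining"])
reset_at = int(response.headers["X-RateLimit-Reset"])

# Seconds until the current window resets
seconds_until_reset = max(0, reset_at - int(time.time()))

print(f"{remaining}/{limit} requests left; window resets in {seconds_until_reset}s")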

Handling Rate Limits

429 Too Many Requests

When you exceed your rate limit, the API returns a 429 status code:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please try again in 42 seconds.",
    "type": "rate_limit_error"
  }
}
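
The 429 response also carries the rate limit headers described above, including Retry-After. The values below are illustrative:

Terminal
HTTP/1.1 429 Too Many Requests
Retry-After: 42
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1706313642
Content-Type: application/json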

Python Implementation

rate_limit_handler.py
python
import time
from scrapehub import ScrapeHubClient
from scrapehub.exceptions import RateLimitError

client = ScrapeHubClient(api_key="your_api_key")

def scrape_with_retry(url, max_retries=3):
    retries = 0

    while retries < max_retries:
        try:
            result = client.scrape(url)

            # Check remaining requests
            remaining = result.rate_limit_remaining
            if remaining < 10:
                print(f"Warning: Only {remaining} requests remaining")

            return result

        except RateLimitError as e:
            retries += 1
            if retries >= max_retries:
                raise

            # Get retry-after from error or use exponential backoff
            wait_time = e.retry_after or (2 ** retries)
            print(f"Rate limited. Waiting {wait_time} seconds...")
            time.sleep(wait_time)

    raise Exception("Max retries exceeded")

# Usage
result = scrape_with_retry("https://example.com")

Node.js Implementation

rate_limit_handler.js
javascript
const { ScrapeHubClient } = require('@scrapehub/node');

const client = new ScrapeHubClient({
  apiKey: process.env.SCRAPEHUB_API_KEY
});

async function scrapeWithRetry(url, maxRetries = 3) {
  let retries = 0;

  while (retries < maxRetries) {
    try {
      const result = await client.scrape({ url });

      // Check rate limit headers
      const remaining = result.rateLimitRemaining;
      if (remaining < 10) {
        console.warn(`Warning: Only ${remaining} requests remaining`);
      }

      return result;

    } catch (error) {
      if (error.code === 'rate_limit_exceeded') {
        retries++;
        if (retries >= maxRetries) {
          throw error;
        }

        // Use retry-after header or exponential backoff
        const waitTime = error.retryAfter || (2 ** retries);
        console.log(`Rate limited. Waiting ${waitTime} seconds...`);
        await new Promise(resolve => setTimeout(resolve, waitTime * 1000));
      } else {
        throw error;
      }
    }
  }

  throw new Error('Max retries exceeded');
}

// Usage
scrapeWithRetry('https://example.com')
  .then(result => console.log(result.data))
  .catch(error => console.error(error));

Best Practices

  • Monitor the X-RateLimit-Remaining header
  • Implement exponential backoff when rate limited
  • Cache responses when possible to reduce API calls
  • Use batch endpoints to process multiple items in one request
  • Spread requests evenly throughout the time window (see the pacing sketch after this list)
  • Consider upgrading your plan if you frequently hit limits
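
Spreading requests evenly is easy to enforce on the client side. The sketch below is not part of the ScrapeHub SDK; it simply delays each call so that a 100 requests/minute budget (the Starter per-minute limit) is never exceeded:

request_pacing.py
python
import time
from scrapehub import ScrapeHubClient

client = ScrapeHubClient(api_key="your_api_key")

# Enforce a minimum gap between calls: 60 s / 100 requests = 0.6 s
REQUESTS_PER_MINUTE = 100
MIN_INTERVAL = 60.0 / REQUESTS_PER_MINUTE

last_call = 0.0

def paced_scrape(url):
    global last_call
    wait = MIN_INTERVAL - (time.monotonic() - last_call)
    if wait > 0:
        time.sleep(wait)
    last_call = time.monotonic()
    return client.scrape(url)

# Usage: calls are automatically spaced at least 0.6 seconds apart
for page in ["https://example.com/page1", "https://example.com/page2"]:
    print(paced_scrape(page).data)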

Rate Limit Optimization

Batch Requests

Process multiple URLs in a single API call to maximize efficiency:

batch_scraping.py
python
from scrapehub import ScrapeHubClient

client = ScrapeHubClient(api_key="your_api_key")

# Instead of multiple requests
# result1 = client.scrape("https://example.com/page1")
# result2 = client.scrape("https://example.com/page2")
# result3 = client.scrape("https://example.com/page3")

# Use batch endpoint (counts as 1 request)
results = client.batch_scrape([
    {"url": "https://example.com/page1"},
    {"url": "https://example.com/page2"},
    {"url": "https://example.com/page3"}
])

for result in results:
    print(result.data)

Response Caching

caching_example.py
python
import time
from functools import lru_cache
from scrapehub import ScrapeHubClient

client = ScrapeHubClient(api_key="your_api_key")

# Cache up to 100 results; lru_cache keys entries on (url, cache_key)
@lru_cache(maxsize=100)
def scrape_cached(url, cache_key=None):
    # cache_key is unused in the body; it exists only so that a new
    # value forces a fresh scrape instead of a cache hit
    result = client.scrape(url)
    return result.data

# Use the current hour as the cache key so cached results expire
# roughly every hour
cache_key = int(time.time() // 3600)

# This will use cached result if available
data1 = scrape_cached("https://example.com", cache_key)
data2 = scrape_cached("https://example.com", cache_key)  # Uses cache

print(f"Data retrieved: {data1}")

Important Notes

  • Rate limits are enforced per API key, not per account
  • Test keys have the same rate limits as your plan tier
  • Exceeding limits repeatedly may result in temporary suspension
  • Enterprise customers can request custom rate limits