API Endpoints
Complete reference for all ScrapeHub API endpoints
Base URL
https://api.scrapehub.io/v4
All endpoints are relative to this base URL.
Authentication
All API requests require authentication via an API key passed in the X-API-KEY header:
X-API-KEY: sk_live_xxxx_449x
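For example, using Python's requests library (a minimal sketch; the key below is the placeholder from this page, substitute your own):

import requests

API_KEY = "sk_live_xxxx_449x"  # placeholder key from this page
BASE_URL = "https://api.scrapehub.io/v4"

# Every request carries the X-API-KEY header.
headers = {"X-API-KEY": API_KEY}

response = requests.get(f"{BASE_URL}/jobs", headers=headers)
response.raise_for_status()
print(response.json())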
POST /scrape
Create a new scraping job and get results synchronously.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| target | string | Yes | URL to scrape |
| engine | string | No | Scraper engine (default: neural-x1) |
| format | string | No | Output format: json, csv, xml (default: json) |
| render_js | boolean | No | Render JavaScript (default: true) |
| wait_for | string | No | CSS selector to wait for |
| webhook_url | string | No | Webhook URL for job completion |
Example Request
curl -X POST https://api.scrapehub.io/v4/scrape \
  -H "X-API-KEY: sk_live_xxxx_449x" \
  -H "Content-Type: application/json" \
  -d '{
    "target": "https://example.com/products",
    "engine": "neural-x1",
    "format": "json",
    "render_js": true,
    "wait_for": ".product-list"
  }'
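The same request from Python, as a rough equivalent of the curl call above (a sketch using the requests library, not an official client):

import requests

payload = {
    "target": "https://example.com/products",
    "engine": "neural-x1",
    "format": "json",
    "render_js": True,
    "wait_for": ".product-list",
}

# Synchronous scrape: the call blocks until results are ready.
response = requests.post(
    "https://api.scrapehub.io/v4/scrape",
    headers={"X-API-KEY": "sk_live_xxxx_449x"},
    json=payload,
)
response.raise_for_status()
job = response.json()
print(job["status"], job["records_count"])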
Response
{
"job_id": "job_abc123xyz",
"status": "completed",
"engine": "neural-x1",
"target": "https://example.com/products",
"created_at": "2026-01-27T10:30:00Z",
"completed_at": "2026-01-27T10:30:15Z",
"duration": 15.2,
"pages_scraped": 1,
"records_count": 247,
"results": [
{
"name": "Product Name",
"price": 29.99,
"rating": 4.5,
"url": "https://example.com/product/123"
}
]
}
POST /jobs
Create an asynchronous scraping job (returns immediately with a job ID).
curl -X POST https://api.scrapehub.io/v4/jobs \
  -H "X-API-KEY: sk_live_xxxx_449x" \
  -H "Content-Type: application/json" \
  -d '{
    "target": "https://example.com/large-dataset",
    "engine": "neural-x1"
  }'
Response
{
"job_id": "job_abc123xyz",
"status": "pending",
"created_at": "2026-01-27T10:30:00Z"
}
GET /jobs/:job_id
Get the status and results of a scraping job.
curl -X GET https://api.scrapehub.io/v4/jobs/job_abc123xyz \
  -H "X-API-KEY: sk_live_xxxx_449x"
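A typical async workflow combines POST /jobs with this endpoint: create the job, then poll until it reaches a terminal status. A minimal sketch in Python (the five-second interval and the choice of completed/failed as terminal statuses are assumptions, based on the status values shown in this reference):

import time
import requests

BASE_URL = "https://api.scrapehub.io/v4"
HEADERS = {"X-API-KEY": "sk_live_xxxx_449x"}

# Create the job; POST /jobs returns immediately with a job_id.
created = requests.post(
    f"{BASE_URL}/jobs",
    headers=HEADERS,
    json={"target": "https://example.com/large-dataset", "engine": "neural-x1"},
)
created.raise_for_status()
job_id = created.json()["job_id"]

# Poll GET /jobs/:job_id until the job reaches a terminal status.
while True:
    job = requests.get(f"{BASE_URL}/jobs/{job_id}", headers=HEADERS).json()
    if job["status"] in ("completed", "failed"):
        break
    time.sleep(5)  # poll interval is an arbitrary choice

print(job["status"], job.get("records_count"))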
Response
{
"job_id": "job_abc123xyz",
"status": "completed",
"progress": 100,
"engine": "neural-x1",
"target": "https://example.com/products",
"created_at": "2026-01-27T10:30:00Z",
"started_at": "2026-01-27T10:30:05Z",
"completed_at": "2026-01-27T10:35:20Z",
"duration": 315.5,
"pages_scraped": 50,
"records_count": 1247,
"results": [ /* ... */ ]
}
GET /jobs/:job_id/export
Export job results in various formats.
Query Parameters
| Parameter | Description |
|---|---|
| format | Output format: json, csv, xml, parquet (default: json) |
| compress | Optional compression: gzip, zip |
# Export as CSV
curl -X GET "https://api.scrapehub.io/v4/jobs/job_abc123xyz/export?format=csv" \
  -H "X-API-KEY: sk_live_xxxx_449x" \
  -o results.csv

# Export as compressed JSON
curl -X GET "https://api.scrapehub.io/v4/jobs/job_abc123xyz/export?format=json&compress=gzip" \
  -H "X-API-KEY: sk_live_xxxx_449x" \
  -o results.json.gz
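From Python, the CSV export can be streamed straight to disk (a sketch with the requests library; the output filename is arbitrary):

import requests

# Stream the CSV export to a local file without buffering it in memory.
with requests.get(
    "https://api.scrapehub.io/v4/jobs/job_abc123xyz/export",
    headers={"X-API-KEY": "sk_live_xxxx_449x"},
    params={"format": "csv"},
    stream=True,
) as response:
    response.raise_for_status()
    with open("results.csv", "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)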
GET /jobs
List all scraping jobs for your account.
Query Parameters
| Parameter | Description |
|---|---|
| limit | Number of jobs to return (default: 10, max: 100) |
| offset | Pagination offset (default: 0) |
| status | Filter by status: pending, active, completed, failed |
curl -X GET "https://api.scrapehub.io/v4/jobs?limit=20&status=completed" \
  -H "X-API-KEY: sk_live_xxxx_449x"
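To enumerate every job, page through with limit and offset until a short page comes back. A sketch of that loop follows; this reference doesn't show the list response shape, so the assumption that the endpoint returns a plain JSON array of jobs is a guess to adjust against the real payload:

import requests

BASE_URL = "https://api.scrapehub.io/v4"
HEADERS = {"X-API-KEY": "sk_live_xxxx_449x"}

def list_all_jobs(status="completed", page_size=100):
    """Yield every job, paging with limit/offset.

    Assumes the endpoint returns a JSON array of job objects;
    the list response shape is not documented on this page.
    """
    offset = 0
    while True:
        resp = requests.get(
            f"{BASE_URL}/jobs",
            headers=HEADERS,
            params={"limit": page_size, "offset": offset, "status": status},
        )
        resp.raise_for_status()
        page = resp.json()
        yield from page
        if len(page) < page_size:  # a short page means we've reached the end
            return
        offset += page_size

for job in list_all_jobs():
    print(job["job_id"], job["status"])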
DELETE /jobs/:job_id
Cancel a running job or delete a completed job.
curl -X DELETE https://api.scrapehub.io/v4/jobs/job_abc123xyz \
  -H "X-API-KEY: sk_live_xxxx_449x"
Response
{
"success": true,
"message": "Job deleted successfully"
}
Rate Limits
- Free Plan: 1,000 requests/month, 10 requests/minute
- Starter Plan: 10,000 requests/month, 60 requests/minute
- Growth Plan: 100,000 requests/month, 120 requests/minute
- Enterprise Plan: Unlimited requests, custom rate limits
See the Rate Limits documentation for more details.
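If you exceed a per-minute limit, back off and retry. The sketch below assumes the API signals throttling with HTTP 429 and an optional Retry-After header; that is conventional behavior, not something this page confirms:

import time
import requests

def request_with_backoff(method, url, max_retries=5, **kwargs):
    """Retry on HTTP 429, honoring Retry-After when present.

    Assumes ScrapeHub uses 429 + Retry-After for throttling;
    that convention is not documented on this page.
    """
    for attempt in range(max_retries):
        response = requests.request(method, url, **kwargs)
        if response.status_code != 429:
            return response
        # Fall back to exponential backoff if no Retry-After header.
        wait = float(response.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    response.raise_for_status()
    return response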