Scrape the toughest websites. Without limits, on your own infrastructure.

Start with a single command, scale as you need.

JavaScript — regular JavaScript: just document.querySelector()
SQLite — built-in SQLite: 10k items/sec, no server
Webhooks — push data in real time
Docker — scale to a cluster
Quick Start

Install in one command

Run a single Docker command and Crawlstack is ready. The container includes headless Chrome with anti-detection and Crawlstack built in. Dashboard at localhost:3002.

Prefer your own browser? Install the Chrome extension instead.

terminal
$ docker run -p 3002:3002 ghcr.io/crawlstack/cloakbrowser-node:latest


extraction.js
const books = document.querySelectorAll("article.product_pod");
const items = Array.from(books).map(book => ({
  id: book.querySelector("h3 a")?.href,
  data: {
    title: book.querySelector("h3 a")?.getAttribute("title"),
    price: book.querySelector(".price_color")?.textContent,
  }
}));

await runner.publishItems(items);

// Follow pagination
const nextBtn = document.querySelector("li.next a");
if (nextBtn) {
  await runner.addTasks([{ url: nextBtn.href }]);
}
Developer Experience

Write plain JavaScript

If you can write document.querySelector(), you can use Crawlstack. No new framework. No special SDK. Your script runs directly in the page with full access to the DOM, cookies, and session state.

Use runner.publishItems() to store data and runner.addTasks() to follow pages. That's the entire API.

Performance

Up to 10,000 items per second

SQLite via WASM runs directly in the browser. No network round-trips to a database server. Millions of rows without browser slowdown. Export as CSV, JSON, or raw SQLite.

Scale to multiple nodes with Turso or libSQL when you need to.
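Getting data back out needs no tooling beyond plain JavaScript. A minimal sketch of the CSV export path (the helper name and item shape here are illustrative, not part of Crawlstack's API):

```javascript
// Convert an array of flat item objects to CSV text.
// Fields containing commas, quotes, or newlines are quoted RFC 4180 style.
function itemsToCsv(items) {
  if (items.length === 0) return "";
  const headers = Object.keys(items[0]);
  const escape = (value) => {
    const s = String(value ?? "");
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const rows = items.map((item) => headers.map((h) => escape(item[h])).join(","));
  return [headers.join(","), ...rows].join("\n");
}

const csv = itemsToCsv([
  { title: "Wuthering Heights", price: "$9.99" },
  { title: 'The "Great" Gatsby', price: "$12.99" },
]);
// First line of csv is the header row: title,price
```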

#     Title                   Author               Price
4816  Wuthering Heights       Emily Brontë         $9.99
4817  The Great Gatsby        F. Scott Fitzgerald  $12.99
4818  To Kill a Mockingbird   Harper Lee           $9.49
4819  1984                    George Orwell        $11.99
4820  Pride and Prejudice     Jane Austen          $7.99
4821  The Catcher in the Rye  J.D. Salinger        $10.49
4822  Brave New World         Aldous Huxley        $8.99
4823  The Hobbit              J.R.R. Tolkien       $14.99
Extracted 4,816 items · SQLite WASM · 0ms latency
Cloudflare Turnstile — "Verify you are human"
Normal bot: isTrusted: false
Crawlstack: isTrusted: true

Hardware-level trusted clicks via CDP Input.dispatchMouseEvent

Anti-Detection

Turnstile solver built in

Cloudflare Turnstile checks if clicks are hardware-generated. Crawlstack dispatches clicks through Chrome DevTools Protocol at the OS level, so isTrusted is always true.

Built for Cloudflare Turnstile. Works with any challenge that checks event trust level.
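The click path can be sketched with Input.dispatchMouseEvent, a standard Chrome DevTools Protocol method. The payload fields below (type, x, y, button, clickCount) are real CDP parameters; the WebSocket plumbing is simplified and Crawlstack's internal wiring may differ:

```javascript
// Build the CDP Input.dispatchMouseEvent payloads for one click:
// a mousePressed followed by a mouseReleased at the same coordinates.
// Events injected at this layer arrive in the page with isTrusted: true.
function trustedClickPayloads(x, y) {
  const base = { x, y, button: "left", clickCount: 1 };
  return [
    { method: "Input.dispatchMouseEvent", params: { type: "mousePressed", ...base } },
    { method: "Input.dispatchMouseEvent", params: { type: "mouseReleased", ...base } },
  ];
}

const [down, up] = trustedClickPayloads(220, 340);
// Each payload would be sent over the browser's DevTools WebSocket, e.g.:
// ws.send(JSON.stringify({ id: 1, ...down }));
```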

Debugging

Flight recorder for every run

Full DOM snapshots, network logs, console output, and screenshots captured automatically. Replay any run offline to understand exactly what happened.

No more guessing why a crawler failed. See every step it took.

[Flight recorder timeline: Navigate → DOM snapshot → Extract, with network and console activity captured over a 0–3s run. 3 pages crawled · 47 items extracted · 2.8s total.]
[Mouse path simulation toward a "Buy Now" target: natural Bézier curves · variable speed + jitter · trusted click events.]
Stealth

Move like a human

Random mouse paths with natural Bézier curves, variable speed, and timing jitter. Every click, scroll, and hover generates real isTrusted input events.

Bot detection that relies on mouse behavior analysis sees a real person.
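A humanized path of this kind can be sketched as points sampled along a cubic Bézier curve with per-step jitter. The curve and jitter parameters below are illustrative; Crawlstack's actual tuning is not published:

```javascript
// Sample `steps + 1` points along a cubic Bézier curve from start to end,
// with randomized control points and small per-point jitter.
function mousePath(start, end, steps = 20) {
  const rand = (min, max) => min + Math.random() * (max - min);
  // Control points pulled off the straight line make the path curve naturally.
  const c1 = { x: start.x + (end.x - start.x) * rand(0.2, 0.4), y: start.y + rand(-80, 80) };
  const c2 = { x: start.x + (end.x - start.x) * rand(0.6, 0.8), y: end.y + rand(-80, 80) };
  const points = [];
  for (let i = 0; i <= steps; i++) {
    const t = i / steps;
    const u = 1 - t;
    // Cubic Bézier: B(t) = u³·P0 + 3u²t·C1 + 3ut²·C2 + t³·P1
    const x = u ** 3 * start.x + 3 * u ** 2 * t * c1.x + 3 * u * t ** 2 * c2.x + t ** 3 * end.x;
    const y = u ** 3 * start.y + 3 * u ** 2 * t * c1.y + 3 * u * t ** 2 * c2.y + t ** 3 * end.y;
    // Jitter every point except the endpoints, so the path still lands on target.
    const jitter = i === 0 || i === steps ? 0 : rand(-2, 2);
    points.push({ x: x + jitter, y: y + jitter });
  }
  return points;
}

const path = mousePath({ x: 0, y: 0 }, { x: 300, y: 150 });
// path[0] is exactly the start; the last point is exactly the end.
```

A real run would replay these points as mouseMoved events with variable delays between them.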

Integration

Webhooks push data in real time

Crawlstack detects new, changed, and deleted items automatically, then pushes the diff to your endpoints in real time. Build live data pipelines without polling.

Integrate with Zapier, n8n, Make, or your own backend.

Webhook delivery log
16:53:53  NEW      POST api.example.com/ingest  200
16:53:56  CHANGED  POST api.example.com/ingest  200
16:53:59  NEW      POST api.example.com/ingest  200
16:54:02  DELETED  POST api.example.com/ingest  200

# Detects new, changed, and deleted items
POST https://api.example.com/ingest
Content-Type: application/json

{ "event": "item.changed", "items": [...] }
MCP conversation: "Extract all product prices from books.toscrape.com"
AI Agents

MCP native from day one

Crawlstack exposes its full API as Model Context Protocol tools. Any AI agent (Claude, GPT, custom) can discover nodes, preview scripts, launch runs, and retrieve data.

Your agent gets a real browser it can control programmatically.
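At the wire level MCP is JSON-RPC 2.0, so an agent's tool invocation is just a message like the one below. The tools/call and tools/list methods are standard MCP; the tool name launch_run and its arguments are hypothetical, since Crawlstack's actual tool names aren't listed here:

```javascript
// A JSON-RPC 2.0 request an MCP client would send to invoke a tool.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "launch_run", // hypothetical tool name
    arguments: { crawlerId: "books", startUrl: "https://books.toscrape.com" },
  },
};

// Agents first discover what's available with a tools/list request:
const discover = { jsonrpc: "2.0", id: 2, method: "tools/list" };
```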

Remote Access

Dashboard from anywhere

The relay server tunnels your local dashboard to a public URL. Monitor runs, view data, and manage crawlers from any device without port forwarding or VPN.

Secure WebSocket tunnel over TLS. No data stored on relay.

localhost:3002 → relay.crawlstack.dev → relay.crawlstack.dev/node/abc123/dashboard
12 crawlers · 3 running · 48.2k items — run progress: page 37 of 50
Task queue: 2,847 tasks — Node 1 active · Node 2 active · Node 3 active
Auto-dedup across all nodes via shared Turso DB
Scale

Distributed runs across nodes

Same crawler script, multiple Docker nodes. Tasks are distributed automatically with shared queues. Built-in deduplication ensures no page is crawled twice, even across nodes.

Zero code changes to go from one node to a cluster. Just add containers.

AI Extraction

Extract with LLMs

Point an LLM at the whole page or just the parts you care about and get structured data back. No brittle CSS selectors that break when the site changes.

Works with OpenAI, Anthropic, Google, or any OpenAI-compatible endpoint. Pass a JSON schema for typed output.

crawler.js
// No selectors — just describe what you need
const { data } = await runner.extract({
  modelId: 'gpt-4o',
  instruction: 'Extract every product with name, price, and availability',
  selector: '.product-grid', // optional: limit context to this element
  format: productSchema      // JSON schema for typed output
})
// data is typed JSON, no parsing needed
await runner.publishItems(data.products)
Semantic search (cosine similarity)
Query: "affordable outdoor camping gear under $50"
0.94 — Budget Camping Tent — 2 Person — Lightweight backpacking tent, waterproof...
0.91 — Compact Sleeping Bag — 3 Season — Rated to 30°F, packs down small...
0.87 — Portable Camp Stove — Butane — Boils water in 3 minutes, folds flat...
RAG-Ready

Semantic search for your RAG

Every item you crawl can be automatically embedded. Query your data by meaning, not keywords. Built-in vector storage and cosine similarity search with no external vector DB needed.

Ships with e5-small for instant embeddings out of the box. Plug in OpenAI, Cohere, or any embedding API for production workloads.
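The similarity score itself is simple math. A minimal sketch of ranking stored embeddings against a query vector — the 2-D toy vectors here stand in for real embeddings from e5-small or an embedding API:

```javascript
// Cosine similarity: dot(a, b) / (|a|·|b|), in [-1, 1].
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank items by similarity to the query embedding, highest first.
function rank(queryVec, items) {
  return items
    .map((item) => ({ ...item, score: cosineSimilarity(queryVec, item.vector) }))
    .sort((x, y) => y.score - x.score);
}

const results = rank([1, 0], [
  { title: "tent", vector: [0.9, 0.1] },
  { title: "laptop", vector: [0.1, 0.9] },
]);
// results[0] is "tent" — its vector points closest to the query's direction.
```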

Ready-to-use Templates

Publish your crawler on GitHub and anyone can use it with a single click

Amundi
Example.com — Example crawler.
Mac-Point.cz — Extract M-series MacBooks (description, params) to find like-new offers using AI filtering.
Sitemap example — Load links from a sitemap.
StartupJobs.cz — Extract job offers for selected filters.

Frequently Asked Questions

Ready to get started?

Plain JavaScript. Built-in SQLite. Webhooks. Docker. No new framework to learn.