Crawlstack - Browser Infrastructure for AI Agents

If you're building web scrapers, AI agents, or browser automation at any real scale, you've probably run into the infrastructure problem: managing browser instances is painful. Chrome eats RAM, headless browsers leak memory, and running dozens of concurrent sessions requires careful orchestration.

Browserless and Browserbase solve this by hosting Chrome for you. You connect to their managed browsers via WebSocket and control them with Puppeteer or Playwright. It's a clean model — outsource the browser infrastructure, keep your automation code.

Crawlstack takes a fundamentally different approach. Instead of giving you a remote browser to control, it runs your scraping logic inside a browser you own — either your Chrome installation (via extension) or a stealth-hardened Chromium container (via Docker). The browser isn't infrastructure you connect to; it's the platform your code runs on.

These are meaningfully different architectures with different tradeoffs. Here's an honest comparison.

Architecture

Browserless provides hosted Chrome instances accessible via WebSocket. You write Puppeteer or Playwright code locally, but instead of launching a local browser, you connect to Browserless's servers:

Your Script ──WebSocket──> Browserless Cloud ──> Chrome Instance ──> Target Site

They also offer REST APIs for common operations (screenshots, PDFs, content extraction) and a self-hosted Docker option. Pricing is based on browser-minutes — how long your Chrome sessions run.

Browserbase is architecturally similar but aimed more squarely at AI agents. They provide hosted Chromium sessions with built-in stealth, session management, and debugging tools, and they pitch the service as a way to give autonomous agents reliable browser access. They also integrate with Stagehand for higher-level browser interaction.

Your Script ──CDP──> Browserbase Cloud ──> Chromium Session ──> Target Site

Crawlstack doesn't give you a browser to connect to. It runs your scraping code inside the browser itself:

Your Browser/Docker ──> Tab Opens Target ──> Your Script Runs In-Page ──> Data Published

There's no WebSocket connection, no remote browser, no network hop between your code and the page. Your script has direct DOM access in the page context.

Feature Comparison

Feature	Browserless	Browserbase	Crawlstack
Architecture	Hosted Chrome (WebSocket)	Hosted Chromium (CDP)	Self-hosted browser
Cost	~$0.01–0.03/min	~$0.01–0.03/min	Free
Client API	Puppeteer/Playwright WS	Playwright Connect	REST + MCP
Stealth	Stealth plugins	Built-in stealth mode	Real browser (native)
Session persistence	Limited	Session management	Full browser profile
Data pipeline	None	None	Built-in (storage, dedup, webhooks)
Debugging	Remote DevTools	Live view + session recording	DevTools + flight recorder
AI agent support	Via Puppeteer/Playwright	Session-based, Stagehand integration	18 MCP tools
Self-hosted option	Yes (Docker)	No	Yes (Docker/extension)
Scheduling	None	None	Built-in
Distributed	Pay for more sessions	Pay for more sessions	Multi-node clustering

Cost: Per-Minute vs. Free

This is the most straightforward difference.

Browserless charges per browser-minute. Their pricing starts around $0.01–0.03 per minute depending on plan and concurrency. A scraping job that takes 30 seconds costs about $0.005–0.015. That sounds cheap until you're running thousands of sessions daily — 10,000 sessions averaging 30 seconds each would cost $50–150/day, or $1,500–4,500/month.
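
The arithmetic above is easy to reproduce for your own volumes. The helper below is a sketch: `estimateHostedCost` and its parameter names are made up for illustration, and the rates are the approximate figures quoted here, not current list prices.

```javascript
// Estimate hosted-browser spend under per-minute pricing.
// Function and parameter names are illustrative; rates are the rough figures from the text.
function estimateHostedCost({ sessionsPerDay, avgSessionSeconds, ratePerMinute, daysPerMonth = 30 }) {
  const minutesPerDay = (sessionsPerDay * avgSessionSeconds) / 60;
  const daily = minutesPerDay * ratePerMinute;
  return { minutesPerDay, daily, monthly: daily * daysPerMonth };
}

const low = estimateHostedCost({ sessionsPerDay: 10000, avgSessionSeconds: 30, ratePerMinute: 0.01 });
const high = estimateHostedCost({ sessionsPerDay: 10000, avgSessionSeconds: 30, ratePerMinute: 0.03 });

console.log(low.daily.toFixed(2), high.daily.toFixed(2));     // 50.00 150.00
console.log(low.monthly.toFixed(0), high.monthly.toFixed(0)); // 1500 4500
```

Swapping in your own session length matters more than the exact rate: a job that holds a browser open for three minutes costs six times as much as the 30-second case.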

Browserbase has similar per-minute pricing. Their free tier includes limited browser-minutes, with paid plans scaling based on usage and concurrency.

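Crawlstack is free. You run it on your own hardware. A VPS starting at around $5/month can run Crawlstack in Docker. A workload that would cost thousands per month on Browserless or Browserbase will likely need a larger machine or several nodes, but the bill is still a fixed hardware cost rather than a per-minute charge. The tradeoff: you manage the infrastructure.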

For low-volume use cases (a few hundred sessions per month), the hosted services are convenient and affordable. At scale, the cost difference is enormous.
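
One way to locate the crossover is to compute the monthly session count at which per-minute pricing equals a fixed hardware bill. This is a rough sketch with hypothetical names. It ignores your own operations time and assumes the fixed-cost machine can actually carry the load.

```javascript
// Sessions per month at which per-minute pricing equals a fixed monthly cost.
// Names are illustrative; operator time and hardware capacity are not modelled.
function breakEvenSessions({ fixedMonthlyCost, ratePerMinute, avgSessionSeconds }) {
  const costPerSession = ratePerMinute * (avgSessionSeconds / 60);
  return fixedMonthlyCost / costPerSession;
}

const atLowRate = breakEvenSessions({ fixedMonthlyCost: 5, ratePerMinute: 0.01, avgSessionSeconds: 30 });
const atHighRate = breakEvenSessions({ fixedMonthlyCost: 5, ratePerMinute: 0.03, avgSessionSeconds: 30 });

console.log(Math.round(atLowRate), Math.round(atHighRate)); // 1000 333
```

With the figures used in this article, the crossover sits somewhere between a few hundred and a thousand 30-second sessions per month, which is consistent with hosted services being the cheaper option at low volume.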

Stealth and Detection

Browserless runs standard Chrome instances. They offer some stealth configurations, but fundamentally you're connecting to a headless Chrome in a datacenter. Stealth plugins (puppeteer-extra-plugin-stealth) help, but sophisticated anti-bot systems can still detect the headless environment, datacenter IPs, and CDP-driven interaction patterns.

Browserbase has invested more in stealth. They offer a built-in stealth mode, fingerprint management, and residential proxy integration. Their approach is more sophisticated than Browserless for anti-bot evasion, but it's still a managed headless environment that determined bot detection systems can identify.

Crawlstack in extension mode uses your real browser: the same Chrome instance with your real fingerprint, your real IP, and your real browsing history. There is no headless flag, datacenter IP, or CDP connection to detect, because nothing is automated at the browser level. Your scripts run as page JavaScript, much like snippets a user runs from the DevTools console.

In Docker mode, Crawlstack runs Cloakbrowser — a stealth-hardened Chromium that passes common fingerprint checks. Combined with the built-in Cloudflare Turnstile solver and human simulation (runner.humanClick() with Bézier curves, runner.humanScrollInView()), it's significantly more resistant to detection than either hosted option.
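
To make the Bézier idea concrete, here is a minimal sketch of how a curved pointer path can be generated. It is not Crawlstack's implementation of runner.humanClick(). The function name, the control-point offsets, and the injectable `rand` parameter are all assumptions made for illustration.

```javascript
// Points along a cubic Bézier curve from `start` to `end`.
// Illustrative only: control-point placement is an arbitrary choice, and
// `rand` is injectable so the path can be made deterministic.
function bezierMousePath(start, end, steps = 20, rand = Math.random) {
  const dx = end.x - start.x;
  const dy = end.y - start.y;
  // Perpendicular offsets bend the path away from a straight line.
  const bend = () => (rand() - 0.5) * 0.4;
  const b1 = bend();
  const b2 = bend();
  const c1 = { x: start.x + dx * 0.3 - dy * b1, y: start.y + dy * 0.3 + dx * b1 };
  const c2 = { x: start.x + dx * 0.7 - dy * b2, y: start.y + dy * 0.7 + dx * b2 };

  const points = [];
  for (let i = 0; i <= steps; i++) {
    const t = i / steps;
    const u = 1 - t;
    points.push({
      x: u * u * u * start.x + 3 * u * u * t * c1.x + 3 * u * t * t * c2.x + t * t * t * end.x,
      y: u * u * u * start.y + 3 * u * u * t * c1.y + 3 * u * t * t * c2.y + t * t * t * end.y,
    });
  }
  return points;
}

const path = bezierMousePath({ x: 10, y: 10 }, { x: 300, y: 180 }, 25);
console.log(path.length); // 26
```

A real implementation would also vary the timing between points. A straight, evenly paced line from A to B is one of the simpler automation signals to spot.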

Session and Auth Handling

This is a practical pain point that's easy to underestimate.

Browserless gives you fresh browser sessions by default. For authenticated scraping, you need to either inject cookies manually or maintain persistent sessions (limited support). Every new session starts clean — no login state, no cookies, no localStorage.

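const puppeteer = require('puppeteer');

// savedCookies: cookie objects exported from an earlier logged-in session
async function scrapeDashboard(savedCookies) {
  const browser = await puppeteer.connect({
    browserWSEndpoint: 'wss://chrome.browserless.io?token=YOUR_TOKEN'
  });
  try {
    const page = await browser.newPage();
    await page.goto('https://example.com/dashboard');

    // Must manually inject cookies for auth, then reload so they take effect
    await page.setCookie(...savedCookies);
    await page.reload();

    return await page.evaluate(() =>
      document.querySelector('.dashboard-data')?.innerText
    );
  } finally {
    // Always release the remote session, even if the scrape throws
    await browser.close();
  }
}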
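
Saved cookies go stale, so cookie injection usually needs a freshness check before the array is passed to page.setCookie(). The helper below is a sketch with a hypothetical name. It relies on Puppeteer reporting `expires` as Unix seconds, with -1 (or no value) for session cookies.

```javascript
// Drop cookies whose `expires` timestamp (Unix seconds) has already passed.
// Session cookies (`expires` of -1 or undefined) are kept.
function usableCookies(cookies, nowMs = Date.now()) {
  const nowSeconds = nowMs / 1000;
  return cookies.filter(
    (c) => c.expires === undefined || c.expires === -1 || c.expires > nowSeconds
  );
}

const exampleCookies = [
  { name: 'sid', value: 'a', expires: -1 },          // session cookie
  { name: 'old', value: 'b', expires: 1600000000 },  // expired in 2020
  { name: 'auth', value: 'c', expires: 1900000000 }, // valid until 2030
  { name: 'pref', value: 'd' },                      // no expiry recorded
];
const fresh = usableCookies(exampleCookies, Date.UTC(2024, 0, 1));
console.log(fresh.map((c) => c.name).join(', ')); // sid, auth, pref
```

If the auth cookie is among the ones dropped, the script has to log in again, which is exactly the kind of expiration handling described above.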

Browserbase has better session management. You can create named sessions that persist state between connections, and their API supports session resumption. This makes authenticated scraping easier, but you're still managing sessions through their API.

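const { chromium } = require('playwright');

async function scrapeDashboard() {
  const browser = await chromium.connectOverCDP(
    'wss://connect.browserbase.com?apiKey=YOUR_KEY&sessionId=SESSION_ID'
  );
  try {
    // Reuse the session's existing context and page so stored state carries over
    const context = browser.contexts()[0];
    const page = context.pages()[0] ?? await context.newPage();

    await page.goto('https://example.com/dashboard');
    return await page.evaluate(() =>
      document.querySelector('.dashboard-data')?.innerText
    );
  } finally {
    // Disconnect from the remote session
    await browser.close();
  }
}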

Crawlstack uses your actual browser sessions. If you're logged into a site in Chrome, your scraper has those sessions automatically — cookies, localStorage, IndexedDB, everything. No cookie extraction, no session APIs, no expiration handling.

// Already in your authenticated browser session
await runner.onLoad();

const data = document.querySelector('.dashboard-data')?.innerText;
await runner.publishItems([{
  id: 'dashboard-snapshot',
  version: new Date().toISOString().slice(0, 10),
  data: { content: data, capturedAt: new Date().toISOString() }
}]);

For scraping internal dashboards, admin panels, or any site where you have an active login, Crawlstack eliminates an entire category of session management complexity.

Data Pipeline

Browserless and Browserbase are browser infrastructure, not scraping platforms. They give you a browser to control — what you do with the data is your responsibility. You need to build or integrate:

Data storage (database, file system, cloud storage)
Deduplication logic
Scheduling (cron, Airflow, custom)
Webhook delivery
Error handling and retry logic
Monitoring
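
As one example of the glue code this implies, retry logic usually starts with a backoff schedule. The sketch below uses a hypothetical helper name and arbitrary defaults to compute capped exponential delays. A production version would typically add random jitter as well.

```javascript
// Capped exponential backoff: base, 2*base, 4*base, ... never exceeding capMs.
// Defaults are arbitrary illustrative values.
function backoffDelays(retries, baseMs = 500, capMs = 30000) {
  return Array.from({ length: retries }, (_, i) => Math.min(capMs, baseMs * 2 ** i));
}

console.log(backoffDelays(5).join(', ')); // 500, 1000, 2000, 4000, 8000
console.log(backoffDelays(8).slice(-2).join(', ')); // 30000, 30000
```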

Crawlstack includes the full pipeline:

Storage: Built-in SQLite database with optional libSQL/Turso upgrade for cloud access
Deduplication: Automatic dedup with configurable change frequency and item versioning
Scheduling: Cron-style scheduling via UI or REST API
Webhooks: Deliver items to HTTP endpoints as they're extracted
Flight recorder: Screencast, DOM snapshots, and event recording for debugging
REST API: 40+ endpoints for managing crawlers, runs, and data
MCP tools: 18 tools for AI-agent-driven development

The pipeline difference is significant. With Browserless/Browserbase, the browser session is one component in a system you have to build. With Crawlstack, the browser session is embedded in a complete system.
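
To show what deduplication with item versioning can mean in practice, here is a minimal in-memory sketch keyed on the `id` and `version` fields used in the publishItems example earlier. The semantics are an assumption for illustration: this article does not document how Crawlstack resolves duplicates internally, and `DedupStore` is a hypothetical name.

```javascript
// In-memory dedup keyed by item id. An item is stored only if its id is new
// or its version differs from the stored one. Assumed semantics, not Crawlstack's code.
class DedupStore {
  constructor() {
    this.items = new Map();
  }

  publish(item) {
    const existing = this.items.get(item.id);
    if (existing && existing.version === item.version) return 'duplicate';
    this.items.set(item.id, item);
    return existing ? 'updated' : 'created';
  }
}

const store = new DedupStore();
const results = [
  store.publish({ id: 'dashboard-snapshot', version: '2024-05-01', data: { content: 'a' } }),
  store.publish({ id: 'dashboard-snapshot', version: '2024-05-01', data: { content: 'a' } }),
  store.publish({ id: 'dashboard-snapshot', version: '2024-05-02', data: { content: 'b' } }),
];
console.log(results.join(', ')); // created, duplicate, updated
```

Under this model, using the date as the version yields at most one stored snapshot per day, no matter how often the crawler runs.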

AI Agent Support

All three tools are positioning themselves for the AI agent era, but in different ways.

Browserless is accessible to agents through Puppeteer/Playwright — the agent writes automation code and connects to a Browserless session. It works, but there's no agent-specific abstraction.

Browserbase has invested more here. Their Stagehand integration provides higher-level browser interaction (natural language selectors, AI-driven navigation), and their session management is designed for autonomous agents that need to maintain state across multiple interactions.

Crawlstack exposes 18 MCP (Model Context Protocol) tools specifically designed for AI agents:

list_nodes: Discover available browser nodes
extension_preview_script: Test a scraping script without saving it
extension_upsert_crawler: Create or update a crawler
extension_get_screenshot: Capture what the browser sees
extension_get_run_logs: Inspect execution logs

The workflow: an agent discovers available nodes, writes a scraping script, previews it with keep_alive: true to inspect results via screenshots and logs, iterates until the script works, then saves it as a persistent crawler. This tight feedback loop — write, preview, inspect, iterate, save — is purpose-built for how AI agents work.
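
The control flow of that loop can be sketched in a few lines. The `preview`, `revise`, and `save` callbacks here are hypothetical stand-ins for calls such as extension_preview_script and extension_upsert_crawler. Their arguments and return shapes are assumptions, and real MCP calls would be asynchronous. The sketch is synchronous only to keep the loop easy to read.

```javascript
// Preview, inspect, revise, and save, with the tools injected as callbacks.
// Callback names and result shapes are hypothetical.
function runPreviewLoop(tools, script, maxIterations = 5) {
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const result = tools.preview(script);
    if (result.ok) {
      tools.save(script);
      return { saved: true, iterations: iteration, script };
    }
    // Feed logs or screenshots from the failed preview back into the next draft.
    script = tools.revise(script, result);
  }
  return { saved: false, iterations: maxIterations, script };
}

// Stub tools: the preview only succeeds once the selector is corrected.
const savedScripts = [];
const stubTools = {
  preview: (s) =>
    s.includes('.dashboard-data')
      ? { ok: true, items: 1 }
      : { ok: false, log: 'selector matched 0 elements' },
  revise: (s) => s.replace('.dashboard', '.dashboard-data'),
  save: (s) => savedScripts.push(s),
};

const outcome = runPreviewLoop(stubTools, "document.querySelector('.dashboard')");
console.log(outcome.saved, outcome.iterations); // true 2
```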

Debugging

Browserless offers remote DevTools access: you can connect Chrome DevTools to a running session for live debugging. This is useful, but you have to catch the session while it's running.

Browserbase has a live viewer and session recording. You can watch sessions in real-time and replay recorded sessions. Their debugging experience is notably polished.

Crawlstack offers two debugging approaches:

DevTools: Since your script runs in a real browser tab, you have full Chrome DevTools access — breakpoints, network inspection, console, everything.
Flight recorder: Automatic screencast recording, DOM snapshots at each step, and full event recording. When a run fails, you can replay exactly what happened — what the page looked like, what the script did, and where it went wrong.

The flight recorder is particularly valuable for diagnosing anti-bot blocks and timing issues — you can see the exact state of the page when the script encountered a problem.

Self-Hosting

Browserless offers a Docker image for self-hosting. You run their server and connect to it locally. This eliminates the per-minute cost but requires managing the Browserless server infrastructure.

Browserbase does not offer a self-hosted option. It's a cloud service only.

Crawlstack is self-hosted by default. The Chrome extension runs on your machine. The Docker deployment uses Cloakbrowser (stealth-hardened Chromium) and can run on any machine that supports Docker. Multiple Docker nodes can connect to the relay server for distributed operation.

If data sovereignty, privacy, or cost control are priorities, Crawlstack and Browserless (self-hosted) are the options. Browserbase requires trusting their cloud.

Scalability

Browserless and Browserbase scale by allocating more concurrent browser sessions. You pay more, they spin up more Chrome instances. It's a clean scaling model with no infrastructure to manage, but costs scale linearly with usage.

Crawlstack scales by adding browser nodes. Deploy Docker containers on additional machines, connect them to the relay server, and the cluster distributes work automatically. Each node runs an independent browser with full stealth. You manage the infrastructure, but there's no per-session ceiling — costs scale with hardware, not usage.
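
As a mental model for distributing work across nodes, consider the simplest possible policy, round-robin assignment. This is not a description of Crawlstack's scheduler, which this article does not detail. The function name and node ids are hypothetical.

```javascript
// Spread URLs across nodes in round-robin order.
// Illustrative policy only; a real scheduler would also account for node load and failures.
function assignRoundRobin(urls, nodeIds) {
  if (nodeIds.length === 0) throw new Error('at least one node is required');
  const plan = new Map(nodeIds.map((id) => [id, []]));
  urls.forEach((url, i) => plan.get(nodeIds[i % nodeIds.length]).push(url));
  return plan;
}

const plan = assignRoundRobin(
  ['/page/1', '/page/2', '/page/3', '/page/4', '/page/5'],
  ['node-1', 'node-2']
);
console.log(plan.get('node-1').join(' ')); // /page/1 /page/3 /page/5
console.log(plan.get('node-2').join(' ')); // /page/2 /page/4
```

Adding a third node id to the list is all it takes to rebalance under this policy, which is the sense in which capacity scales with hardware rather than with a per-session plan.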

For bursty workloads (100 sessions now, none an hour later), the hosted services are more cost-efficient because you pay nothing while idle. For sustained high-volume workloads, Crawlstack's fixed-cost infrastructure is dramatically cheaper.

When to Choose Browserless

You need managed browser infrastructure without operational overhead
You're using Puppeteer/Playwright and want to offload browser management
You need their REST APIs for screenshots, PDFs, and content extraction
Your volume fits within their pricing
You want a self-hosted option for browser infrastructure

When to Choose Browserbase

You're building AI agents that need reliable browser access
You want built-in stealth and session management
You need Stagehand integration for AI-driven browser interaction
You want polished debugging tools (live view, session replay)
You prefer a fully managed cloud service

When to Choose Crawlstack

You want to avoid per-minute browser costs
You need to scrape authenticated content using existing browser sessions
You want a full scraping pipeline (storage, dedup, scheduling, webhooks), not just browser access
You need the highest level of stealth (real browser fingerprint)
You want AI-agent integration via MCP tools with a preview-iterate-save workflow
You need to self-host everything, including data storage
You're running sustained high-volume scraping where fixed infrastructure costs beat per-minute pricing

Honest Tradeoffs

Browserless and Browserbase solve a real problem well: managing Chrome at scale is hard, and they make it easy. If you're happy with Puppeteer or Playwright and just need someone else to run the browsers, they're excellent choices.

Crawlstack is a different kind of tool. It's not "hosted Chrome" — it's a scraping platform that happens to use a browser as its runtime. The trade is more initial setup (installing an extension or deploying Docker) for more capability (full pipeline, native stealth, AI integration) and lower cost (free).

The choice often comes down to: do you want browser infrastructure, or do you want a scraping platform? They're related but different problems, and the right answer depends on what you're building.

Crawlstack is self-hosted scraping infrastructure that runs inside your browser or Docker. Get started for free.

Browserless vs. Browserbase vs. Crawlstack: Browser Infrastructure Compared