Scaling
Concurrency tuning, queue partitioning, cost controls.
Concurrency tuning
Two knobs matter: CRAWL_MAX_CONCURRENCY (total in-flight requests) and CRAWL_PER_HOST_CONCURRENCY (per-domain cap). Start at 16 total and 2 per host; raise the per-host cap only when target sites tolerate it.
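To make the interaction between the two caps concrete, here is a minimal sketch of a global limit combined with a per-host limit using asyncio semaphores. The `ConcurrencyGate` class, `demo`, and `fake_fetch` are hypothetical names for illustration, not the crawler's actual implementation; only the two setting names and the 16 / 2 starting values come from the text above.

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlsplit

CRAWL_MAX_CONCURRENCY = 16      # total in-flight requests
CRAWL_PER_HOST_CONCURRENCY = 2  # in-flight requests per domain


class ConcurrencyGate:
    """Hypothetical helper: a global cap plus a per-host cap."""

    def __init__(self, max_total, max_per_host):
        self._total = asyncio.Semaphore(max_total)
        self._max_per_host = max_per_host
        self._per_host = {}

    def _host_sem(self, url):
        host = urlsplit(url).hostname or ""
        if host not in self._per_host:
            self._per_host[host] = asyncio.Semaphore(self._max_per_host)
        return self._per_host[host]

    async def run(self, url, fetch):
        # Take the per-host slot first so requests queued behind a busy
        # host do not occupy global slots while they wait.
        async with self._host_sem(url):
            async with self._total:
                return await fetch(url)


async def demo():
    """Fire 10 requests at one host and record the peak in-flight count."""
    gate = ConcurrencyGate(CRAWL_MAX_CONCURRENCY, CRAWL_PER_HOST_CONCURRENCY)
    in_flight = defaultdict(int)
    peak = defaultdict(int)

    async def fake_fetch(url):
        host = urlsplit(url).hostname
        in_flight[host] += 1
        peak[host] = max(peak[host], in_flight[host])
        await asyncio.sleep(0.01)  # stand-in for network I/O
        in_flight[host] -= 1
        return url

    urls = [f"https://example.com/{i}" for i in range(10)]
    await asyncio.gather(*(gate.run(u, fake_fetch) for u in urls))
    return dict(peak)
```

Running `asyncio.run(demo())` should show that no host ever exceeds the per-host cap, even though the global cap would allow all ten requests at once.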
Queue partitioning
Redis Streams support consumer groups. Partition by host hash so the same domain always hits the same worker — keeps the per-host limiter accurate and warms HTTP/2 connections.
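A consumer group on its own hands each entry to whichever consumer reads next, so host affinity needs an explicit routing step. One possible approach, sketched below, is to hash the hostname to a partition index and use one stream per partition. `NUM_PARTITIONS`, `partition_for`, `stream_key`, and the `crawl:queue:` key prefix are all assumptions made for illustration.

```python
import hashlib
from urllib.parse import urlsplit

NUM_PARTITIONS = 8  # hypothetical: one stream (and worker) per partition


def partition_for(url, num_partitions=NUM_PARTITIONS):
    """Map a URL to a partition using a stable hash of its hostname."""
    host = urlsplit(url).hostname or ""
    # hashlib gives the same value in every process; the built-in hash()
    # is salted per process for strings, so it would break affinity.
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def stream_key(url):
    """Hypothetical Redis stream name for the URL's partition."""
    return f"crawl:queue:{partition_for(url)}"
```

Every URL on the same host maps to the same key, so a producer can `XADD` to `stream_key(url)` and each worker can read only its own stream. Changing `NUM_PARTITIONS` remaps hosts, so in this sketch it should stay fixed while jobs are in flight.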
Cost controls
- Cap max_tier per job to avoid surprise CAPTCHA spend
- Set per-host min_delay so bursts do not trip rate limits
- Use per-site selectors instead of LLM for known sites — 100× cheaper
- Enable prompt caching on Claude — 90% input-token savings
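As an illustration of the per-host min_delay control in the list above, the sketch below tracks the earliest time each host may be requested again and tells the caller how long to wait. `HostPacer` and `reserve` are hypothetical names; the crawler's real limiter may work differently.

```python
import time
from typing import Optional


class HostPacer:
    """Hypothetical helper: spaces requests to a host min_delay apart."""

    def __init__(self, min_delay: float) -> None:
        self.min_delay = min_delay
        self._next_ok = {}  # host -> earliest allowed start time

    def reserve(self, host: str, now: Optional[float] = None) -> float:
        """Reserve the next slot for host; return seconds to wait first."""
        now = time.monotonic() if now is None else now
        start = max(now, self._next_ok.get(host, now))
        # The following request to this host may start min_delay later.
        self._next_ok[host] = start + self.min_delay
        return start - now
```

A caller would do `await asyncio.sleep(pacer.reserve(host))` before each fetch. Reserving slots up front means concurrent requests to one host queue behind each other instead of all firing the moment the delay expires.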
Measured benchmarks
Numbers from the production-shape soak harness against a 15-target mixed-vendor URL list (baselines, Cloudflare, DataDome, Akamai, Reddit interstitial). Single host, 6 concurrent workers. Reproduce with scripts/soak.py. URLs blocked by robots.txt are excluded from the success-rate denominator.
| Metric | 8.6-min run | 55-min run |
|---|---|---|
| Fetches | 75 | 450 |
| Bandwidth | 13.2 MB | 70.8 MB |
| Success rate | 85% | 89% |
| Tier 0 p50 / p95 | 1.9 s / 2.8 s | 2.0 s / 3.0 s |
| Tier 1 p50 / p95 | 91 s / 102 s | 75 s / 116 s |
| RSS peak / median / end | 3.75 / 2.96 / 2.79 GB | 4.35 / 3.20 / 1.60 GB |
What this tells you about scaling:
- No memory leak. The 55-minute run ended at 1.60 GB, well below its 3.20 GB median, which means the browser pool actively releases instances when concurrency drops. The 8.6-minute run ended at 2.79 GB because it never had enough idle time to recover.
- Tier 0 latency stays flat. p50 and p95 moved less than 8% across a 6× longer run. p99 did grow (3.1 s → 10.6 s) — under sustained load you see more rare slow proxy hops; budget for that in your timeouts.
- Tier 1 p50 got faster (91 s → 75 s, about -18%) because the browser pool keeps instances warm longer in a sustained run. Tier 1 p95 grew (102 s → 116 s) because a longer run cycles through more cold starts.
- Per-Camoufox memory. ~700–900 MB resident per warm browser. The pool sizes itself to max_concurrency / 4; cap concurrency to bound total RAM.
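The memory rule in the last bullet can be turned into a rough budget calculation. The sketch below assumes the pool rounds max_concurrency / 4 up and keeps at least one browser, and treats 1 GB as 1000 MB; neither detail is confirmed by the text above, and the function names are hypothetical. It covers warm browser instances only, not the rest of the process.

```python
import math

MB_PER_BROWSER = (700, 900)  # warm Camoufox resident range from the soak run


def browser_ram_gb(max_concurrency):
    """Estimate (low, high) resident GB for the browser pool."""
    # Assumption: pool size is max_concurrency / 4, rounded up, minimum 1.
    pool_size = max(1, math.ceil(max_concurrency / 4))
    low, high = (pool_size * mb / 1000 for mb in MB_PER_BROWSER)
    return low, high


def max_concurrency_for(ram_gb):
    """Largest concurrency cap whose worst-case pool fits in ram_gb."""
    browsers = int(ram_gb * 1000 // MB_PER_BROWSER[1])
    return browsers * 4


print(browser_ram_gb(16))      # (2.8, 3.6)
print(max_concurrency_for(8))  # 32
```

Under these assumptions the default concurrency of 16 implies four warm browsers and roughly 2.8 to 3.6 GB for the pool, and an 8 GB browser budget supports a cap of about 32.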
When to add Tier 3
If your target list contains sites with behavioral scoring (PerimeterX, advanced Akamai, Kasada), the free FlareSolverr Tier 3 will fail against them. Switch to UNBLOCK_PROVIDER=brightdata or UNBLOCK_PROVIDER=scrapfly; both run real browser farms purpose-built for this. Cost is per-success only; Bright Data does not bill failed requests.