Core concepts
The mental model: tiers, sessions, fingerprints, escalation, and block detection.
Tier
A tier is a transport strategy with a cost/capability tradeoff. Lower tiers are cheaper and faster but get past fewer anti-bot defenses. Higher tiers handle harder targets at higher latency and cost.
- Tier 0 — HTTP: real-Chrome TLS via curl_cffi. ~50ms.
- Tier 1 — Browser: Camoufox (Firefox + C++ patches). ~3s.
- Tier 2 — CAPTCHA: browser + token-injection solver. ~15s.
- Tier 3 — Unblock: managed third-party fallback. ~20s.
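The tier ladder above can be modeled as an ordered enum. This is a minimal sketch, not the project's actual types: the names `Tier`, `TYPICAL_LATENCY_S`, and `next_tier` are assumptions made for illustration, and the latencies are simply the approximate figures from the list.

```python
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    """Hypothetical enum mirroring the four tiers listed above."""
    HTTP = 0      # real-Chrome TLS via curl_cffi
    BROWSER = 1   # Camoufox
    CAPTCHA = 2   # browser + token-injection solver
    UNBLOCK = 3   # managed third-party fallback


# Approximate per-request latency in seconds, copied from the list above.
TYPICAL_LATENCY_S = {
    Tier.HTTP: 0.05,
    Tier.BROWSER: 3.0,
    Tier.CAPTCHA: 15.0,
    Tier.UNBLOCK: 20.0,
}


def next_tier(tier: Tier) -> Optional[Tier]:
    """Return the next, more capable tier, or None at the top of the ladder."""
    return Tier(tier + 1) if tier < Tier.UNBLOCK else None
```

Using an `IntEnum` keeps tiers comparable (`Tier.HTTP < Tier.BROWSER`), which makes "cheapest first" ordering trivial to express.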
Session
A session is the identity tuple (host, fingerprint, proxy_id). Cookies and storage state are scoped to a single session and are never shared across tuples. Reusing one cookie jar from multiple IPs is the canonical "shared account" signal that gets you flagged.
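One way to enforce that scoping is to key every cookie jar by the full tuple, so changing any one element yields a fresh, empty jar. The sketch below assumes hypothetical names (`SessionKey`, `SessionStore`, `jar_for`) and uses a plain dict as a stand-in for a real cookie jar.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SessionKey:
    """Hypothetical identity tuple; frozen so it is hashable."""
    host: str
    fingerprint: str  # assumed to be an ID naming a fingerprint bundle
    proxy_id: str


class SessionStore:
    """Keeps one cookie jar per SessionKey and never shares jars across keys."""

    def __init__(self) -> None:
        self._jars: Dict[SessionKey, Dict[str, str]] = {}

    def jar_for(self, key: SessionKey) -> Dict[str, str]:
        # setdefault creates an empty jar the first time a tuple is seen.
        return self._jars.setdefault(key, {})


store = SessionStore()
jar_a = store.jar_for(SessionKey("example.com", "fp-1", "proxy-a"))
jar_a["sid"] = "abc"

# Same host and fingerprint, different proxy: a separate, empty jar.
jar_b = store.jar_for(SessionKey("example.com", "fp-1", "proxy-b"))
```

Because the proxy is part of the key, cookies set through `proxy-a` can never be replayed through `proxy-b` by accident.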
Fingerprint bundle
A coherent set of identifiers that mimics a real device: User-Agent, screen size, timezone, locale, WebGL vendor, fonts, hardware concurrency. Bundles are sourced from real devices, so every field in a bundle agrees with the others. Mixing fields across bundles (a macOS User-Agent with a Windows screen profile, for example) is the #1 fingerprint mistake.
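Treating a bundle as an immutable unit makes accidental mixing harder, and a cheap sanity check can catch the most obvious mismatches. The sketch below is an illustration only: the `platform` field, the `_UA_TOKENS` table, and `is_coherent` are assumptions, and a real check would compare many more fields than the User-Agent.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class FingerprintBundle:
    """Hypothetical bundle; frozen so fields cannot be swapped one at a time."""
    user_agent: str
    platform: str               # assumed extra field: "macos", "windows", "linux"
    screen: Tuple[int, int]
    timezone: str
    locale: str
    webgl_vendor: str
    fonts: Tuple[str, ...]
    hardware_concurrency: int


# Substring each desktop platform's User-Agent is expected to contain.
_UA_TOKENS = {"macos": "Macintosh", "windows": "Windows NT", "linux": "Linux"}


def is_coherent(bundle: FingerprintBundle) -> bool:
    """Return True if the User-Agent matches the bundle's declared platform."""
    token = _UA_TOKENS.get(bundle.platform)
    return token is not None and token in bundle.user_agent


mac = FingerprintBundle(
    user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
               "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    platform="macos",
    screen=(1440, 900),
    timezone="America/Los_Angeles",
    locale="en-US",
    webgl_vendor="Apple",
    fonts=("Helvetica Neue", "Menlo"),
    hardware_concurrency=8,
)

# Simulate the mixing mistake: keep the macOS User-Agent, claim Windows.
mixed = replace(mac, platform="windows")
```

Here `is_coherent(mac)` passes while `is_coherent(mixed)` fails, which is the kind of mismatch the paragraph above warns about.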
Escalation
The router runs every URL at the cheapest tier first. If a block is detected (status code, challenge HTML signature, soft-200 with empty body), it promotes the URL to the next tier and re-queues. ~80% of requests clear at Tier 0–1.
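The promote-and-re-queue loop can be sketched in a few lines. This is an assumption-laden illustration rather than the router's real code: `escalate`, `fetch`, and `is_blocked` are hypothetical names, and the callables are supplied by the caller so the sketch stays independent of any particular transport.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, Tuple

MAX_TIER = 3  # Tier 3 is the last rung of the ladder


def escalate(
    urls: Iterable[str],
    fetch: Callable[[str, int], str],
    is_blocked: Callable[[str], bool],
) -> Dict[str, Tuple[int, str]]:
    """Run every URL at tier 0 first, promoting it one tier per detected block."""
    queue: Deque[Tuple[str, int]] = deque((url, 0) for url in urls)
    results: Dict[str, Tuple[int, str]] = {}
    while queue:
        url, tier = queue.popleft()
        response = fetch(url, tier)
        if is_blocked(response) and tier < MAX_TIER:
            # Blocked with tiers left: re-queue at the next tier.
            queue.append((url, tier + 1))
        else:
            # Either cleared, or still blocked at the top tier; record as-is.
            results[url] = (tier, response)
    return results


# Toy transport: each URL "clears" only at or above its required tier.
required = {"a": 0, "b": 2, "c": 9}
out = escalate(
    ["a", "b", "c"],
    fetch=lambda url, tier: "ok" if tier >= required[url] else "blocked",
    is_blocked=lambda response: response == "blocked",
)
```

In this toy run `a` clears at tier 0, `b` is promoted twice and clears at tier 2, and `c` is recorded as still blocked at tier 3 because there is nowhere left to promote it.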
Block detection
Every response passes through a cheap regex/header pipeline that flags Cloudflare challenges, DataDome, PerimeterX, hCaptcha/reCAPTCHA presence, 4xx/5xx, rate limiting, and undersized HTML bodies. The signal determines the escalation target.
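A pipeline of this kind might look like the sketch below. Everything in it is illustrative: the function name `detect_block`, the signal names, the 512-byte threshold, and the small set of patterns are assumptions, not the project's actual rules, and DataDome and PerimeterX signatures are omitted rather than guessed at.

```python
import re
from typing import Mapping, Optional

MIN_BODY_BYTES = 512  # hypothetical cutoff for an "undersized" HTML body

# Illustrative body signatures only; a real rule set would be much larger.
_BODY_SIGNATURES = [
    ("captcha", re.compile(r"g-recaptcha|h-captcha", re.IGNORECASE)),
    ("cloudflare_challenge", re.compile(r"<title>\s*Just a moment", re.IGNORECASE)),
]


def detect_block(status: int, headers: Mapping[str, str], body: str) -> Optional[str]:
    """Return a signal name if the response looks blocked, else None."""
    lowered = {k.lower(): v.lower() for k, v in headers.items()}
    if status == 429:
        return "rate_limited"
    # Cloudflare marks challenge responses with this header.
    if lowered.get("cf-mitigated") == "challenge":
        return "cloudflare_challenge"
    # Check body signatures before generic status codes, since challenge
    # pages are often served with a 403.
    for name, pattern in _BODY_SIGNATURES:
        if pattern.search(body):
            return name
    if status >= 400:
        return "http_error"
    if len(body.encode("utf-8")) < MIN_BODY_BYTES:
        return "undersized_body"  # covers the soft-200 with empty body case
    return None


# Hypothetical mapping from signal to the tier the URL is promoted to.
ESCALATION_TARGET = {
    "rate_limited": 1,
    "http_error": 1,
    "undersized_body": 1,
    "cloudflare_challenge": 1,
    "captcha": 2,
}
```

Ordering the checks from most specific to most generic lets a CAPTCHA page served with a 403 be reported as `captcha` rather than as a plain `http_error`, so it can be routed straight to a tier that can solve it.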