/ DISPATCHES
Notes from the trenches.
Long-form posts on anti-bot bypass, scraping at scale, and shipping data pipelines.
001
ENGINEERING·Apr 24, 2026·8 MIN
Why your scraper is still getting flagged at the TCP layer
JA4+ killed the static fingerprint hash. Here's what replaced it and how curl-impersonate keeps up — TLS 1.3, extension permutation, and HTTP/2 frame ordering explained.
002
PATTERNS·Apr 17, 2026·6 MIN
Tier routing: the mental model that cuts scraping costs by 80%
Stop running every URL through a headless browser. Tier routing (start cheap, escalate only when blocked) cuts bandwidth and CPU use by 60× at scale.
003
AI·Apr 10, 2026·5 MIN
Schema-driven extraction with Claude — and a 90% prompt cache
How we cut LLM extraction cost from $30 per 1k pages to $3 with one Anthropic feature. A practical guide to prompt caching for structured data extraction.
004
COMPLIANCE·Apr 3, 2026·10 MIN
Scraping ethically in 2026 (post-hiQ, post-AI Act)
What changed legally, what didn't, and what defaults every scraper should ship with. A field guide to robots.txt, GDPR, the EU AI Act, and the post-Clearview landscape.