Observability
Prometheus metrics, OTel traces, structured logs.
Metrics
Every fetch increments per-tier counters and observes per-tier latency. The default metrics endpoint is :9090/metrics.
scrape_fetches_total{tier,block_reason,ok}
scrape_fetch_latency_seconds{tier} (histogram)
scrape_extracted_total{schema}
scrape_tier_escalations_total{from_tier,to_tier}
scrape_queue_size
scrape_active_browsers
# Cost telemetry — what each tier is actually costing you
scrape_proxy_bytes_total{tier} (residential proxy bandwidth)
scrape_solver_cost_usd_total{kind} (paid CAPTCHA spend)
# Operator-actionable signals
scrape_proxy_auth_failures_total (proxy 407, broken creds)
Per-fetch cost
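The per-tier counters above can be modeled like this; an illustrative, dependency-free sketch (the real service would register these with a Prometheus client library, and record_fetch and its arguments are hypothetical names):

```python
# Illustrative in-process model of the per-tier metrics listed above.
from collections import defaultdict

fetches_total = defaultdict(int)    # key: (tier, block_reason, ok)
fetch_latency = defaultdict(list)   # key: tier -> observed seconds
proxy_bytes_total = defaultdict(int)  # key: tier

def record_fetch(tier, ok, block_reason, elapsed_s, proxy_bytes=0):
    # One fetch touches the counter, the latency histogram, and (when a
    # residential proxy was used) the bandwidth counter for its tier.
    fetches_total[(tier, block_reason, str(ok).lower())] += 1
    fetch_latency[tier].append(elapsed_s)
    proxy_bytes_total[tier] += proxy_bytes

record_fetch("http", True, "", 0.12)
record_fetch("browser", False, "captcha", 4.8, proxy_bytes=51_200)
```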
Each fetch row carries proxy_bytes and solver_cost_usd in addition to status / tier / latency. The dashboard's /jobs/:id view shows them per row; the orchestrator's stats() aggregates them as proxy_bytes_total and solver_cost_usd_total so customers see what a job actually spent before billing.
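A sketch of that roll-up, assuming the row fields named above; the FetchRow shape and the stats() body are illustrative, not the orchestrator's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class FetchRow:
    url: str
    status: int
    tier: str
    latency_ms: float
    proxy_bytes: int = 0
    solver_cost_usd: float = 0.0

def stats(rows: list[FetchRow]) -> dict:
    # Aggregate per-fetch cost fields into the job-level totals the
    # dashboard and billing read.
    return {
        "fetches": len(rows),
        "proxy_bytes_total": sum(r.proxy_bytes for r in rows),
        "solver_cost_usd_total": round(sum(r.solver_cost_usd for r in rows), 4),
    }

rows = [
    FetchRow("https://a.example", 200, "http", 120.0, proxy_bytes=34_000),
    FetchRow("https://b.example", 200, "browser", 4100.0,
             proxy_bytes=512_000, solver_cost_usd=0.003),
]
```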
Proxy auth circuit breaker
Repeated proxy auth failures (HTTP 407) are tracked globally rather than per-session. After auth_failure_threshold (default 5) the ProxyManager raises ProxyAuthBroken and the orchestrator surfaces it on the failing URL. scrape_proxy_auth_failures_total is the corresponding counter — alert on a sustained non-zero rate to catch revoked credentials before a job spends an hour retrying nothing.
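The breaker described above can be sketched as follows; ProxyManager, ProxyAuthBroken, and auth_failure_threshold come from the text, while the reset-on-success behavior and record_response are illustrative assumptions:

```python
class ProxyAuthBroken(RuntimeError):
    """Raised once repeated 407s suggest revoked proxy credentials."""

class ProxyManager:
    def __init__(self, auth_failure_threshold: int = 5):
        self.auth_failure_threshold = auth_failure_threshold
        self._auth_failures = 0  # tracked globally, not per-session

    def record_response(self, status: int) -> None:
        if status == 407:
            self._auth_failures += 1
            if self._auth_failures >= self.auth_failure_threshold:
                raise ProxyAuthBroken(
                    f"{self._auth_failures} consecutive proxy 407s; "
                    "check credentials"
                )
        else:
            # Assumption: any successful proxied response resets the count.
            self._auth_failures = 0
```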
Logs
Structured JSON via structlog. Each log line carries url, tier, block_reason, elapsed_ms where relevant. Pipe to your aggregator of choice.
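The service uses structlog; this dependency-free sketch only shows the resulting log shape (JSON lines carrying url, tier, block_reason, elapsed_ms), with log_event as a hypothetical helper:

```python
import json
import time

def log_event(event: str, **fields) -> str:
    # One structured log line: event name, timestamp, plus whatever
    # context fields the caller attaches (url, tier, block_reason, ...).
    line = {"event": event, "ts": round(time.time(), 3), **fields}
    return json.dumps(line, sort_keys=True)

print(log_event("fetch.done", url="https://a.example", tier="browser",
                block_reason=None, elapsed_ms=4100))
```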
Dashboards
The included Grafana provisioning loads dashboards for per-domain success rate, tier mix, $/1k pages, and block-rate alerts. Open localhost:3001 after docker compose up.
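The block-rate alert could be expressed as a Prometheus rule like this; a sketch assuming the metric names above, with the rule name, 20% threshold, and windows chosen for illustration rather than taken from the shipped provisioning:

```yaml
groups:
  - name: scrape-alerts
    rules:
      - alert: HighBlockRate
        expr: |
          sum(rate(scrape_fetches_total{ok="false"}[10m]))
            / sum(rate(scrape_fetches_total[10m])) > 0.2
        for: 15m
        labels:
          severity: page
        annotations:
          summary: "Scrape block rate above 20% for 15m"
```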