Output formats
Results export as JSON, CSV, or NDJSON. Supported sinks: the local filesystem, Postgres, and S3-compatible object storage.
From the dashboard
Each job has Download JSON and Download CSV buttons. The CSV export builds its header from the union of columns across all extracted rows, so heterogeneous schemas still flatten into a single table.
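The column-union behavior can be sketched as follows. This is a minimal illustration, not the actual export code; `rows` stands in for the extracted records:

```python
import csv
import io

def rows_to_csv(rows):
    """Flatten heterogeneous dicts into one CSV using the union of all keys."""
    # Build the header as the union of every row's keys, in first-seen order.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buf = io.StringIO()
    # restval="" fills cells for columns a given row doesn't have.
    writer = csv.DictWriter(buf, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

rows = [
    {"title": "Widget", "price": "9.99"},
    {"title": "Gadget", "sku": "G-42"},  # different schema, still one flat table
]
print(rows_to_csv(rows))
```

Rows that lack a column simply get an empty cell, so no extraction schema is ever dropped from the table.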
From the API
curl -b cookies.txt http://localhost:8000/api/jobs/JOB_ID/export.json -o results.json
curl -b cookies.txt http://localhost:8000/api/jobs/JOB_ID/export.csv -o results.csv
Streaming with SSE
The /api/jobs/JOB_ID/events endpoint emits Server-Sent Events for live progress. Use it to drive UIs or fan out to downstream consumers as rows complete.
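A consumer only needs a small SSE line parser. The sketch below assumes nothing about the event names or payload fields, which are not documented here; the parsing itself follows the standard `event:` / `data:` / blank-line framing:

```python
import json
import urllib.request

def iter_sse_events(lines):
    """Yield (event, data) pairs from an iterable of SSE-formatted text lines."""
    event, data = "message", []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:  # a blank line terminates one event
            yield event, "\n".join(data)
            event, data = "message", []

# Live usage (requires a running server and the session cookie, as with curl):
# req = urllib.request.Request("http://localhost:8000/api/jobs/JOB_ID/events",
#                              headers={"Cookie": "session=..."})
# with urllib.request.urlopen(req) as resp:
#     for name, payload in iter_sse_events(line.decode() for line in resp):
#         print(name, json.loads(payload))  # payload shape is an assumption
```

Because SSE is plain text over a long-lived HTTP response, the same parser works whether you fan events out to a message queue or feed a UI directly.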
Direct DB access
By default everything lives in data/scrape.db. Tables: users, jobs, fetches, extracted. Raw HTML is stored content-addressed under data/raw/.
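Assuming the single-file default database is SQLite, you can inspect the schema before writing queries; the helper below lists the tables, and the commented lines show how you might open the real file read-only (column names beyond the table names above are not documented, so check them yourself):

```python
import sqlite3

def table_names(conn):
    """Return the user tables present in a SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]

# In practice, open the default database read-only so inspection can't create
# or modify anything (path from the docs above; SQLite is an assumption):
# conn = sqlite3.connect("file:data/scrape.db?mode=ro", uri=True)
conn = sqlite3.connect(":memory:")  # stand-in for this demo
print(table_names(conn))
```

Opening with `mode=ro` keeps ad-hoc exploration from touching a database the scraper may be writing to concurrently.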