Documentation web scraper
One command. Entire documentation site. Clean markdown output.
Training AI on documentation means copying hundreds of pages by hand or writing a custom scraper for each site. Neither scales.
Docpull runs once and pulls everything. HTML becomes markdown with YAML headers. Ready for fine-tuning, Claude skills, or offline reference.
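To make the "HTML becomes markdown with YAML headers" step concrete, here is a minimal sketch of that kind of conversion using only the Python standard library. It is not Docpull's implementation: the `MiniMarkdown` class, the `to_markdown` function, and the `title` and `source` header fields are hypothetical names chosen for illustration, and the converter handles only headings, paragraphs, and inline code.

```python
import json
from html.parser import HTMLParser


class MiniMarkdown(HTMLParser):
    """Tiny HTML-to-markdown converter: h1-h3, paragraphs, inline code only."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__()
        self.blocks = []   # finished markdown blocks
        self.current = []  # text pieces of the block being built
        self.prefix = ""   # markdown prefix for the current block

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS or tag == "p":
            self.current = []
            self.prefix = self.HEADINGS.get(tag, "")
        elif tag == "code":
            self.current.append("`")

    def handle_endtag(self, tag):
        if tag == "code":
            self.current.append("`")
        elif tag in self.HEADINGS or tag == "p":
            # Collapse runs of whitespace so the block sits on one line.
            text = " ".join("".join(self.current).split())
            if text:
                self.blocks.append(self.prefix + text)
            self.current = []

    def handle_data(self, data):
        self.current.append(data)


def to_markdown(html, url, title):
    """Return markdown text with a YAML header (field names are assumptions)."""
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    # json.dumps yields a double-quoted string, which is also valid YAML.
    header = f"---\ntitle: {json.dumps(title)}\nsource: {json.dumps(url)}\n---\n\n"
    return header + "\n\n".join(parser.blocks) + "\n"


if __name__ == "__main__":
    page = "<h1>Install</h1>\n<p>Run <code>pip install x</code> first.</p>"
    print(to_markdown(page, "https://example.com/install", "Install"))
```

A real converter also needs lists, links, tables, and fenced code blocks; the point here is only the shape of the output, a YAML header followed by the page body.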
Point it at a docs site. It crawls, converts, and organizes. Parallel processing gets through large sites quickly. Caching means re-runs fetch only new content.
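The combination of parallel fetching and caching described above can be sketched as follows. This is an illustration under stated assumptions, not Docpull's actual code: the `crawl` function, the injected `fetch` callable, and the `manifest.json` file are all hypothetical, and "cached" here simply means a URL already recorded in the manifest is skipped on later runs.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def crawl(urls, fetch, out_dir, workers=8):
    """Fetch URLs not seen on earlier runs, in parallel, and save them.

    `fetch` is any callable mapping a URL to page text (hypothetical;
    inject urllib, requests, or a test stub). Returns the URLs fetched
    on this run.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest_path = out / "manifest.json"
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    else:
        manifest = {}

    # Cache check: only URLs missing from the manifest are fetched.
    pending = [u for u in urls if u not in manifest]

    # pool.map returns results in the same order as `pending`.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pages = list(pool.map(fetch, pending))

    for url, text in zip(pending, pages):
        # Hash the URL to get a filesystem-safe file name.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".md"
        (out / name).write_text(text, encoding="utf-8")
        manifest[url] = name

    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return pending
```

Skipping known URLs is the simplest possible cache and will not notice pages that changed upstream. A production crawler would more likely revalidate with HTTP conditional requests (`ETag` or `Last-Modified`) so that changed pages are refreshed as well.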
Pre-built configs exist for Stripe, Next.js, React, and others.