usenix_scrape

Module: src.scrapers.usenix_scrape
Category: Scrapers

Usage

python -m src.scrapers.usenix_scrape [options]
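The module can also be driven from another Python script by spawning it as a subprocess. The following is a minimal sketch, not part of the project: `build_command` and `run_scraper` are hypothetical helper names, and the sketch assumes the scraper writes its results to standard output. Only flags listed in the help text on this page are used.

```python
import subprocess
import sys


def build_command(conferences, years, fmt="summary", all_papers=False):
    """Assemble the argument list for the scraper (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "src.scrapers.usenix_scrape",
        "--conference", ",".join(conferences),        # e.g. "fast,osdi"
        "--years", ",".join(str(y) for y in years),   # e.g. "2024,2025"
        "--format", fmt,
    ]
    if all_papers:
        cmd.append("--all-papers")
    return cmd


def run_scraper(conferences, years, fmt="summary", all_papers=False):
    """Run the scraper and return its stdout.

    Assumption: results are printed to stdout. Raises CalledProcessError
    if the process exits with a non-zero status.
    """
    result = subprocess.run(
        build_command(conferences, years, fmt, all_papers),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_scraper(["fast", "osdi"], [2024, 2025], fmt="json")` would be roughly equivalent to running `python -m src.scrapers.usenix_scrape --conference fast,osdi --years 2024,2025 --format json` from the repository root.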

Options

usage: usenix_scrape.py [-h] --conference CONFERENCE --years YEARS
                        [--format {json,yaml,summary}]
                        [--max-workers MAX_WORKERS] [--delay DELAY]
                        [--all-papers]

Scrape USENIX conference pages for paper titles and artifact badges.

options:
  -h, --help            show this help message and exit
  --conference CONFERENCE, -c CONFERENCE
                        Conference short name(s), comma-separated (e.g. fast,
                        osdi, atc)
  --years YEARS, -y YEARS
                        Year(s) to scrape, comma-separated (e.g. 2024,2025)
  --format {json,yaml,summary}, -f {json,yaml,summary}
                        Output format (default: summary)
  --max-workers MAX_WORKERS
                        Max parallel requests per conference/year (default: 4)
  --delay DELAY         Delay in seconds between requests (default: 0.3)
  --all-papers          Include papers without badges in output (default: only
                        badged papers)
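For readers who want to see how these options fit together, here is a sketch of an `argparse` parser that would accept the same command line. The module's actual source is not shown on this page, so the argument types (`int` for `--max-workers`, `float` for `--delay`) and the hypothetical helper `expand_targets`, which splits the comma-separated values into conference/year pairs, are assumptions inferred from the help text.

```python
import argparse
from itertools import product


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented options (types are assumed)."""
    parser = argparse.ArgumentParser(
        prog="usenix_scrape.py",
        description="Scrape USENIX conference pages for paper titles "
                    "and artifact badges.",
    )
    parser.add_argument("--conference", "-c", required=True,
                        help="Conference short name(s), comma-separated")
    parser.add_argument("--years", "-y", required=True,
                        help="Year(s) to scrape, comma-separated")
    parser.add_argument("--format", "-f",
                        choices=["json", "yaml", "summary"],
                        default="summary",
                        help="Output format (default: summary)")
    parser.add_argument("--max-workers", type=int, default=4,
                        help="Max parallel requests per conference/year")
    parser.add_argument("--delay", type=float, default=0.3,
                        help="Delay in seconds between requests")
    parser.add_argument("--all-papers", action="store_true",
                        help="Include papers without badges in output")
    return parser


def expand_targets(conference: str, years: str) -> list[tuple[str, int]]:
    """Split comma-separated inputs into (conference, year) pairs.

    Hypothetical helper: whitespace around items is ignored and
    conference names are lower-cased.
    """
    names = [c.strip().lower() for c in conference.split(",") if c.strip()]
    year_list = [int(y) for y in years.split(",") if y.strip()]
    return list(product(names, year_list))


args = build_parser().parse_args(["-c", "fast, osdi", "-y", "2024,2025"])
targets = expand_targets(args.conference, args.years)
```

With these inputs, `targets` holds four pairs (`fast` and `osdi` for each of 2024 and 2025), and the unspecified options fall back to the documented defaults: `summary` format, 4 workers, a 0.3 second delay, and badged papers only.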