Name	Name	Last commit message	Last commit date
Latest commit History 58 Commits
.claude	.claude
.github/workflows	.github/workflows
assets	assets
skills/just-scrape	skills/just-scrape
src	src
tests	tests
.gitignore	.gitignore
CLAUDE.md	CLAUDE.md
LICENSE	LICENSE
README.md	README.md
SPEC.md	SPEC.md
biome.json	biome.json
bun.lock	bun.lock
package.json	package.json
tsconfig.json	tsconfig.json
tsup.config.ts	tsup.config.ts

just-scrape

Made with love by the ScrapeGraphAI team

Command-line interface for ScrapeGraph AI -- AI-powered web scraping, data extraction, search, and crawling.

Project Structure

Installation

npm install -g just-scrape # npm (recommended) pnpm add -g just-scrape # pnpm yarn global add just-scrape # yarn bun add -g just-scrape # bun npx just-scrape --help # or run without installing bunx just-scrape --help # bun equivalent

Package: just-scrape on npm.

Coding Agent Skill

You can use just-scrape as a skill for AI coding agents via Vercel's skills.sh with tis tutorial.

Or you can manually install it:

bunx skills add https://github.com/ScrapeGraphAI/just-scrape

Browse the skill: skills.sh/scrapegraphai/just-scrape/just-scrape

Configuration

The CLI needs a ScrapeGraph API key. Get one at dashboard.scrapegraphai.com.

Four ways to provide it (checked in order):

Environment variable: export SGAI_API_KEY="sgai-..."
.env file: SGAI_API_KEY=sgai-... in project root
Config file: ~/.scrapegraphai/config.json
Interactive prompt: the CLI asks and saves to config

Environment Variables

Variable	Description	Default
`SGAI_API_KEY`	ScrapeGraph API key	--
`JUST_SCRAPE_API_URL`	Override API base URL	`https://api.scrapegraphai.com/v1`
`JUST_SCRAPE_TIMEOUT_S`	Request/polling timeout in seconds	`120`
`JUST_SCRAPE_DEBUG`	Set to `1` to enable debug logging to stderr	`0`

JSON Mode (`--json`)

All commands support --json for machine-readable output. When set, banner, spinners, and interactive prompts are suppressed -- only minified JSON on stdout (saves tokens when piped to AI agents).

result.json just-scrape history smartscraper --json | jq '.requests[].status'">just-scrape credits --json | jq '.remaining_credits' just-scrape smart-scraper https://example.com -p "Extract data" --json > result.json just-scrape history smartscraper --json | jq '.requests[].status'

Smart Scraper

Extract structured data from any URL using AI. docs

Usage

just-scrape smart-scraper <url> -p <prompt> # Extract data with AI just-scrape smart-scraper <url> -p <prompt> --schema <json> # Enforce output schema just-scrape smart-scraper <url> -p <prompt> --scrolls <n> # Infinite scroll (0-100) just-scrape smart-scraper <url> -p <prompt> --pages <n> # Multi-page (1-100) just-scrape smart-scraper <url> -p <prompt> --stealth # Anti-bot bypass (+4 credits) just-scrape smart-scraper <url> -p <prompt> --cookies <json> --headers <json> just-scrape smart-scraper <url> -p <prompt> --plain-text # Plain text instead of JSON

Examples

# Extract product listings from an e-commerce page just-scrape smart-scraper https://store.example.com/shoes -p "Extract all product names, prices, and ratings" # Extract with a strict schema, scrolling to load more content just-scrape smart-scraper https://news.example.com -p "Get all article headlines and dates" \ --schema '{"type":"object","properties":{"articles":{"type":"array","items":{"type":"object","properties":{"title":{"type":"string"},"date":{"type":"string"}}}}}}' \ --scrolls 5 # Scrape a JS-heavy SPA behind anti-bot protection just-scrape smart-scraper https://app.example.com/dashboard -p "Extract user stats" \ --stealth

Search Scraper

Search the web and extract structured data from results. docs

Usage

just-scrape search-scraper <prompt> # AI-powered web search just-scrape search-scraper <prompt> --num-results <n> # Sources to scrape (3-20, default 3) just-scrape search-scraper <prompt> --no-extraction # Markdown only (2 credits vs 10) just-scrape search-scraper <prompt> --schema <json> # Enforce output schema just-scrape search-scraper <prompt> --stealth --headers <json>

Examples

# Research a topic across multiple sources just-scrape search-scraper "What are the best Python web frameworks in 2025?" --num-results 10 # Get raw markdown from search results (cheaper) just-scrape search-scraper "React vs Vue comparison" --no-extraction --num-results 5 # Structured output with schema just-scrape search-scraper "Top 5 cloud providers pricing" \ --schema '{"type":"object","properties":{"providers":{"type":"array","items":{"type":"object","properties":{"name":{"type":"string"},"free_tier":{"type":"string"}}}}}}'

Markdownify

Convert any webpage to clean markdown. docs

Usage

just-scrape markdownify <url> # Convert to markdown just-scrape markdownify <url> --stealth # Anti-bot bypass (+4 credits) just-scrape markdownify <url> --headers <json> # Custom headers

Examples

api-docs.md"># Convert a blog post to markdown just-scrape markdownify https://blog.example.com/my-article # Convert a JS-rendered page behind Cloudflare just-scrape markdownify https://protected.example.com --stealth # Pipe markdown to a file just-scrape markdownify https://docs.example.com/api --json | jq -r '.result' > api-docs.md

Crawl

Crawl multiple pages and extract data from each. docs

Usage

just-scrape crawl <url> -p <prompt> # Crawl + extract just-scrape crawl <url> -p <prompt> --max-pages <n> # Max pages (default 10) just-scrape crawl <url> -p <prompt> --depth <n> # Crawl depth (default 1) just-scrape crawl <url> --no-extraction --max-pages <n> # Markdown only (2 credits/page) just-scrape crawl <url> -p <prompt> --schema <json> # Enforce output schema just-scrape crawl <url> -p <prompt> --rules <json> # Crawl rules (include_paths, same_domain) just-scrape crawl <url> -p <prompt> --no-sitemap # Skip sitemap discovery just-scrape crawl <url> -p <prompt> --stealth # Anti-bot bypass

Examples

# Crawl a docs site and extract all code examples just-scrape crawl https://docs.example.com -p "Extract all code snippets with their language" \ --max-pages 20 --depth 3 # Crawl only blog pages, skip everything else just-scrape crawl https://example.com -p "Extract article titles and summaries" \ --rules '{"include_paths":["/blog/*"],"same_domain":true}' --max-pages 50 # Get raw markdown from all pages (no AI extraction, cheaper) just-scrape crawl https://example.com --no-extraction --max-pages 10

Sitemap

Get all URLs from a website's sitemap. docs

Usage

just-scrape sitemap <url>

Examples

# List all pages on a site just-scrape sitemap https://example.com # Pipe URLs to another tool just-scrape sitemap https://example.com --json | jq -r '.urls[]'

Scrape

Get raw HTML content from a URL. docs

Usage

just-scrape scrape <url> # Raw HTML just-scrape scrape <url> --stealth # Anti-bot bypass (+4 credits) just-scrape scrape <url> --branding # Extract branding (+2 credits) just-scrape scrape <url> --country-code <iso> # Geo-targeting

Examples

# Get raw HTML of a page just-scrape scrape https://example.com # Scrape a geo-restricted page with anti-bot bypass just-scrape scrape https://store.example.com --stealth --country-code DE # Extract branding info (logos, colors, fonts) just-scrape scrape https://example.com --branding

Agentic Scraper

Browser automation with AI -- login, click, navigate, fill forms. docs

Usage

just-scrape agentic-scraper <url> -s <steps> # Run browser steps just-scrape agentic-scraper <url> -s <steps> --ai-extraction -p <prompt> just-scrape agentic-scraper <url> -s <steps> --schema <json> just-scrape agentic-scraper <url> -s <steps> --use-session # Persist browser session

Examples

# Log in and extract dashboard data just-scrape agentic-scraper https://app.example.com/login \ -s "Fill email with user@test.com,Fill password with secret,Click Sign In" \ --ai-extraction -p "Extract all dashboard metrics" # Navigate through a multi-step form just-scrape agentic-scraper https://example.com/wizard \ -s "Click Next,Select Premium plan,Fill name with John,Click Submit" # Persistent session across multiple runs just-scrape agentic-scraper https://app.example.com \ -s "Click Settings" --use-session

Generate Schema

Generate a JSON schema from a natural language description.

Usage

just-scrape generate-schema <prompt> # AI generates a schema just-scrape generate-schema <prompt> --existing-schema <json>

Examples

# Generate a schema for product data just-scrape generate-schema "E-commerce product with name, price, ratings, and reviews array" # Refine an existing schema just-scrape generate-schema "Add an availability field" \ --existing-schema '{"type":"object","properties":{"name":{"type":"string"},"price":{"type":"number"}}}'

History

Browse request history for any service. Interactive by default -- arrow keys to navigate, select to view details, "Load more" for infinite scroll.

Usage

just-scrape history <service> # Interactive browser just-scrape history <service> <request-id> # Fetch specific request just-scrape history <service> --page <n> # Start from page (default 1) just-scrape history <service> --page-size <n> # Results per page (default 10, max 100) just-scrape history <service> --json # Raw JSON (pipeable)

Services: markdownify, smartscraper, searchscraper, scrape, crawl, agentic-scraper, sitemap

Examples

# Browse your smart-scraper history interactively just-scrape history smartscraper # Jump to a specific request by ID just-scrape history smartscraper abc123-def456-7890 # Export crawl history as JSON just-scrape history crawl --json --page-size 100 | jq '.requests[] | {id: .request_id, status}'

Credits

Check your credit balance.

just-scrape credits just-scrape credits --json | jq '.remaining_credits'

Validate

Validate your API key (health check).

just-scrape validate

Contributing

From Source

Requires Bun and Node.js 22+.

git clone https://github.com/ScrapeGraphAI/just-scrape.git cd just-scrape bun install bun run dev --help

Tech Stack

Concern	Tool
Language	TypeScript 5.8
Dev Runtime	Bun
Build	tsup (esbuild)
CLI Framework	citty (unjs)
Prompts	@clack/prompts
Styling	chalk v5 (ESM)
SDK	scrapegraph-js
Env	dotenv
Lint / Format	Biome
Target	Node.js 22+, ESM-only

Scripts

bun run dev # Run CLI from TS source bun run build # Bundle ESM to dist/cli.mjs bun run lint # Lint + format check bun run format # Auto-format bun run check # Type-check + lint

License

ISC

Made with love by the ScrapeGraphAI team

Folders and files

Latest commit

History

Repository files navigation

just-scrape

Project Structure

Installation

Coding Agent Skill

Configuration

Environment Variables

JSON Mode (--json)

Smart Scraper

Usage

Examples

Search Scraper

Usage

Examples

Markdownify

Usage

Examples

Crawl

Usage

Examples

Sitemap

Usage

Examples

Scrape

Usage

Examples

Agentic Scraper

Usage

Examples

Generate Schema

Usage

Examples

History

Usage

Examples

Credits

Validate

Contributing

From Source

Tech Stack

Scripts

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

JSON Mode (`--json`)

Packages