MCP Server¶

fetchkit ships as a Model Context Protocol (MCP) server, enabling AI agents to fetch, scrape, extract, and download web content autonomously.

        flowchart LR
    subgraph Clients
        Claude["🤖 Claude Desktop"]
        CC["⌨️ Claude Code"]
        LC["🦜 LangChain Agent"]
        Custom["🔧 Any MCP Client"]
    end

    subgraph fetchkit MCP Server
        Tools["16 Tools"]
        Resources["4 Resources"]
        Prompts["4 Prompts"]
    end

    subgraph pyfetcher Core
        Fetch["FetchService"]
        Scrape["Scrape Utils"]
        Headers["Browser Profiles"]
        Extract["Extractors"]
        DL["Downloaders"]
    end

    Claude -- "stdio" --> Tools
    CC -- "stdio" --> Tools
    LC -- "HTTP" --> Tools
    Custom -- "HTTP/stdio" --> Tools

    Tools --> Fetch
    Tools --> Scrape
    Tools --> Headers
    Tools --> Extract
    Tools --> DL

    style Tools fill:#E74C3C,color:#fff
    style Resources fill:#3498DB,color:#fff
    style Prompts fill:#9B59B6,color:#fff
    style Fetch fill:#2ECC71,color:#fff

Installation¶

pip install 'fetchkit[mcp]'

Running¶

stdio (Claude Desktop, Claude Code):

pyfetcher-mcp

HTTP (LangChain, remote agents):

pyfetcher-mcp --http 8000

Makefile:

make mcp          # stdio
make mcp-http     # HTTP on port 8000

Tools (16)¶

Tool	Description	Returns
`fetch_url`	Fetch URL with browser headers	FetchResult
`fetch_multiple`	Batch fetch with concurrency	list[FetchResult]
`scrape_css`	CSS selector extraction	ScrapeResult
`scrape_links`	Link harvesting with classification	LinksResult
`scrape_text`	Readable text extraction	str
`scrape_metadata`	Page metadata + Open Graph	MetadataResult
`scrape_forms`	Form field extraction	FormsResult
`scrape_table`	HTML table parsing	TableResult
`check_robots`	robots.txt rule checking	RobotsResult
`parse_sitemap`	XML sitemap parsing	SitemapResult
`generate_headers`	Browser header preview	HeadersResult
`list_profiles`	All browser profiles	list[ProfileInfo]
`random_user_agent`	Random UA string	str
`extract_article`	Article text + markdown	ArticleResult
`convert_html`	HTML to markdown/plaintext	str
`download_file`	File download with checksum	DownloadResult

Resources¶

pyfetcher://profiles – All browser profiles with details
pyfetcher://profiles/{name} – Headers for a specific profile
pyfetcher://backends – Backend capabilities table
pyfetcher://version – Version and feature info

Prompts¶

web_research(topic) – Research a topic by fetching and analyzing pages
site_audit(url) – Audit a website’s structure, metadata, and links
scrape_guide(url, goal) – Guide for scraping specific data
compare_pages(url1, url2) – Compare two web pages

Claude Desktop Configuration¶

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "pyfetcher": {
      "command": "pyfetcher-mcp",
      "args": []
    }
  }
}

LangChain Integration¶

from langchain_mcp_adapters import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent

client = MultiServerMCPClient({
    "pyfetcher": {"transport": "http", "url": "http://localhost:8000/mcp"}
})
tools = await client.get_tools()
agent = create_react_agent(model, tools)