MCP Server¶
fetchkit ships as a Model Context Protocol (MCP) server, enabling AI agents to fetch, scrape, extract, and download web content autonomously.
flowchart LR
subgraph Clients
Claude["🤖 Claude Desktop"]
CC["⌨️ Claude Code"]
LC["🦜 LangChain Agent"]
Custom["🔧 Any MCP Client"]
end
subgraph fetchkit MCP Server
Tools["16 Tools"]
Resources["4 Resources"]
Prompts["4 Prompts"]
end
subgraph pyfetcher Core
Fetch["FetchService"]
Scrape["Scrape Utils"]
Headers["Browser Profiles"]
Extract["Extractors"]
DL["Downloaders"]
end
Claude -- "stdio" --> Tools
CC -- "stdio" --> Tools
LC -- "HTTP" --> Tools
Custom -- "HTTP/stdio" --> Tools
Tools --> Fetch
Tools --> Scrape
Tools --> Headers
Tools --> Extract
Tools --> DL
style Tools fill:#E74C3C,color:#fff
style Resources fill:#3498DB,color:#fff
style Prompts fill:#9B59B6,color:#fff
style Fetch fill:#2ECC71,color:#fff
Installation¶
pip install 'fetchkit[mcp]'
Running¶
stdio (Claude Desktop, Claude Code):
pyfetcher-mcp
HTTP (LangChain, remote agents):
pyfetcher-mcp --http 8000
Makefile:
make mcp # stdio
make mcp-http # HTTP on port 8000
Tools (16)¶
Tool |
Description |
Returns |
|---|---|---|
|
Fetch URL with browser headers |
FetchResult |
|
Batch fetch with concurrency |
list[FetchResult] |
|
CSS selector extraction |
ScrapeResult |
|
Link harvesting with classification |
LinksResult |
|
Readable text extraction |
str |
|
Page metadata + Open Graph |
MetadataResult |
|
Form field extraction |
FormsResult |
|
HTML table parsing |
TableResult |
|
robots.txt rule checking |
RobotsResult |
|
XML sitemap parsing |
SitemapResult |
|
Browser header preview |
HeadersResult |
|
All browser profiles |
list[ProfileInfo] |
|
Random UA string |
str |
|
Article text + markdown |
ArticleResult |
|
HTML to markdown/plaintext |
str |
|
File download with checksum |
DownloadResult |
Resources¶
pyfetcher://profiles– All browser profiles with detailspyfetcher://profiles/{name}– Headers for a specific profilepyfetcher://backends– Backend capabilities tablepyfetcher://version– Version and feature info
Prompts¶
web_research(topic)– Research a topic by fetching and analyzing pagessite_audit(url)– Audit a website’s structure, metadata, and linksscrape_guide(url, goal)– Guide for scraping specific datacompare_pages(url1, url2)– Compare two web pages
Claude Desktop Configuration¶
Add to claude_desktop_config.json:
{
"mcpServers": {
"pyfetcher": {
"command": "pyfetcher-mcp",
"args": []
}
}
}
LangChain Integration¶
from langchain_mcp_adapters import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
client = MultiServerMCPClient({
"pyfetcher": {"transport": "http", "url": "http://localhost:8000/mcp"}
})
tools = await client.get_tools()
agent = create_react_agent(model, tools)