haive.mcp.documentation.doc_loader

MCP documentation loader for server discovery and setup extraction.

Loads, searches, and extracts setup information from the pre-indexed database of 1,960+ MCP servers. Includes lightweight GitHub README fetching via aiohttp (no external framework dependencies).

Example

from haive.mcp.documentation import MCPDocumentationLoader

loader = MCPDocumentationLoader()
all_docs = loader.load_all_mcp_documents()
print(f"Loaded {len(all_docs)} servers")

results = loader.search_servers_by_capability("database")
for server in results:
    info = loader.extract_setup_info(server)
    print(info["name"], info.get("install_command"))

Classes

MCPDocumentationLoader

Loads and processes MCP server documentation from the local database.

Module Contents

class haive.mcp.documentation.doc_loader.MCPDocumentationLoader(resources_path=None)[source]

Loads and processes MCP server documentation from the local database.

Initialize the documentation loader.

Parameters:

resources_path (pathlib.Path | None) – Path to the data directory containing MCP servers. Defaults to the package’s data/ directory.

extract_setup_info(server_doc)[source]

Extract setup information from server documentation.

Parameters:

server_doc (dict[str, Any]) – Server documentation dictionary.

Returns:

Extracted setup information including installation steps, configuration, and usage examples.

Return type:

dict[str, Any]
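The extraction logic itself is not documented here. As a rough sketch only, assuming a server dict carries its README text under readme_content, a pattern-based pass could pull candidate install commands out of that text. The function name sketch_extract_setup_info, the regex, and the all_commands key are hypothetical and not part of haive-mcp.

```python
import re
from typing import Any

# Hypothetical pattern: a line that starts (optionally after "$ ") with a
# common install or launch command. This is an assumption, not the loader's rule.
_COMMAND_RE = re.compile(
    r"^[ \t]*(?:\$[ \t]*)?((?:pip|pipx|uv|npm|npx|docker)[ \t]+\S.*)$",
    re.MULTILINE,
)


def sketch_extract_setup_info(server_doc: dict[str, Any]) -> dict[str, Any]:
    """Collect command-like lines from the README text of a server dict."""
    readme = server_doc.get("readme_content") or ""
    commands = [match.strip() for match in _COMMAND_RE.findall(readme)]
    return {
        "name": server_doc.get("name", ""),
        # Treat the first command found as the install command.
        "install_command": commands[0] if commands else None,
        "all_commands": commands,
    }


# Illustrative input; the server name and README are made up.
doc = {
    "name": "example/server-demo",
    "readme_content": (
        "## Install\n\n$ npm install -g server-demo\n\n"
        "Then run:\n\nnpx server-demo --port 8080\n"
    ),
}
info = sketch_extract_setup_info(doc)
print(info["name"], info.get("install_command"))
```

A heuristic like this misses commands embedded in prose, so callers should still handle a missing install_command.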

async fetch_github_readme(repo_url)[source]

Fetch README from a GitHub repository via the API.

Uses aiohttp directly – no external framework dependencies.

Parameters:

repo_url (str) – GitHub repository URL (e.g., "https://github.com/owner/repo")

Returns:

README content as a string, or None on failure.

Return type:

str | None

async fetch_url_content(url)[source]

Fetch text content from a URL.

Parameters:

url (str) – URL to fetch.

Returns:

Response text, or None on failure.

Return type:

str | None

generate_server_config(server_name)[source]

Generate an MCP server configuration from a server in the database.

Returns a config dict that can be used with:

- haive-mcp MCPServerConfig
- Claude Desktop mcp.json
- langchain-mcp-adapters MultiServerMCPClient

Parameters:

server_name (str) – Server name (exact or partial match).

Returns:

Config dict with command, args, transport, and env fields, or None if the server is not found.

Return type:

dict[str, Any] | None
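To make the returned shape concrete, here is a small sketch that takes a hand-written dict with the four documented fields and wraps it for two of the consumers named above. The field values are illustrative, and the helpers to_claude_desktop and to_multi_server are hypothetical. The real output of generate_server_config may contain additional keys.

```python
import json
from typing import Any

# Hand-written stand-in for a generated config; values are illustrative only.
config: dict[str, Any] = {
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
    "transport": "stdio",
    "env": {},
}


def to_claude_desktop(name: str, cfg: dict[str, Any]) -> dict[str, Any]:
    """Wrap a config under the "mcpServers" key used by Claude Desktop."""
    entry: dict[str, Any] = {
        "command": cfg["command"],
        "args": list(cfg.get("args", [])),
    }
    if cfg.get("env"):
        entry["env"] = dict(cfg["env"])
    return {"mcpServers": {name: entry}}


def to_multi_server(name: str, cfg: dict[str, Any]) -> dict[str, Any]:
    """Build a name-to-connection mapping of the kind MultiServerMCPClient takes."""
    return {
        name: {
            "command": cfg["command"],
            "args": list(cfg.get("args", [])),
            "transport": cfg.get("transport", "stdio"),
        }
    }


desktop = to_claude_desktop("filesystem", config)
connections = to_multi_server("filesystem", config)
print(json.dumps(desktop, indent=2))
```

Because generate_server_config can return None, check the result before passing it to helpers like these.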

get_enriched_server(server_name)[source]

Get a server enriched with data from its individual document file.

The individual document files in data/mcp_servers/documents/ contain full README content, descriptions, stars, and other metadata not present in the lightweight index.

Parameters:

server_name (str) – Server name (exact or partial match).

Returns:

Enriched server dict with readme_content, description, install_command, and other metadata, or None if the server is not found.

Return type:

dict[str, Any] | None

get_server_documentation(server_name)[source]

Get documentation for a specific MCP server.

Parameters:

server_name (str) – Server name (e.g., "modelcontextprotocol/server-filesystem")

Return type:

dict[str, Any] | None

load_all_mcp_documents()[source]

Load all MCP server documentation from the stored JSON.

Tries multiple data files in order of preference:

1. ALL_MCP_SERVERS_COMPLETE.json (full database)
2. organized_servers.json (organized version)
3. all_mcp_documents.json (original)

Returns:

Dictionary mapping server names to documentation dictionaries.

Return type:

dict[str, dict[str, Any]]
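The preference order above amounts to a first-match fallback. The sketch below illustrates that pattern against a temporary directory. The function load_first_available, the choice to skip unreadable or non-dict files, and the empty-dict fallback are assumptions about behavior this page does not document.

```python
import json
import tempfile
from pathlib import Path
from typing import Any

# File names in the documented order of preference.
CANDIDATES = (
    "ALL_MCP_SERVERS_COMPLETE.json",
    "organized_servers.json",
    "all_mcp_documents.json",
)


def load_first_available(data_dir: Path) -> dict[str, dict[str, Any]]:
    """Return the contents of the first candidate file that loads as a dict."""
    for filename in CANDIDATES:
        path = data_dir / filename
        if not path.is_file():
            continue
        try:
            with path.open(encoding="utf-8") as handle:
                data = json.load(handle)
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or malformed: try the next candidate
        if isinstance(data, dict):
            return data
    return {}


# Demo: only the second and third files exist, so the second one wins.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "organized_servers.json").write_text(
        json.dumps({"example/server-demo": {"description": "demo"}}),
        encoding="utf-8",
    )
    (data_dir / "all_mcp_documents.json").write_text(
        json.dumps({"old/server": {}}), encoding="utf-8"
    )
    loaded = load_first_available(data_dir)

print(f"Loaded {len(loaded)} servers")
```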

search_servers_by_capability(capability)[source]

Search for MCP servers by capability in name or description.

Parameters:

capability (str) – Capability keyword to search for.

Return type:

list[dict[str, Any]]
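A search "in name or description" can be pictured as a case-insensitive substring filter over the loaded mapping. The following is a minimal sketch under that assumption. The function search_by_capability, the description key lookup, and the sample servers are hypothetical, and the real method may rank or normalize results differently.

```python
from typing import Any


def search_by_capability(
    servers: dict[str, dict[str, Any]], capability: str
) -> list[dict[str, Any]]:
    """Return servers whose name or description contains the keyword."""
    needle = capability.lower()
    results = []
    for name, doc in servers.items():
        haystack = f"{name} {doc.get('description', '')}".lower()
        if needle in haystack:
            # Attach the name so each result is usable on its own.
            results.append({"name": name, **doc})
    return results


# Made-up servers for illustration.
servers = {
    "example/sqlite-server": {"description": "Query a local database"},
    "example/files": {"description": "Read and write files"},
    "example/Database-tools": {"description": "Misc utilities"},
}
hits = search_by_capability(servers, "database")
print([hit["name"] for hit in hits])
```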

search_servers_by_category(category)[source]

Search for MCP servers by category.

Parameters:

category (str) – Category to search for (e.g., "database", "filesystem")

Return type:

list[dict[str, Any]]