haive.core.engine.retriever.providers.PubMedRetrieverConfig

PubMed Retriever implementation for the Haive framework.

This module provides a configuration class for the PubMed retriever, which retrieves biomedical and life science literature from the PubMed database. PubMed is a free search engine that primarily accesses the MEDLINE database of references and abstracts on life sciences and biomedical topics.

The PubMedRetriever works by:

1. Connecting to the PubMed API (via NCBI E-utilities)
2. Executing search queries against the PubMed database
3. Retrieving article abstracts and metadata
4. Returning formatted documents with biomedical literature
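The search-then-fetch flow described above can be sketched with the two public E-utilities endpoints, ESearch and EFetch. This is a minimal illustration that only builds the request URLs; it is not the code Haive or LangChain runs, and the helper names `build_search_url` and `build_fetch_url` are hypothetical.

```python
from typing import List, Optional
from urllib.parse import urlencode

# Public NCBI E-utilities base URL.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_search_url(query: str, max_results: int = 3, email: Optional[str] = None) -> str:
    """Build an ESearch URL that asks for matching PubMed IDs as JSON."""
    params = {"db": "pubmed", "term": query, "retmax": max_results, "retmode": "json"}
    if email:
        # NCBI asks API clients to identify themselves with a contact email.
        params["email"] = email
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_fetch_url(pmids: List[str]) -> str:
    """Build an EFetch URL that asks for full records (abstracts, metadata) as XML."""
    params = {"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Hypothetical usage: the IDs passed to build_fetch_url are placeholders.
search_url = build_search_url("COVID-19 vaccine", max_results=5)
fetch_url = build_fetch_url(["1", "2"])
```

A real client would send the first request, read the returned ID list, and pass those IDs to the second request before converting each record into a document.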

This retriever is particularly useful when:

- Building medical or healthcare applications
- Researching biomedical topics and treatments
- Creating evidence-based medicine tools
- Developing clinical decision support systems
- Building scientific literature review applications

The implementation integrates with LangChain’s PubMedRetriever while providing a consistent Haive configuration interface.

Classes

PubMedRetrieverConfig

Configuration for PubMed retriever in the Haive framework.

Module Contents

class haive.core.engine.retriever.providers.PubMedRetrieverConfig.PubMedRetrieverConfig

Bases: haive.core.engine.retriever.retriever.BaseRetrieverConfig

Configuration for PubMed retriever in the Haive framework.

This retriever searches the PubMed database for biomedical literature and returns article abstracts and metadata as documents.

retriever_type

The type of retriever (always PUBMED).

Type:: RetrieverType

top_k_results

Number of articles to retrieve (default: 3).

Type:: int

load_max_docs

Maximum number of documents to load (default: 25).

Type:: int

load_all_available_meta

Whether to load all available metadata.

Type:: bool

doc_content_chars_max

Maximum characters per document.

Type:: int

email

Contact email sent with NCBI API requests (optional but recommended).

Type:: Optional[str]
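As a rough illustration of how two of these limits might interact, the sketch below applies a result cap and a per-document character cap to stand-in data. It assumes that `top_k_results` bounds how many articles are returned and that `doc_content_chars_max` truncates each document's text; the actual behavior is determined by the underlying retriever, and `FakeArticle` and `shape_results` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeArticle:
    """Stand-in for a fetched PubMed record (hypothetical, for illustration only)."""
    title: str
    abstract: str


def shape_results(articles: List[FakeArticle], top_k_results: int,
                  doc_content_chars_max: int) -> List[str]:
    """Keep the first top_k_results articles and cap each abstract's length."""
    return [a.abstract[:doc_content_chars_max] for a in articles[:top_k_results]]


articles = [
    FakeArticle("A", "first abstract text"),
    FakeArticle("B", "second abstract text"),
    FakeArticle("C", "third abstract text"),
]
shaped = shape_results(articles, top_k_results=2, doc_content_chars_max=10)
```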

Examples

>>> from haive.core.engine.retriever import PubMedRetrieverConfig
>>>
>>> # Create the PubMed retriever config
>>> config = PubMedRetrieverConfig(
...     name="pubmed_retriever",
...     top_k_results=5,
...     load_max_docs=20,
...     load_all_available_meta=True,
...     email="researcher@university.edu"  # Optional but recommended
... )
>>>
>>> # Instantiate and use the retriever
>>> retriever = config.instantiate()
>>> docs = retriever.get_relevant_documents("COVID-19 vaccine effectiveness")
>>>
>>> # Example with specific medical query
>>> docs = retriever.get_relevant_documents("CRISPR gene editing cancer treatment")
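The returned documents can then be inspected like any LangChain-style document. The sketch below assumes each document exposes `page_content` and a `metadata` dict, and that the title is stored under a `"Title"` key; that key name is an assumption, so the lookup falls back to a placeholder. `summarize` is a hypothetical helper, and stand-in objects are used so the snippet runs without network access.

```python
from types import SimpleNamespace
from typing import Any, List


def summarize(docs: List[Any], preview_chars: int = 80) -> List[str]:
    """Return one 'title: preview' line per document."""
    lines = []
    for doc in docs:
        # "Title" is an assumed metadata key; fall back if it is absent.
        title = doc.metadata.get("Title", "<untitled>")
        preview = doc.page_content[:preview_chars].replace("\n", " ")
        lines.append(f"{title}: {preview}")
    return lines


# Stand-in documents that mimic the assumed shape of the retriever's output.
fake_docs = [
    SimpleNamespace(page_content="Example abstract text.", metadata={"Title": "Example title"}),
    SimpleNamespace(page_content="No title here.", metadata={}),
]
summary_lines = summarize(fake_docs)
```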

get_input_fields()

Return input field definitions for PubMed retriever.

Return type:: dict[str, tuple[type, Any]]

get_output_fields()

Return output field definitions for PubMed retriever.

Return type:: dict[str, tuple[type, Any]]
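To make the `dict[str, tuple[type, Any]]` return shape concrete, here is a sketch of what such a mapping could look like and how a caller might check a payload against it. The field names `query` and `limit`, the use of `...` (Ellipsis) to mark a required field, and the `check_payload` helper are all assumptions for illustration; the actual field definitions returned by these methods are not shown in this page.

```python
from typing import Any, Dict, Tuple

# Hypothetical field map: name -> (type, default). Ellipsis marks "required" here.
example_fields: Dict[str, Tuple[type, Any]] = {
    "query": (str, ...),
    "limit": (int, 3),
}


def check_payload(fields: Dict[str, Tuple[type, Any]], payload: Dict[str, Any]) -> Dict[str, Any]:
    """Type-check supplied values and fill defaults for missing optional fields."""
    result: Dict[str, Any] = {}
    for name, (expected_type, default) in fields.items():
        if name in payload:
            value = payload[name]
            if not isinstance(value, expected_type):
                raise TypeError(f"{name} must be {expected_type.__name__}")
            result[name] = value
        elif default is ...:
            raise KeyError(f"missing required field: {name}")
        else:
            result[name] = default
    return result


checked = check_payload(example_fields, {"query": "CRISPR gene editing"})
```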

instantiate()

Create a PubMed retriever from this configuration.

Returns:: Instantiated retriever ready for biomedical literature search.
Return type:: PubMedRetriever
Raises:: ImportError – If required packages are not available.
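Because instantiate() can raise ImportError, callers may want to check for optional dependencies before building the retriever. The sketch below uses only the standard library; `missing_packages` and `require_packages` are hypothetical helpers, and which packages the retriever actually needs is not stated in this page, so the names passed in are placeholders.

```python
from importlib.util import find_spec
from typing import List


def missing_packages(names: List[str]) -> List[str]:
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


def require_packages(names: List[str]) -> None:
    """Raise ImportError with a readable message if any package is missing."""
    missing = missing_packages(names)
    if missing:
        raise ImportError("Missing required packages: " + ", ".join(missing))


# Placeholder names: "json" ships with Python, the second name should not exist.
not_found = missing_packages(["json", "no_such_pkg_xyz"])
```

Calling such a check up front lets an application report all missing dependencies at once instead of failing on the first import.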