haive.tools.tools.search_tools
Search Tools Module.
This module provides search tools built on the Tavily API, plus a web scraping tool. It covers question answering, web content extraction, context generation for RAG applications, and general-purpose search with configurable parameters.
Examples
>>> from haive.tools.tools.search_tools import tavily_search_tool
>>> results = tavily_search_tool(query="What is quantum computing?")
>>> print(results)
Functions
| scrape_webpages | Scrape web pages using WebBaseLoader to extract detailed content information. |
| tavily_extract | Extract raw content from a list of websites using the Tavily Extract API. |
| tavily_qna | Search tool for getting a quick answer to a specific question using Tavily's QnA search. |
| tavily_search_context | Generate search context for Retrieval Augmented Generation (RAG) applications. |
| tavily_search_tool | Query Tavily Search API with full configurability for comprehensive search results. |
Module Contents
- haive.tools.tools.search_tools.scrape_webpages(urls)
Scrape web pages using WebBaseLoader to extract detailed content information.
This tool uses LangChain’s WebBaseLoader to fetch and parse content from the specified URLs and returns the extracted content in a formatted document structure.
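Because `scrape_webpages` takes a list of URLs, it can help to drop malformed or duplicate entries before calling it. The sketch below shows one way to do that with the standard library only. `clean_urls` is a hypothetical helper introduced for illustration; it is not part of this module, and the module may already perform its own validation.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def clean_urls(urls: Iterable[str]) -> List[str]:
    """Keep well-formed http(s) URLs, preserving order and dropping duplicates.

    Hypothetical helper, not part of haive.tools.
    """
    seen = set()
    cleaned: List[str] = []
    for raw in urls:
        url = raw.strip()
        parts = urlparse(url)
        # Require an http(s) scheme and a host; anything else is skipped.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned


candidates = [
    "https://example.com/a",
    " https://example.com/a ",   # duplicate once whitespace is stripped
    "ftp://example.com/file",    # unsupported scheme
    "not a url",                 # no scheme or host
    "http://example.org",
]
urls = clean_urls(candidates)
# The cleaned list could then be passed on, e.g. scrape_webpages(urls).
```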
- haive.tools.tools.search_tools.tavily_extract(urls, **kwargs)
Extract raw content from a list of websites using the Tavily Extract API.
This tool retrieves the content from specified URLs, which is useful for data collection, content analysis, and research. It can be combined with search methods to first find relevant documents and then extract detailed information from them.
- Parameters:
urls (List[str]) – The list of URLs to extract content from.
**kwargs – Additional arguments to pass to the Tavily Extract API.
- Returns:
The extracted content from the specified URLs in a structured format.
- Return type:
Dict
- Raises:
Exception – If the URL extraction fails or if the API request encounters an error.
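The search-then-extract combination described above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the search step is assumed to return a dict with a `results` list whose items carry a `url` key, and `search_then_extract`, `fake_search`, and `fake_extract` are hypothetical names. The stand-ins replace the real tools so the example runs without an API key.

```python
from typing import Any, Callable, Dict, List


def search_then_extract(
    query: str,
    search: Callable[..., Dict[str, Any]],
    extract: Callable[..., Dict[str, Any]],
    max_results: int = 3,
) -> Dict[str, Any]:
    """Run a search, collect the result URLs, then extract their content."""
    found = search(query=query, max_results=max_results)
    # Assumed result shape: {"results": [{"url": ...}, ...]}
    urls: List[str] = [r["url"] for r in found.get("results", []) if r.get("url")]
    if not urls:
        return {"results": []}
    return extract(urls=urls)


# Offline stand-ins for the real tools (assumptions, for illustration only).
def fake_search(query: str, max_results: int = 5) -> Dict[str, Any]:
    return {
        "results": [
            {"url": f"https://example.com/{i}", "title": f"hit {i}"}
            for i in range(max_results)
        ]
    }


def fake_extract(urls: List[str], **kwargs: Any) -> Dict[str, Any]:
    return {"results": [{"url": u, "raw_content": f"content of {u}"} for u in urls]}


extracted = search_then_extract(
    "What is quantum computing?", fake_search, fake_extract, max_results=2
)
```

In real use, `tavily_search_tool` and `tavily_extract` would take the place of the stand-ins. If they are exposed as LangChain tools, they may need to be called through a thin wrapper around `.invoke(...)` rather than directly.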
- haive.tools.tools.search_tools.tavily_qna(query, max_results=5, include_answer=True, search_depth='advanced', verbose=False, topic='general', days=3, include_domains=[], exclude_domains=[])
Search tool for getting a quick answer to a specific question using Tavily’s QnA search.
This tool queries the Tavily API with a question and returns a direct answer along with supporting information from web search results.
- Parameters:
query (str) – The search query or question to be answered.
max_results (int) – Maximum number of results to return. Default is 5.
include_answer (bool) – Include short answer in response. Default is True.
search_depth (Literal["basic", "advanced"]) – Search depth, either ‘basic’ or ‘advanced’. Default is ‘advanced’.
verbose (bool) – Log the tool’s progress. Default is False.
topic (Literal["general", "news", "finance"]) – Topic category for search context. Default is ‘general’.
days (int) – How recent the information should be in days. Default is 3.
include_domains (Sequence[str]) – Specific domains to include in search. Default is empty list.
exclude_domains (Sequence[str]) – Specific domains to exclude from search. Default is empty list.
- Returns:
The search results with a direct answer to the question.
- Return type:
- Raises:
Exception – If the Tavily API request fails.
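Several of the parameters above accept only a fixed set of values, so checking them before the request is sent can surface mistakes early. The sketch below builds a keyword-argument dict that mirrors the documented parameters and defaults. `build_qna_args` is a hypothetical helper, not part of this module, and the range checks on `max_results` and `days` are assumptions rather than documented constraints.

```python
from typing import Any, Dict, Optional, Sequence

# Allowed values as listed in the parameter documentation above.
ALLOWED_DEPTHS = ("basic", "advanced")
ALLOWED_TOPICS = ("general", "news", "finance")


def build_qna_args(
    query: str,
    *,
    max_results: int = 5,
    search_depth: str = "advanced",
    topic: str = "general",
    days: int = 3,
    include_domains: Optional[Sequence[str]] = None,
    exclude_domains: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Validate inputs and return keyword arguments for a QnA search call."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if search_depth not in ALLOWED_DEPTHS:
        raise ValueError(f"search_depth must be one of {ALLOWED_DEPTHS}")
    if topic not in ALLOWED_TOPICS:
        raise ValueError(f"topic must be one of {ALLOWED_TOPICS}")
    # Assumed constraint: both values should be positive.
    if max_results < 1 or days < 1:
        raise ValueError("max_results and days must be positive")
    return {
        "query": query,
        "max_results": max_results,
        "search_depth": search_depth,
        "topic": topic,
        "days": days,
        # Build fresh lists on every call so no state is shared between calls.
        "include_domains": list(include_domains or []),
        "exclude_domains": list(exclude_domains or []),
    }


qna_args = build_qna_args(
    "What is quantum computing?",
    topic="news",
    days=7,
    include_domains=["nature.com"],
)
# The dict could then be unpacked into the tool, e.g. tavily_qna(**qna_args).
```

Using `None` as the default and building a new list per call avoids the shared-state pitfall of mutable default arguments such as `include_domains=[]`.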
- haive.tools.tools.search_tools.tavily_search_context(query, search_depth='basic', topic='general', days=3, max_results=5, include_domains=[], exclude_domains=[], max_tokens=4000, **kwargs)
Generate search context for Retrieval Augmented Generation (RAG) applications.
This tool retrieves relevant context information from the web based on a search query, specifically formatted for use in RAG applications. It provides more comprehensive context than standard search responses.
- Parameters:
query (str) – The search query string.
search_depth (Literal["basic", "advanced"]) – Search depth, either ‘basic’ or ‘advanced’. Default is ‘basic’.
topic (Literal["general", "news"]) – The topic category for search context. Default is ‘general’.
days (int) – How recent the information should be in days. Default is 3.
max_results (int) – Maximum number of results to return. Default is 5.
include_domains (Sequence[str]) – Specific domains to include in search. Default is empty list.
exclude_domains (Sequence[str]) – Specific domains to exclude from search. Default is empty list.
max_tokens (int) – Maximum number of tokens to return in the context. Default is 4000.
**kwargs – Additional arguments to pass to the Tavily API.
- Returns:
The search context formatted for RAG applications.
- Return type:
- Raises:
Exception – If the API request fails.
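To show where the generated context fits in a RAG flow, the sketch below combines a context string with a question into a single prompt. It assumes the context is available as plain text, which this page does not confirm because the return type is not listed. `build_rag_prompt` and its character cap are hypothetical and are not part of this module.

```python
def build_rag_prompt(question: str, context: str, max_chars: int = 16000) -> str:
    """Combine retrieved context and a question into one prompt string.

    Hypothetical helper; assumes the context is a plain string.
    """
    # A character cap is only a crude proxy for a token budget.
    trimmed = context[:max_chars]
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{trimmed}\n\n"
        f"Question: {question}\n"
    )


# Stand-in for text that tavily_search_context might return.
sample_context = "Quantum computing uses qubits, which can exist in superposition."
prompt = build_rag_prompt("What is quantum computing?", sample_context)
```

The `max_tokens` parameter already bounds the context on the search side, so the cap here is only a secondary safeguard before the prompt reaches a model.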
- haive.tools.tools.search_tools.tavily_search_tool(query, max_results=5, include_answer=True, include_raw_content=False, include_images=False, search_depth='advanced', include_domains=None, exclude_domains=None, verbose=False)
Query Tavily Search API with full configurability for comprehensive search results.
This tool provides complete access to all Tavily search options and returns structured search results with customizable content types and filtering options.
- Parameters:
query (str) – The search query string.
max_results (Optional[int]) – Maximum number of results to return. Default is 5.
include_answer (Optional[bool]) – Include short answer in response. Default is True.
include_raw_content (Optional[bool]) – Include raw content of the search results. Default is False.
include_images (Optional[bool]) – Include images in the response. Default is False.
search_depth (Optional[str]) – Search depth, either ‘basic’ or ‘advanced’. Default is ‘advanced’.
include_domains (Optional[List[str]]) – Specific domains to include in search. Default is None.
exclude_domains (Optional[List[str]]) – Specific domains to exclude from search. Default is None.
verbose (Optional[bool]) – Log the tool’s progress. Default is False.
- Returns:
The search results in a structured format, including titles, URLs, and optionally raw content, images, and direct answers.
- Return type:
Dict
- Raises:
Exception – If the API request fails or if invalid parameters are provided.
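A returned dict of this kind can be reduced to a short, readable summary. The sketch below assumes the keys `answer`, `results`, `title`, and `url`; these follow the description above but are not confirmed by this page. `summarize_results` and the sample response are hypothetical.

```python
from typing import Any, Dict, List


def summarize_results(response: Dict[str, Any]) -> str:
    """Format an optional direct answer plus a numbered list of titles and URLs."""
    lines: List[str] = []
    answer = response.get("answer")
    if answer:
        lines.append(f"Answer: {answer}")
    for i, item in enumerate(response.get("results", []), start=1):
        title = item.get("title", "(untitled)")
        url = item.get("url", "")
        lines.append(f"{i}. {title} <{url}>")
    return "\n".join(lines)


# Hand-written sample in the assumed shape; not real API output.
sample_response = {
    "answer": "A short direct answer.",
    "results": [
        {"title": "First hit", "url": "https://example.com/1"},
        {"url": "https://example.com/2"},
    ],
}
summary = summarize_results(sample_response)
```

Reading every field with `.get` keeps the helper tolerant of responses where optional parts, such as the direct answer, are absent.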