prebuilt.company_researcher.utils

Functions

deduplicate_sources(→ list[dict])

Takes either a single search response or a list of responses from the Tavily API and de-duplicates the results by URL.

format_all_notes(→ str)

Format a list of notes into a string.

format_sources(→ str)

Takes a list of unique results from the Tavily API and formats them.

Module Contents

prebuilt.company_researcher.utils.deduplicate_sources(search_response: dict | list[dict]) → list[dict]

Takes either a single search response or a list of responses from the Tavily API and de-duplicates the results by URL.

Parameters:

search_response – Either:
  • A dict with a ‘results’ key containing a list of search results

  • A list of dicts, each containing search results

Returns:

List of search results de-duplicated by URL

Return type:

list[dict]
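
The behaviour described above can be illustrated with a minimal sketch. This is not the module's actual implementation: dedupe_by_url is a hypothetical stand-in, and it assumes that every response dict carries its results under a ‘results’ key, that every result has a ‘url’ key, and that the first result seen for a URL is the one kept. It returns a list of result dicts, matching the list[dict] return annotation in the signature.

```python
def dedupe_by_url(search_response):
    """Hypothetical sketch of URL-based de-duplication (not the library code)."""
    # Normalise both accepted input shapes into one flat list of results.
    if isinstance(search_response, dict):
        results = list(search_response["results"])
    elif isinstance(search_response, list):
        results = []
        for response in search_response:
            # Assumption: each list item is a response dict with a 'results' key.
            results.extend(response["results"])
    else:
        raise TypeError("expected a dict or a list of dicts")

    # Dicts preserve insertion order, so the first result per URL wins
    # and the original ordering of the survivors is kept.
    unique = {}
    for result in results:
        unique.setdefault(result["url"], result)
    return list(unique.values())


responses = [
    {"results": [{"url": "https://a.example", "title": "A"}]},
    {"results": [{"url": "https://a.example", "title": "A again"},
                 {"url": "https://b.example", "title": "B"}]},
]
unique_results = dedupe_by_url(responses)
```

Keying on the URL alone means two results that differ only in title or content collapse into one entry, which is the trade-off implied by "de-duplicates them based on the URL".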

prebuilt.company_researcher.utils.format_all_notes(completed_notes: list[str]) → str

Format a list of notes into a string.

prebuilt.company_researcher.utils.format_sources(sources_list: list[dict], include_raw_content: bool = True, max_tokens_per_source: int = 1000) → str

Takes a list of unique results from the Tavily API and formats them into a single string. The raw_content of each result is limited to approximately max_tokens_per_source tokens; include_raw_content controls whether raw_content appears in the formatted string at all.

Parameters:
  • sources_list – list of unique results from Tavily API

  • max_tokens_per_source – int, maximum number of tokens from each search result to include in the formatted string

  • include_raw_content – bool, whether to include the raw_content from Tavily in the formatted string

Returns:

Formatted string with deduplicated sources

Return type:

str
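
A rough sketch of how such a formatter could work is shown below. It is not the module's actual code: format_sources_sketch is a hypothetical name, the output layout is invented for illustration, the result keys (‘title’, ‘url’, ‘content’, ‘raw_content’) are assumed, and the token limit is approximated as 4 characters per token, which is a common heuristic rather than anything this page specifies.

```python
def format_sources_sketch(sources_list, include_raw_content=True,
                          max_tokens_per_source=1000):
    """Hypothetical sketch of source formatting (not the library code)."""
    # Assumption: about 4 characters per token, so the limit is only approximate.
    char_limit = max_tokens_per_source * 4

    lines = ["Sources:"]
    for source in sources_list:
        lines.append(f"Source: {source.get('title', '')}")
        lines.append(f"URL: {source.get('url', '')}")
        lines.append(f"Summary: {source.get('content', '')}")
        if include_raw_content:
            # raw_content may be missing or None; treat both as empty text.
            raw = source.get("raw_content") or ""
            if len(raw) > char_limit:
                raw = raw[:char_limit] + "... [truncated]"
            lines.append(f"Raw content: {raw}")
        lines.append("")  # blank line between sources
    return "\n".join(lines).strip()


example = [{"title": "Acme", "url": "https://acme.example",
            "content": "Short summary.", "raw_content": "x" * 50}]
formatted = format_sources_sketch(example, max_tokens_per_source=5)
```

Truncating by characters avoids a tokenizer dependency at the cost of precision, which is consistent with the "approximately" in the description above.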