haive.core.engine.document.agents

Document Engine Agent Implementation.

This module provides agents that connect the DocumentEngine to the Haive agent framework.

The agents load documents from several kinds of sources:

- Local files and directories
- Web pages and URLs
- Cloud storage
- Text input

The agents can be integrated into complex workflows and support both synchronous and asynchronous operation modes.
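As a rough illustration of the two operation modes, the sketch below wraps a blocking loader so that it can also be awaited. The names `load_documents`, `aload_documents`, and `load_many` are hypothetical stand-ins and not part of the haive API; the agents' real entry points may differ.

```python
import asyncio


def load_documents(source):
    """Blocking stand-in loader (hypothetical, not the haive API)."""
    return [{"source": source, "content": f"contents of {source}"}]


async def aload_documents(source):
    """Run the blocking loader in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(load_documents, source)


async def load_many(sources):
    """Load several sources concurrently; gather() keeps the input order."""
    results = await asyncio.gather(*(aload_documents(s) for s in sources))
    return [doc for docs in results for doc in docs]


# Synchronous use: call the loader directly.
sync_docs = load_documents("a.txt")

# Asynchronous use: drive the coroutine from an event loop.
async_docs = asyncio.run(load_many(["a.txt", "b.txt"]))
```

Wrapping the synchronous path rather than duplicating it keeps a single loading implementation, at the cost of one worker thread per in-flight call.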

Classes

DirectoryDocumentAgent

Specialized document agent for loading documents from directories.

DocumentAgent

Document Agent that integrates the document engine with the agent framework.

FileDocumentAgent

Specialized document agent for loading documents from files.

WebDocumentAgent

Specialized document agent for loading documents from web URLs.

Module Contents

class haive.core.engine.document.agents.DirectoryDocumentAgent(/, **data)

Bases: DocumentAgent

Specialized document agent for loading documents from directories.

This agent is pre-configured for loading from local directories and provides additional directory-specific options.

Parameters:

data (Any)

name

Name of the agent

directory_path

Path to the directory to load

recursive

Whether to recursively load files

include_patterns

List of file patterns to include

exclude_patterns

List of file patterns to exclude

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

setup_agent()

Set up the agent with a directory document engine.

Return type:

None
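To make the `recursive`, `include_patterns`, and `exclude_patterns` options more concrete, here is a minimal sketch of how such filters could be applied. The helper `select_files` is hypothetical, and its semantics (glob-style matching on file names, excludes applied after includes) are assumptions rather than the engine's documented behaviour.

```python
import tempfile
from fnmatch import fnmatch
from pathlib import Path


def select_files(directory, recursive=True, include_patterns=None, exclude_patterns=None):
    """Return the files under `directory` that pass the include/exclude filters.

    Hypothetical helper: matching rules here are assumed, not taken from haive.
    """
    root = Path(directory)
    candidates = root.rglob("*") if recursive else root.glob("*")
    selected = []
    for path in candidates:
        if not path.is_file():
            continue
        name = path.name
        # An empty or missing include list means "include everything".
        if include_patterns and not any(fnmatch(name, p) for p in include_patterns):
            continue
        # Excludes are checked last, so they win over includes.
        if exclude_patterns and any(fnmatch(name, p) for p in exclude_patterns):
            continue
        selected.append(path)
    return sorted(selected)


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.md").write_text("alpha")
    (root / "b.txt").write_text("beta")
    (root / "sub" / "c.md").write_text("gamma")
    (root / "sub" / "draft_d.md").write_text("delta")

    recursive_names = [p.name for p in select_files(root, True, ["*.md"], ["draft_*"])]
    flat_names = [p.name for p in select_files(root, False, ["*.md"])]
```

With `recursive=False` only the top-level directory is scanned, so files in `sub` are never considered.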

class haive.core.engine.document.agents.DocumentAgent(/, **data)

Bases: haive.agents.base.agent.Agent

Document Agent that integrates the document engine with the agent framework.

This agent provides a simple interface for loading and processing documents from various sources through the agent framework. It can be used as a standalone agent or as part of a more complex agent workflow.

The agent supports loading from:

- Local files and directories
- Web pages and URLs
- Text input
- Cloud storage (with proper credentials)

Parameters:

data (Any)

name

Name of the agent

engine

The document engine to use

include_content

Whether to include document content in the output

include_metadata

Whether to include document metadata in the output

max_documents

Maximum number of documents to load (None for unlimited)

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

build_graph()

Build the document agent graph.

Creates a simple linear graph that loads and processes documents from the input source.

Returns:

A BaseGraph instance for document processing

Return type:

haive.core.graph.state_graph.base_graph2.BaseGraph
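The idea of a simple linear graph can be sketched without the framework: each node receives the current state and returns an update, and the nodes run in a fixed order. The runner and node functions below are hypothetical stand-ins and do not use the real BaseGraph API.

```python
def run_linear_graph(nodes, state):
    """Run nodes in order, merging each node's update into the state."""
    for _name, node in nodes:
        state = {**state, **node(state)}
    return state


def load_node(state):
    # Stand-in for the document engine: pretend to load one document.
    return {"documents": [f"text from {state['source']}"]}


def process_node(state):
    # Stand-in for output processing: summarise what was loaded.
    return {"count": len(state["documents"])}


final_state = run_linear_graph(
    [("load", load_node), ("process", process_node)],
    {"source": "notes.txt"},
)
```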

process_output(output)

Process the output from the document engine.

This method filters and formats the output based on the agent’s configuration.

Parameters:

output (haive.core.engine.document.config.DocumentOutput) – The raw output from the document engine

Returns:

A dictionary with processed document data

Return type:

dict[str, Any]
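The filtering described above can be approximated as follows. This is a sketch under stated assumptions: documents are modelled as plain dictionaries with hypothetical `content` and `metadata` keys, and the shape of the returned dictionary is invented for illustration. The real DocumentOutput fields may differ.

```python
from typing import Any, Optional


def process_output_sketch(
    documents: list[dict[str, Any]],
    include_content: bool = True,
    include_metadata: bool = True,
    max_documents: Optional[int] = None,
) -> dict[str, Any]:
    """Trim and filter loaded documents according to the agent-style options."""
    # None means "no limit", mirroring the max_documents attribute.
    kept = documents if max_documents is None else documents[:max_documents]
    processed = []
    for doc in kept:
        item: dict[str, Any] = {}
        if include_content:
            item["content"] = doc.get("content")
        if include_metadata:
            item["metadata"] = doc.get("metadata")
        processed.append(item)
    return {"documents": processed, "count": len(processed)}


raw = [
    {"content": "first", "metadata": {"page": 1}},
    {"content": "second", "metadata": {"page": 2}},
    {"content": "third", "metadata": {"page": 3}},
]
result = process_output_sketch(raw, include_metadata=False, max_documents=2)
```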

setup_agent()

Set up the agent by configuring the document engine.

This method is called during agent initialization to set up the engine with the agent’s configuration parameters.

Return type:

None
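The relationship between the base `setup_agent()` and the specialized agents can be pictured as a setup hook that runs at construction time and is overridden by subclasses. The sketch below uses dataclasses and a plain dictionary in place of the engine. The class names are hypothetical, and the real agents are pydantic models whose initialization mechanics are not shown here.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SketchDocumentAgent:
    """Stand-in base agent: copies its options into an engine config dict."""

    name: str = "document_agent"
    max_documents: Optional[int] = None
    engine: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The hook runs once, right after the fields are populated.
        self.setup_agent()

    def setup_agent(self) -> None:
        self.engine["max_documents"] = self.max_documents


@dataclass
class SketchFileAgent(SketchDocumentAgent):
    """Stand-in specialization: adds file-specific engine settings."""

    file_path: str = ""

    def setup_agent(self) -> None:
        super().setup_agent()
        self.engine["source_type"] = "file"
        self.engine["path"] = self.file_path


agent = SketchFileAgent(name="notes", max_documents=5, file_path="notes.txt")
```

Calling `super().setup_agent()` first lets the subclass extend the shared configuration instead of replacing it.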

class haive.core.engine.document.agents.FileDocumentAgent(/, **data)

Bases: DocumentAgent

Specialized document agent for loading documents from files.

This agent is pre-configured for loading from local files and provides additional file-specific options.

Parameters:

data (Any)

name

Name of the agent

file_path

Path to the file to load

chunking_strategy

Strategy for chunking documents

chunk_size

Size of chunks in characters

chunk_overlap

Overlap between chunks

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

setup_agent()

Set up the agent with a file document engine.

Return type:

None
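To show how `chunk_size` and `chunk_overlap` interact, here is a minimal fixed-size character chunker. It is a sketch of one possible chunking strategy, assuming the overlap is also measured in characters. The function `chunk_text` is hypothetical and is not the engine's implementation.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


even_chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)
ragged_chunks = chunk_text("abcdefghijk", chunk_size=4, chunk_overlap=1)
```

Each chunk starts `chunk_size - chunk_overlap` characters after the previous one, which is why the overlap has to be strictly smaller than the chunk size.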

class haive.core.engine.document.agents.WebDocumentAgent(/, **data)

Bases: DocumentAgent

Specialized document agent for loading documents from web URLs.

This agent is pre-configured for loading from web sources and provides additional web-specific options.

Parameters:

data (Any)

name

Name of the agent

url

URL to load

chunking_strategy

Strategy for chunking documents

chunk_size

Size of chunks in characters

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

setup_agent()

Set up the agent with a web document engine.

Return type:

None