haive.core.engine.document.loaders.base_new

Base classes for document loaders.

This module provides the foundation for all document loaders in the system: base source classes, pattern matching, and loader strategies. It is kept under 300 lines, per the code style guidelines.

Classes

BaseSource

Abstract base class for all document sources.

CloudSource

Base class for cloud storage sources.

DatabaseSource

Base class for database sources.

DirectorySource

Base class for directory sources.

LoaderQuality

Loader quality classification.

LoaderSpeed

Loader speed classification.

LoaderStrategy

Information about a specific loader strategy.

LocalSource

Base class for local file sources.

RemoteSource

Base class for remote sources with credential support.

SourcePattern

Pattern specification for source matching.

Functions

create_simple_loader(source_class, loader_class_name)

Helper to create a simple loader instance.

Module Contents

class haive.core.engine.document.loaders.base_new.BaseSource(/, **data)

Bases: pydantic.BaseModel, abc.ABC

Abstract base class for all document sources.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

class Config

Configuration for source classes.

abstractmethod create_loader(strategy=None, **kwargs)

Create a document loader instance.

Parameters:
  • strategy (str | None) – Name of strategy to use

  • **kwargs – Additional loader arguments

Returns:

Configured document loader

Return type:

langchain_core.document_loaders.BaseLoader
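
The contract above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: SketchSource, TextFileSource, FakeLoader, the path field, and the "default" strategy name are all hypothetical. The real BaseSource is a pydantic model and the real return value is a langchain_core BaseLoader, neither of which is reproduced here.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class FakeLoader:
    """Stand-in for a document loader (hypothetical, not a LangChain class)."""

    def __init__(self, path: str, **kwargs: Any) -> None:
        self.path = path
        self.kwargs = kwargs


class SketchSource(ABC):
    """Stand-in for an abstract source: subclasses must supply create_loader."""

    @abstractmethod
    def create_loader(self, strategy: Optional[str] = None, **kwargs: Any) -> FakeLoader:
        """Return a configured loader for this source."""


class TextFileSource(SketchSource):
    """Concrete source for a single text file (hypothetical)."""

    def __init__(self, path: str) -> None:
        self.path = path

    def create_loader(self, strategy: Optional[str] = None, **kwargs: Any) -> FakeLoader:
        # Fall back to a hypothetical "default" strategy when none is named.
        return FakeLoader(self.path, strategy=strategy or "default", **kwargs)


loader = TextFileSource("notes.txt").create_loader(encoding="utf-8")
```

Because create_loader is abstract, instantiating SketchSource directly raises TypeError; only subclasses that implement the method can be constructed.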

get_best_strategy(preference=LoaderPreference.BALANCED)

Get best strategy based on preference.

Parameters:

preference (haive.core.engine.document.config.LoaderPreference)

Return type:

LoaderStrategy | None
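
One way such a preference-based selection could work is sketched below. Everything in it is an assumption made for illustration: the Preference members other than BALANCED, the Strategy fields, and the numeric scoring are hypothetical and are not taken from the library.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Preference(str, Enum):
    """Hypothetical stand-in for LoaderPreference."""

    SPEED = "speed"
    QUALITY = "quality"
    BALANCED = "balanced"


@dataclass(frozen=True)
class Strategy:
    """Hypothetical stand-in for LoaderStrategy."""

    name: str
    speed: int    # hypothetical score, higher is faster
    quality: int  # hypothetical score, higher is better


def pick_best(
    strategies: Dict[str, Strategy],
    preference: Preference = Preference.BALANCED,
) -> Optional[Strategy]:
    """Return the highest-scoring strategy, or None when none are registered."""
    if not strategies:
        return None
    if preference is Preference.SPEED:
        return max(strategies.values(), key=lambda s: s.speed)
    if preference is Preference.QUALITY:
        return max(strategies.values(), key=lambda s: s.quality)
    # BALANCED: weigh both scores equally.
    return max(strategies.values(), key=lambda s: s.speed + s.quality)


strategies = {
    "fast": Strategy("fast", speed=5, quality=1),
    "thorough": Strategy("thorough", speed=1, quality=5),
    "mixed": Strategy("mixed", speed=4, quality=4),
}
```

Returning None for an empty registry mirrors the documented return type, LoaderStrategy | None.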

classmethod get_loader_strategies()

Get available loader strategies.

Return type:

dict[str, LoaderStrategy]

classmethod get_patterns()

Get all patterns for this source.

Return type:

list[SourcePattern]

class haive.core.engine.document.loaders.base_new.CloudSource(/, **data)

Bases: RemoteSource

Base class for cloud storage sources.

Parameters:

data (Any)

class haive.core.engine.document.loaders.base_new.DatabaseSource(/, **data)

Bases: BaseSource, haive.core.common.mixins.secure_config.SecureConfigMixin

Base class for database sources.

Parameters:

data (Any)

class haive.core.engine.document.loaders.base_new.DirectorySource(/, **data)

Bases: LocalSource

Base class for directory sources.

Parameters:

data (Any)

class haive.core.engine.document.loaders.base_new.LoaderQuality

Bases: str, enum.Enum

Loader quality classification.

class haive.core.engine.document.loaders.base_new.LoaderSpeed

Bases: str, enum.Enum

Loader speed classification.
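
Both LoaderQuality and LoaderSpeed mix str into enum.Enum. The sketch below shows what that base combination gives in plain Python. SketchSpeed and its member names and values are hypothetical; the actual members of the library's enums are not listed on this page.

```python
import json
from enum import Enum


class SketchSpeed(str, Enum):
    """Hypothetical str-backed enum; member names and values are made up."""

    FAST = "fast"
    MEDIUM = "medium"
    SLOW = "slow"


# Members compare equal to their plain string values.
is_equal = SketchSpeed.FAST == "fast"

# Lookup by value returns the matching member.
member = SketchSpeed("slow")

# json.dumps serializes a str-mixin member as its underlying string.
encoded = json.dumps({"speed": SketchSpeed.MEDIUM})
```

This is the usual reason for the str mixin: values read from configuration or JSON can be compared against members directly and written back out without a custom encoder.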

class haive.core.engine.document.loaders.base_new.LoaderStrategy

Information about a specific loader strategy.

class haive.core.engine.document.loaders.base_new.LocalSource(/, **data)

Bases: BaseSource

Base class for local file sources.

Parameters:

data (Any)

validate_file_exists()

Check if file exists.

Return type:

bool
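
A check of this kind can be written with pathlib, as in the sketch below. The helper name file_exists and the decision to treat directories as "not a file" are assumptions; the page does not say which field holds the path or how the library handles directories.

```python
import tempfile
from pathlib import Path


def file_exists(path: str) -> bool:
    """Return True only when path points at an existing regular file."""
    return Path(path).expanduser().is_file()


with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "doc.txt"
    present.write_text("hello", encoding="utf-8")

    found = file_exists(str(present))                      # existing file
    missing = file_exists(str(Path(tmp) / "absent.txt"))   # no such file
    directory = file_exists(tmp)                           # a directory, not a file
```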

class haive.core.engine.document.loaders.base_new.RemoteSource(/, **data)

Bases: BaseSource, haive.core.common.mixins.secure_config.SecureConfigMixin

Base class for remote sources with credential support.

Parameters:

data (Any)

requires_authentication()

Check if this source requires authentication.

Return type:

bool

class haive.core.engine.document.loaders.base_new.SourcePattern

Pattern specification for source matching.
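
The fields of SourcePattern are not listed on this page, so the following is only a sketch of how pattern-based source matching might look. SketchPattern, its regex and priority fields, and best_match are hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SketchPattern:
    """Hypothetical pattern: a regular expression plus a tie-breaking priority."""

    regex: str
    priority: int = 0

    def matches(self, source: str) -> bool:
        # re.search finds the pattern anywhere in the source string.
        return re.search(self.regex, source) is not None


def best_match(patterns: List[SketchPattern], source: str) -> Optional[SketchPattern]:
    """Return the highest-priority pattern that matches, or None."""
    hits = [p for p in patterns if p.matches(source)]
    return max(hits, key=lambda p: p.priority, default=None)


patterns = [
    SketchPattern(r"\.pdf$", priority=10),
    SketchPattern(r"^https?://", priority=5),
]
```

With these two patterns, a URL ending in .pdf matches both, and the higher priority decides which one wins.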

haive.core.engine.document.loaders.base_new.create_simple_loader(source_class, loader_class_name, module='langchain_community.document_loaders', **loader_kwargs)

Helper to create a simple loader instance.

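Parameters:
  • source_class – Source class the loader is created for

  • loader_class_name – Name of the loader class to use

  • module – Module that provides the loader class (default: 'langchain_community.document_loaders')

  • **loader_kwargs – Additional loader arguments

Return type: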

langchain_core.document_loaders.BaseLoader
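
The signature suggests the loader class is resolved by name from a module. A minimal sketch of that technique with importlib follows. The helper build_from_name is hypothetical, it ignores the source_class argument, and it is demonstrated with a standard-library class rather than a LangChain loader so that it runs without extra dependencies.

```python
import importlib
from typing import Any


def build_from_name(class_name: str, module: str, **kwargs: Any) -> Any:
    """Import module, look up class_name in it, and instantiate it with kwargs."""
    mod = importlib.import_module(module)
    try:
        cls = getattr(mod, class_name)
    except AttributeError as exc:
        # Surface a missing class as an import problem with a clear message.
        raise ImportError(f"{module} has no attribute {class_name!r}") from exc
    return cls(**kwargs)


# Demonstrated with fractions.Fraction so the sketch runs anywhere.
frac = build_from_name("Fraction", "fractions", numerator=3, denominator=6)
```

Resolving the class at call time keeps optional dependencies out of module import: the target module is only imported when a loader is actually requested.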