haive.core.engine.document.loaders.source_base

Base classes for document sources.

This module provides base classes for the different types of document sources. A source describes where documents live and what kind they are; a loader performs the actual loading.

Classes

BaseSource

Abstract base class for all document sources.

CloudSource

Base class for cloud storage sources.

DatabaseSource

Base class for database sources.

DirectorySource

Source for a directory of files.

LocalSource

Base class for local file sources.

RemoteSource

Base class for remote sources with credential support.

Module Contents

class haive.core.engine.document.loaders.source_base.BaseSource(/, **data)

Bases: pydantic.BaseModel, abc.ABC

Abstract base class for all document sources.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

abstractmethod get_loader_kwargs()

Get kwargs to pass to the loader.

Return type:

dict[str, Any]

abstractmethod validate_source()

Validate that the source is accessible/valid.

Return type:

bool
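The contract above (two abstract methods that every concrete source must implement) can be sketched without importing haive or pydantic. In the stand-in below, `SourceSketch`, `InMemorySource`, and the `text` field are hypothetical names used only for illustration; the real `BaseSource` is a pydantic model and may impose requirements this sketch does not show.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class SourceSketch(ABC):
    """Stand-in for BaseSource: same two abstract methods, no pydantic."""

    @abstractmethod
    def get_loader_kwargs(self) -> dict[str, Any]:
        """Return the keyword arguments a loader should receive."""

    @abstractmethod
    def validate_source(self) -> bool:
        """Return True if the source looks usable."""


class InMemorySource(SourceSketch):
    """Hypothetical source that wraps a string held in memory."""

    def __init__(self, text: str) -> None:
        self.text = text

    def get_loader_kwargs(self) -> dict[str, Any]:
        # The "text" key is an assumption made for this sketch.
        return {"text": self.text}

    def validate_source(self) -> bool:
        # An empty string is treated as an invalid source here.
        return bool(self.text)


source = InMemorySource("hello")
if source.validate_source():
    loader_kwargs = source.get_loader_kwargs()
```

Because both methods are abstract, instantiating `SourceSketch` directly raises `TypeError`; only subclasses that implement both methods can be created.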

class haive.core.engine.document.loaders.source_base.CloudSource(/, **data)

Bases: RemoteSource

Base class for cloud storage sources.

Parameters:

data (Any)

get_loader_kwargs()

Get kwargs for cloud storage loaders.

Return type:

dict[str, Any]

class haive.core.engine.document.loaders.source_base.DatabaseSource(/, **data)

Bases: BaseSource, haive.core.common.mixins.secure_config.SecureConfigMixin

Base class for database sources.

Parameters:

data (Any)

get_loader_kwargs()

Get kwargs for database loaders.

Return type:

dict[str, Any]

validate_source()

Perform basic validation of the connection string.

Return type:

bool
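The documentation does not say what "basic validation" covers. One plausible reading is a purely syntactic check that the string has a URL-like shape, sketched below with the standard library. The function name `looks_like_connection_string` and the exact rules are assumptions, not the library's implementation.

```python
from urllib.parse import urlparse


def looks_like_connection_string(value: str) -> bool:
    """Syntactic check only: no connection to the database is attempted."""
    if "://" not in value:
        return False
    parsed = urlparse(value)
    # Require a scheme (e.g. "postgresql") plus either a host part or a path
    # (file-based databases such as SQLite have a path but no host).
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


checks = {
    "postgresql://user:pw@localhost:5432/db": None,
    "sqlite:///data.db": None,
    "not a url": None,
}
for candidate in checks:
    checks[candidate] = looks_like_connection_string(candidate)
```

A check like this catches obvious typos early but cannot confirm that the credentials or the host are correct.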

class haive.core.engine.document.loaders.source_base.DirectorySource(/, **data)

Bases: LocalSource

Source for a directory of files.

Parameters:

data (Any)

get_loader_kwargs()

Get kwargs for directory loaders.

Return type:

dict[str, Any]

validate_source()

Check whether the directory exists.

Return type:

bool

class haive.core.engine.document.loaders.source_base.LocalSource(/, **data)

Bases: BaseSource

Base class for local file sources.

Parameters:

data (Any)

get_loader_kwargs()

Get kwargs for local file loaders.

Return type:

dict[str, Any]

validate_source()

Check whether the file exists.

Return type:

bool
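The relationship between `LocalSource` (file check) and `DirectorySource` (directory check, listed as a subclass of `LocalSource`) might look roughly like the stand-in below. The class names, the `path` attribute, and the keys returned by `get_loader_kwargs` are all hypothetical; the page does not list the real field names.

```python
from __future__ import annotations

import tempfile
from pathlib import Path
from typing import Any


class LocalSourceSketch:
    """Stand-in for LocalSource; `path` is a hypothetical field name."""

    def __init__(self, path: str | Path) -> None:
        self.path = Path(path)

    def validate_source(self) -> bool:
        # Valid only if the path points at an existing regular file.
        return self.path.is_file()

    def get_loader_kwargs(self) -> dict[str, Any]:
        # The "file_path" key is an assumption, not the documented output.
        return {"file_path": str(self.path)}


class DirectorySourceSketch(LocalSourceSketch):
    """Stand-in for DirectorySource: overrides the check for directories."""

    def validate_source(self) -> bool:
        return self.path.is_dir()

    def get_loader_kwargs(self) -> dict[str, Any]:
        return {"path": str(self.path)}


with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "note.txt"
    note.write_text("example")

    file_ok = LocalSourceSketch(note).validate_source()
    dir_ok = DirectorySourceSketch(tmp).validate_source()
    # A directory path does not pass the file check.
    dir_as_file = LocalSourceSketch(tmp).validate_source()
```

Overriding `validate_source` in the subclass keeps the two checks distinct: a directory path fails the file check, and a file path fails the directory check.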

class haive.core.engine.document.loaders.source_base.RemoteSource(/, **data)

Bases: BaseSource, haive.core.common.mixins.secure_config.SecureConfigMixin

Base class for remote sources with credential support.

Parameters:

data (Any)

get_loader_kwargs()

Get kwargs for remote loaders.

Return type:

dict[str, Any]

validate_source()

Validate the URL format.

Return type:

bool
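A URL format check of this kind is typically syntactic and makes no network request. The sketch below shows one way to write it with `urllib.parse`; the function name `has_valid_url_format` and the default list of allowed schemes are assumptions, and the real method may accept other schemes or apply different rules.

```python
from urllib.parse import urlparse


def has_valid_url_format(url: str, allowed_schemes=("http", "https")) -> bool:
    """Return True if the URL has an allowed scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in allowed_schemes and bool(parsed.netloc)


good = has_valid_url_format("https://example.com/docs")
missing_scheme = has_valid_url_format("example.com/docs")
wrong_scheme = has_valid_url_format("ftp://example.com/file.txt")
```

Passing this check says nothing about whether the remote resource exists or whether the credentials are accepted; that can only be learned when a loader actually fetches it.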