haive.core.engine.document.loaders.auto_registry
Auto-Registry System for Document Loaders.
This module provides automatic registration and discovery of all document loader sources and loaders. It scans the sources directory, then imports and registers every available source type without manual intervention.
The auto-registry makes all 230+ implemented loaders available when the system starts, with no manual registration step.
Examples
Auto-register all sources:
from haive.core.engine.document.loaders import auto_register_all
# Automatically discover and register all sources
auto_register_all()
Check registration status:
from haive.core.engine.document.loaders import get_registration_status
status = get_registration_status()
print(f"Registered {status['total_sources']} sources")
Author: Claude (Haive Document Loader System)
Version: 1.0.0
Classes
AutoRegistry | Automatic registry for document loader sources.
RegistrationInfo | Information about a registered source.
RegistrationStats | Statistics about the registration process.
Functions
auto_register_all() | Convenience function to auto-register all sources.
get_registration_status() | Get current registration status.
get_sources_by_category(category) | Get sources for a specific category.
List all available source types. |
Module Contents
- class haive.core.engine.document.loaders.auto_registry.AutoRegistry(registry=None)
Automatic registry for document loader sources.
The AutoRegistry scans the sources directory and automatically discovers, imports, and registers all available source types. This eliminates the need for manual registration and ensures all implemented loaders are available.
- Features:
Automatic module discovery and import
Source class detection and validation
Duplicate registration prevention
Error handling and reporting
Registration statistics and monitoring
Dependency tracking
Examples
Basic auto-registration:
from haive.core.engine.document.loaders.auto_registry import AutoRegistry
registry = AutoRegistry()
stats = registry.register_all_sources()
print(f"Registered {stats.total_sources_registered} sources")
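Restricted to a single category (the call returns the number of sources registered):
from haive.core.engine.document.loaders.sources.source_types import SourceCategory
registry = AutoRegistry()
count = registry.register_sources_by_category(SourceCategory.LOCAL_FILE)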
Initialize the AutoRegistry.
- Parameters:
registry – Optional custom registry instance
- discover_source_modules()
Discover all source modules in the sources directory.
Examples
Find all source modules:
registry = AutoRegistry()
modules = registry.discover_source_modules()
print(f"Found {len(modules)} source modules")
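The page does not show how discovery is implemented. As a minimal sketch of the general technique, assuming discovery means listing a package's submodules without importing them, the standard-library `pkgutil` module is enough. The helper name `discover_modules` is hypothetical, and the `json` package stands in for the sources directory.

```python
import importlib
import pkgutil


def discover_modules(package_name: str) -> list[str]:
    """Return the dotted names of the direct submodules of a package."""
    package = importlib.import_module(package_name)
    # Only packages have __path__; a plain module has no submodules to scan.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    prefix = package.__name__ + "."
    # iter_modules lists submodules without importing them.
    return sorted(info.name for info in pkgutil.iter_modules(search_path, prefix))


# The standard-library "json" package stands in for the sources directory.
modules = discover_modules("json")
print(f"Found {len(modules)} modules: {modules}")
```

Listing names first and importing later keeps a single broken module from aborting the whole scan.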
- find_source_classes(module)
Find all source classes in a module.
- Parameters:
module (Any) – Imported module to scan
- Returns:
List of (class_name, class_type) tuples
- Return type:
list[tuple[str, type[haive.core.engine.document.loaders.sources.source_types.BaseSource]]]
Examples
Find sources in module:
registry = AutoRegistry()
module = registry.import_source_module("...")
classes = registry.find_source_classes(module)
print(f"Found {len(classes)} source classes")
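One plausible way to produce the `(class_name, class_type)` tuples described above is to filter a module's members with `inspect`. This is a sketch under stated assumptions, not the library's code: `BaseSource` here is a local stand-in for the real base class, the throwaway `demo_sources` module is built in memory, and the rule that skips classes imported from other modules is an assumption.

```python
import inspect
import types


class BaseSource:
    """Hypothetical stand-in for the real BaseSource base class."""


def find_source_classes(module: types.ModuleType) -> list[tuple[str, type[BaseSource]]]:
    """Return (class_name, class) pairs for BaseSource subclasses defined in module."""
    found = []
    for name, obj in inspect.getmembers(module, inspect.isclass):
        # Keep strict subclasses only: skip the base class and unrelated classes.
        if not issubclass(obj, BaseSource) or obj is BaseSource:
            continue
        # Assumed rule: ignore classes that were merely imported into this module.
        if obj.__module__ == module.__name__:
            found.append((name, obj))
    return found


# Build a throwaway module in memory so the sketch runs without haive installed.
demo = types.ModuleType("demo_sources")
demo.BaseSource = BaseSource
exec("class PDFSource(BaseSource): pass\nclass Helper: pass\n", demo.__dict__)

classes = find_source_classes(demo)
print(f"Found {len(classes)} source classes")
```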
- get_registration_status()
Get current registration status and statistics.
Examples
Check registration status:
registry = AutoRegistry()
status = registry.get_registration_status()
print(f"Total sources: {status['total_sources']}")
print(f"Categories: {status['categories_count']}")
print(f"Recent registrations: {status['recent_registrations']}")
- get_source_info(source_name)
Get detailed information about a registered source.
- Parameters:
source_name (str) – Name of the source to get info for
- Returns:
RegistrationInfo or None if not found
- Return type:
RegistrationInfo | None
Examples
Get source details:
registry = AutoRegistry()
info = registry.get_source_info("pdf")
if info:
    print(f"Module: {info.module_name}")
    print(f"Loaders: {info.loaders}")
- import_source_module(module_name)
Import a source module safely.
- Parameters:
module_name (str) – Full module name to import
- Returns:
Imported module or None if import failed
- Return type:
Any | None
Examples
Import specific module:
registry = AutoRegistry()
module = registry.import_source_module(
    "haive.core.engine.document.loaders.sources.file_sources"
)
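The "module or None" contract described above can be illustrated with `importlib.import_module` wrapped in a broad `try`/`except`. This is a minimal sketch of that pattern, not the library's implementation. The helper name `import_module_safely` and the choice to log a warning on failure are assumptions.

```python
import importlib
import logging
from types import ModuleType
from typing import Optional

logger = logging.getLogger(__name__)


def import_module_safely(module_name: str) -> Optional[ModuleType]:
    """Import module_name, returning None instead of raising on failure."""
    try:
        return importlib.import_module(module_name)
    except Exception as exc:  # a broken module can raise anything at import time
        logger.warning("Could not import %s: %s", module_name, exc)
        return None


ok = import_module_safely("json")
missing = import_module_safely("no_such_module_for_demo")
print(ok is not None, missing is None)
```

Catching `Exception` rather than only `ImportError` matters here because a module with a missing optional dependency or a bug at import time can raise other error types.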
- list_sources_by_category()
List all registered sources grouped by category.
- Returns:
Dictionary mapping categories to source lists
- Return type:
dict[haive.core.engine.document.loaders.sources.source_types.SourceCategory, list[str]]
Examples
List sources by category:
registry = AutoRegistry()
by_category = registry.list_sources_by_category()
for category, sources in by_category.items():
    print(f"{category.value}: {', '.join(sources)}")
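The category-to-names mapping returned above can be pictured as a simple grouping step. The sketch below assumes the registry already knows each source's category. The `SourceCategory` enum is a local stand-in: only `LOCAL_FILE` appears on this page, while the `WEB` member and both string values are invented for illustration.

```python
from collections import defaultdict
from enum import Enum


class SourceCategory(Enum):
    """Hypothetical stand-in; only LOCAL_FILE is named in the documentation."""

    LOCAL_FILE = "local_file"  # assumed value
    WEB = "web"  # assumed member, for illustration only


def group_by_category(registered: dict[str, SourceCategory]) -> dict[SourceCategory, list[str]]:
    """Group source names under their category."""
    grouped: dict[SourceCategory, list[str]] = defaultdict(list)
    for source_name, category in registered.items():
        grouped[category].append(source_name)
    # Sort names so the listing is stable regardless of registration order.
    return {category: sorted(names) for category, names in grouped.items()}


by_category = group_by_category(
    {"pdf": SourceCategory.LOCAL_FILE, "csv": SourceCategory.LOCAL_FILE, "url": SourceCategory.WEB}
)
for category, sources in by_category.items():
    print(f"{category.value}: {', '.join(sources)}")
```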
- register_all_sources()
Register all discovered sources automatically.
- Returns:
RegistrationStats with detailed information about the process
- Return type:
RegistrationStats
Examples
Auto-register everything:
registry = AutoRegistry()
stats = registry.register_all_sources()
print(f"Scanned: {stats.total_modules_scanned} modules")
print(f"Found: {stats.total_sources_found} sources")
print(f"Registered: {stats.total_sources_registered} sources")
print(f"Errors: {len(stats.registration_errors)}")
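To show how the counters printed above could relate to one another, here is a rough sketch of a scan loop that keeps going after a failed import and records the error instead. It is not the library's implementation. `StatsSketch` mirrors the documented field names except `categories_covered`, which is omitted. The rule that every public class counts as a "source" is a simplification for the demo, and `json.decoder` stands in for a source module.

```python
import importlib
import time
from dataclasses import dataclass, field


@dataclass
class StatsSketch:
    """Illustrative counterpart of RegistrationStats (categories_covered omitted)."""

    total_modules_scanned: int = 0
    total_sources_found: int = 0
    total_sources_registered: int = 0
    registration_errors: list[str] = field(default_factory=list)
    registration_time: float = 0.0


def register_all(module_names: list[str], registry: dict[str, type]) -> StatsSketch:
    stats = StatsSketch()
    started = time.perf_counter()
    for module_name in module_names:
        stats.total_modules_scanned += 1
        try:
            module = importlib.import_module(module_name)
        except Exception as exc:
            # Record the failure and continue so one bad module cannot block the rest.
            stats.registration_errors.append(f"{module_name}: {exc}")
            continue
        # Demo assumption: every public class in the module counts as a "source".
        for name, obj in list(vars(module).items()):
            if isinstance(obj, type) and not name.startswith("_"):
                stats.total_sources_found += 1
                if name not in registry:  # skip duplicates
                    registry[name] = obj
                    stats.total_sources_registered += 1
    stats.registration_time = time.perf_counter() - started
    return stats


registry: dict[str, type] = {}
stats = register_all(["json.decoder", "no_such_module_for_demo"], registry)
print(f"Scanned: {stats.total_modules_scanned}, errors: {len(stats.registration_errors)}")
```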
- register_module_sources(module_name)
Register all sources from a specific module.
- Parameters:
module_name (str) – Module name to process
- Returns:
Number of sources registered from this module
- Return type:
int
Examples
Register all sources from file_sources module:
registry = AutoRegistry()
count = registry.register_module_sources(
    "haive.core.engine.document.loaders.sources.file_sources"
)
print(f"Registered {count} sources")
- register_source_class(source_name, source_class, module_name)
Register a single source class.
- Parameters:
source_name (str) – Name to register the source under
source_class (type[haive.core.engine.document.loaders.sources.source_types.BaseSource]) – Source class to register
module_name (str) – Module where the source is defined
- Returns:
True if registration was successful
- Return type:
bool
Examples
Register single source:
registry = AutoRegistry()
success = registry.register_source_class(
    "pdf", PDFSource, "file_sources"
)
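The feature list for AutoRegistry mentions duplicate registration prevention, and this method reports success as a boolean. A minimal sketch of that contract, assuming duplicates are detected by name and rejected rather than overwritten, might look like the following. `SimpleRegistry`, the local `BaseSource`, and `PDFSource` are all stand-ins defined here for illustration.

```python
class BaseSource:
    """Hypothetical stand-in for the real base class."""


class PDFSource(BaseSource):
    """Hypothetical source class used only in this demo."""


class SimpleRegistry:
    """Toy registry that refuses duplicate names and non-source classes."""

    def __init__(self) -> None:
        self._sources: dict[str, tuple[type, str]] = {}

    def register(self, source_name: str, source_class: type, module_name: str) -> bool:
        # Assumed policy: the first registration wins, later ones are rejected.
        if source_name in self._sources:
            return False
        # Reject anything that is not a BaseSource subclass.
        if not (isinstance(source_class, type) and issubclass(source_class, BaseSource)):
            return False
        self._sources[source_name] = (source_class, module_name)
        return True


registry = SimpleRegistry()
first = registry.register("pdf", PDFSource, "file_sources")
second = registry.register("pdf", PDFSource, "file_sources")
print(first, second)
```

Returning `False` instead of raising lets a bulk scan count failures without stopping.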
- register_sources_by_category(category)
Register sources from a specific category only.
- Parameters:
category (haive.core.engine.document.loaders.sources.source_types.SourceCategory) – SourceCategory to register
- Returns:
Number of sources registered
- Return type:
int
Examples
Register only file sources:
registry = AutoRegistry()
count = registry.register_sources_by_category(SourceCategory.LOCAL_FILE)
print(f"Registered {count} file sources")
- validate_all_registrations()
Validate all registered sources.
Examples
Validate registrations:
registry = AutoRegistry()
report = registry.validate_all_registrations()
print(f"Valid: {report['valid_count']}")
print(f"Invalid: {report['invalid_count']}")
- validate_source_class(source_class)
Validate that a source class is properly configured.
- Parameters:
source_class (type[haive.core.engine.document.loaders.sources.source_types.BaseSource]) – Source class to validate
- Returns:
True if source class is valid
- Return type:
bool
Examples
Validate source class:
registry = AutoRegistry()
valid = registry.validate_source_class(PDFSource)
print(f"Source valid: {valid}")
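The page does not say what "properly configured" means. As one possible reading, the sketch below treats a class as valid when it is a concrete (non-abstract) subclass of the base class, and then rolls per-class results into a report with the `valid_count` and `invalid_count` keys used in the validate_all_registrations example. The abstract `load` method, the `invalid_sources` key, and all class names here are assumptions.

```python
import inspect
from abc import ABC, abstractmethod


class BaseSource(ABC):
    """Hypothetical stand-in for the real base class."""

    @abstractmethod
    def load(self) -> list[str]:  # assumed method name, for illustration only
        ...


class PDFSource(BaseSource):
    def load(self) -> list[str]:
        return ["page 1"]


class BrokenSource(BaseSource):
    """Does not implement load(), so it remains abstract."""


def validate_source_class(source_class: type) -> bool:
    """Assumed rule: valid means a concrete subclass of BaseSource."""
    return (
        inspect.isclass(source_class)
        and issubclass(source_class, BaseSource)
        and not inspect.isabstract(source_class)
    )


def validate_all(registered: dict[str, type]) -> dict[str, object]:
    invalid = sorted(name for name, cls in registered.items() if not validate_source_class(cls))
    return {
        "valid_count": len(registered) - len(invalid),
        "invalid_count": len(invalid),
        "invalid_sources": invalid,  # assumed extra key, not documented
    }


report = validate_all({"pdf": PDFSource, "broken": BrokenSource})
print(f"Valid: {report['valid_count']}, invalid: {report['invalid_count']}")
```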
- class haive.core.engine.document.loaders.auto_registry.RegistrationInfo
Information about a registered source.
- source_name
Name of the source type
- source_class
The source class
- module_name
Module where source is defined
- category
Source category
- loaders
Available loaders for this source
- registration_time
When the source was registered
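A record with these six attributes maps naturally onto a dataclass. The sketch below is only an illustration of that shape: whether the real class is a dataclass, and every field type shown, are assumptions (in particular, `category` is a plain string here, where the real field presumably holds a SourceCategory).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class RegistrationInfoSketch:
    """Illustrative record; field types are assumptions, names follow the docs."""

    source_name: str
    source_class: type
    module_name: str
    category: str  # stand-in for a SourceCategory value
    loaders: list[str] = field(default_factory=list)
    # Timestamp is taken when the record is created.
    registration_time: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class PDFSource:
    """Hypothetical source class used only in this demo."""


info = RegistrationInfoSketch(
    source_name="pdf",
    source_class=PDFSource,
    module_name="file_sources",
    category="local_file",
)
print(f"Module: {info.module_name}")
print(f"Loaders: {info.loaders}")
```

Using `default_factory` gives each record its own `loaders` list and its own timestamp rather than sharing one mutable default.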
- class haive.core.engine.document.loaders.auto_registry.RegistrationStats
Statistics about the registration process.
- total_modules_scanned
Number of modules scanned
- total_sources_found
Number of source classes found
- total_sources_registered
Number of sources successfully registered
- registration_errors
List of errors encountered
- registration_time
Total time taken for registration
- categories_covered
Number of categories with registered sources
- haive.core.engine.document.loaders.auto_registry.auto_register_all()
Convenience function to auto-register all sources.
- Returns:
RegistrationStats with detailed information
- Return type:
RegistrationStats
Examples
Auto-register everything:
from haive.core.engine.document.loaders import auto_register_all
stats = auto_register_all()
print(f"Registered {stats.total_sources_registered} sources")
- haive.core.engine.document.loaders.auto_registry.get_registration_status()
Get current registration status.
Examples
Check status:
from haive.core.engine.document.loaders import get_registration_status
status = get_registration_status()
print(f"Total sources: {status['total_sources']}")
- haive.core.engine.document.loaders.auto_registry.get_sources_by_category(category)
Get sources for a specific category.
- Parameters:
category (haive.core.engine.document.loaders.sources.source_types.SourceCategory) – SourceCategory to filter by
- Returns:
List of source names in the category
- Return type:
list[str]
Examples
Get file sources:
from haive.core.engine.document.loaders import get_sources_by_category
from haive.core.engine.document.loaders.sources.source_types import SourceCategory
file_sources = get_sources_by_category(SourceCategory.LOCAL_FILE)
print(f"File sources: {file_sources}")