haive.agents.document_modifiers.base.state
==========================================

.. py:module:: haive.agents.document_modifiers.base.state

.. autoapi-nested-parse::

   Base state schema for document modification agents.

   This module defines the DocumentModifierState class, which serves as the
   foundation for all document processing agents in the haive framework.


Classes
-------

.. autoapisummary::

   haive.agents.document_modifiers.base.state.DocumentModifierState


Module Contents
---------------

.. py:class:: DocumentModifierState(/, **data)

   Bases: :py:obj:`haive.core.schema.StateSchema`

   Base state schema for document modification agents.

   This class provides the core state management for all document processing
   operations. It holds the document collection, exposes computed properties
   for common operations, and validates the collection to ensure data
   integrity.

   The state maintains a list of documents and provides utilities for:

   - Accessing combined document text
   - Counting documents
   - Adding/removing documents
   - Validating document collections

   .. attribute:: name

      Optional identifier for this document modifier instance.

   .. attribute:: description

      Optional description of the modifier's purpose.

   .. attribute:: documents

      List of Document objects to be processed.

   Properties:

   - ``documents_text``: Combined text content of all documents.
   - ``num_documents``: Total count of documents in the collection.

   .. rubric:: Example

   Creating and using document state::

      >>> from langchain_core.documents import Document
      >>> docs = [Document(page_content="Hello"), Document(page_content="World")]
      >>> state = DocumentModifierState.from_documents(docs)
      >>> state.documents_text
      'Hello\nWorld'
      >>> print(state.num_documents)
      2

   Adding documents dynamically::

      >>> new_doc = Document(page_content="New content")
      >>> state.documents.append(new_doc)
      >>> print(state.num_documents)
      3

   :raises ValueError: If no documents are provided (empty list).

   .. note::

      The state automatically validates that at least one document is
      present to prevent processing empty collections.

   Create a new model by parsing and validating input data from keyword
   arguments.

   Raises :py:exc:`pydantic_core.ValidationError` if the input data cannot be
   validated to form a valid model.

   ``self`` is explicitly positional-only to allow ``self`` as a field name.

   .. py:method:: add_document(document)
      :classmethod:

      Add a single document to the state.

      .. note::

         This method is implemented as a class method, and that
         implementation has known issues. Consider using instance methods
         instead for document manipulation.

      :param document: Document to add to the collection.

      :returns: New state instance with the document added.

   .. py:method:: add_documents(documents)
      :classmethod:

      Add multiple documents to the state.

      .. note::

         This method is implemented as a class method, and that
         implementation has known issues. Consider using instance methods
         instead for document manipulation.

      :param documents: List of documents to add.

      :returns: New state instance with documents added.

   .. py:method:: from_documents(documents)
      :classmethod:

      Create a DocumentModifierState from a list of documents.

      This is a convenience factory method for creating state instances when
      you already have a collection of documents.

      :param documents: List of Document objects to initialize the state with.

      :returns: New DocumentModifierState instance containing the provided
                documents.

      :raises ValueError: If the documents list is empty.

      .. rubric:: Example

      >>> docs = [Document(page_content="Content 1"), Document(page_content="Content 2")]
      >>> state = DocumentModifierState.from_documents(docs)
      >>> print(state.num_documents)
      2

   .. py:method:: remove_document(document)
      :classmethod:

      Remove a specific document from the state.

      .. note::

         This method is implemented as a class method, and that
         implementation has known issues. Consider using instance methods
         instead for document manipulation.

      :param document: Document to remove from the collection.

      :returns: New state instance with the document removed.

   .. py:method:: remove_documents(documents)
      :classmethod:

      Remove multiple documents from the state.

      .. note::

         This method is implemented as a class method, and that
         implementation has known issues. Consider using instance methods
         instead for document manipulation.

      :param documents: List of documents to remove.

      :returns: New state instance with documents removed.

   .. py:method:: validate_documents()

      Validate that at least one document is present.

      This validator runs after model initialization to ensure the state
      contains at least one document for processing.

      :returns: Self if validation passes.

      :raises ValueError: If the documents list is empty.

   .. py:method:: validate_documents_field(v)
      :classmethod:

      Validate the documents field during assignment.

      :param v: The documents list being validated.

      :returns: The validated documents list.

      .. note::

         This validator ensures type safety but allows empty lists during
         field assignment. The model validator handles the non-empty
         requirement.

   .. py:property:: documents_text
      :type: str

      Get the combined text content of all documents.

      This property concatenates the ``page_content`` of all documents in the
      collection, separated by newlines. It is useful for operations that
      need to process all document text at once.

      :returns: String containing all document texts joined by newlines.

      .. rubric:: Example

      >>> state.documents = [Document(page_content="First"), Document(page_content="Second")]
      >>> state.documents_text
      'First\nSecond'

   .. py:property:: num_documents
      :type: int

      Get the total number of documents in the collection.

      :returns: Integer count of documents currently in the state.

      .. rubric:: Example

      >>> print(f"Processing {state.num_documents} documents")
      Processing 5 documents
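
The behavior documented above (non-empty validation, newline-joined text, a document count, and the suggestion to prefer instance methods for adding or removing documents) can be approximated with a small standard-library sketch. This is not the haive implementation: ``SketchDocumentState``, the stand-in ``Document`` dataclass, and the ``with_document`` / ``without_document`` methods are hypothetical names introduced only for illustration, and the sketch uses ``dataclasses`` rather than the Pydantic-based ``StateSchema``.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass(frozen=True)
class Document:
    """Hypothetical stand-in for langchain_core.documents.Document."""

    page_content: str


@dataclass
class SketchDocumentState:
    """Illustrative stand-in; NOT the real DocumentModifierState."""

    documents: List[Document] = field(default_factory=list)
    name: Optional[str] = None
    description: Optional[str] = None

    def __post_init__(self) -> None:
        # Mirrors the documented rule: an empty collection is rejected.
        if not self.documents:
            raise ValueError("At least one document is required.")

    @classmethod
    def from_documents(cls, documents: List[Document]) -> "SketchDocumentState":
        # Factory: copy the list so the caller's list is not shared.
        return cls(documents=list(documents))

    @property
    def documents_text(self) -> str:
        # Join every page_content with a newline separator.
        return "\n".join(doc.page_content for doc in self.documents)

    @property
    def num_documents(self) -> int:
        return len(self.documents)

    def with_document(self, document: Document) -> "SketchDocumentState":
        # Instance method (assumed design): returns a new state, leaving
        # the current one untouched.
        return replace(self, documents=[*self.documents, document])

    def without_document(self, document: Document) -> "SketchDocumentState":
        # Removing the last remaining document raises ValueError, because
        # replace() re-runs __post_init__ on the new instance.
        remaining = [doc for doc in self.documents if doc != document]
        return replace(self, documents=remaining)


state = SketchDocumentState.from_documents(
    [Document("Hello"), Document("World")]
)
bigger = state.with_document(Document("New content"))
```

Returning a new instance from ``with_document`` keeps the non-empty check in a single place, since every new state passes through ``__post_init__``. Whether the real class should behave this way is a design choice left to the haive maintainers.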