haive.agents.document_modifiers.base.state
==========================================

.. py:module:: haive.agents.document_modifiers.base.state

.. autoapi-nested-parse::

   Base state schema for document modification agents.

   This module defines the DocumentModifierState class which serves as the
   foundation for all document processing agents in the haive framework.


Classes
-------

.. autoapisummary::

   haive.agents.document_modifiers.base.state.DocumentModifierState


Module Contents
---------------

.. py:class:: DocumentModifierState(/, **data)

   Bases: :py:obj:`haive.core.schema.StateSchema`


   Base state schema for document modification agents.

   This class provides the core state management for all document processing
   operations. It handles document collections, provides computed properties
   for common operations, and includes validation to ensure data integrity.

   The state maintains a list of documents and provides utilities for:

   - Accessing combined document text
   - Counting documents
   - Adding/removing documents
   - Validating document collections

   .. attribute:: name

      Optional identifier for this document modifier instance.

   .. attribute:: description

      Optional description of the modifier's purpose.

   .. attribute:: documents

      List of Document objects to be processed.

   Properties:
       documents_text: Combined text content of all documents.
       num_documents: Total count of documents in the collection.

   .. rubric:: Example

   Creating and using document state::

       >>> from langchain_core.documents import Document
       >>> from haive.agents.document_modifiers.base.state import DocumentModifierState
       >>> docs = [Document(page_content="Hello"), Document(page_content="World")]
       >>> state = DocumentModifierState.from_documents(docs)
       >>> state.documents_text
       'Hello\nWorld'
       >>> print(state.num_documents)
       2

   Adding documents dynamically::

       >>> new_doc = Document(page_content="New content")
       >>> state.documents.append(new_doc)
       >>> print(state.num_documents)
       3

   :raises ValueError: If no documents are provided (empty list).

   .. note::

      The state automatically validates that at least one document
      is present to prevent processing empty collections.

   Create a new model by parsing and validating input data from keyword arguments.

   Raises :py:exc:`pydantic_core.ValidationError` if the input data cannot be
   validated to form a valid model.

   ``self`` is explicitly positional-only to allow ``self`` as a field name.


   .. py:method:: add_document(document)
      :classmethod:


      Add a single document to the state.

      Note: Declaring this as a class method is problematic, because a class
      method receives the class rather than an existing state instance.
      Consider using instance-level operations for document manipulation instead.

      :param document: Document to add to the collection.

      :returns: New state instance with the document added.


   .. py:method:: add_documents(documents)
      :classmethod:


      Add multiple documents to the state.

      Note: Declaring this as a class method is problematic, because a class
      method receives the class rather than an existing state instance.
      Consider using instance-level operations for document manipulation instead.

      :param documents: List of documents to add.

      :returns: New state instance with documents added.


   .. py:method:: from_documents(documents)
      :classmethod:


      Create a DocumentModifierState from a list of documents.

      This is a convenience factory method for creating state instances
      when you already have a collection of documents.

      :param documents: List of Document objects to initialize the state with.

      :returns: New DocumentModifierState instance containing the provided documents.

      :raises ValueError: If the documents list is empty.

      .. rubric:: Example

      >>> docs = [Document(page_content="Content 1"), Document(page_content="Content 2")]
      >>> state = DocumentModifierState.from_documents(docs)
      >>> print(state.num_documents)
      2


   .. py:method:: remove_document(document)
      :classmethod:


      Remove a specific document from the state.

      Note: Declaring this as a class method is problematic, because a class
      method receives the class rather than an existing state instance.
      Consider using instance-level operations for document manipulation instead.

      :param document: Document to remove from the collection.

      :returns: New state instance with the document removed.


   .. py:method:: remove_documents(documents)
      :classmethod:


      Remove multiple documents from the state.

      Note: Declaring this as a class method is problematic, because a class
      method receives the class rather than an existing state instance.
      Consider using instance-level operations for document manipulation instead.

      :param documents: List of documents to remove.

      :returns: New state instance with documents removed.


   .. py:method:: validate_documents()

      Validate that at least one document is present.

      This validator runs after model initialization to ensure
      the state contains at least one document for processing.

      :returns: Self if validation passes.

      :raises ValueError: If documents list is empty.


   .. py:method:: validate_documents_field(v)
      :classmethod:


      Validate the documents field during assignment.

      :param v: The documents list being validated.

      :returns: The validated documents list.

      .. note::

         This validator ensures type safety but allows empty lists
         during field assignment. The model validator handles the
         non-empty requirement.


   .. py:property:: documents_text
      :type: str


      Get the combined text content of all documents.

      This property concatenates the page_content of all documents
      in the collection, separated by newlines. Useful for operations
      that need to process all document text at once.

      :returns: String containing all document texts joined by newlines.

      .. rubric:: Example

      >>> state.documents = [Document(page_content="First"), Document(page_content="Second")]
      >>> state.documents_text
      'First\nSecond'


   .. py:property:: num_documents
      :type: int


      Get the total number of documents in the collection.

      :returns: Integer count of documents currently in the state.

      .. rubric:: Example

      >>> print(f"Processing {state.num_documents} documents")
      Processing 5 documents
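
The behaviour documented above (newline-joined text, a document count, and rejection of empty collections) can be sketched with the standard library alone. This is a minimal stand-in under stated assumptions, not the haive implementation: ``SketchDocument``, ``SketchState`` and ``with_document`` are hypothetical names, a dataclass takes the place of the Pydantic-based ``StateSchema``, and the real class may differ in detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchDocument:
    """Hypothetical stand-in for a document with text content."""

    page_content: str


@dataclass
class SketchState:
    """Hypothetical stand-in for a state holding at least one document."""

    documents: list[SketchDocument]

    def __post_init__(self) -> None:
        # Mirrors the documented rule: an empty collection is rejected.
        if not self.documents:
            raise ValueError("At least one document is required.")

    @classmethod
    def from_documents(cls, documents: list[SketchDocument]) -> "SketchState":
        # Factory: copy the list so later changes to the caller's list
        # do not affect the state.
        return cls(documents=list(documents))

    @property
    def documents_text(self) -> str:
        # Join every document's text with a newline separator.
        return "\n".join(doc.page_content for doc in self.documents)

    @property
    def num_documents(self) -> int:
        return len(self.documents)

    def with_document(self, document: SketchDocument) -> "SketchState":
        # Instance-level update: keep the existing documents, add one.
        return SketchState(documents=[*self.documents, document])


state = SketchState.from_documents(
    [SketchDocument("Hello"), SketchDocument("World")]
)
combined = state.documents_text  # "Hello\nWorld"
grown = state.with_document(SketchDocument("New content"))

try:
    SketchState.from_documents([])
    empty_rejected = False
except ValueError:
    empty_rejected = True
```

Returning a new state from ``with_document`` is one possible design. Appending to ``state.documents`` in place, as the class-level example does, is another.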