haive.agents.document_modifiers.tnt.agent
=========================================

.. py:module:: haive.agents.document_modifiers.tnt.agent

.. autoapi-nested-parse::

   Taxonomy generation agent implementation.

   This module implements an agent that generates taxonomies from conversation
   histories through an iterative process of document summarization, clustering,
   and refinement. It uses LLM-based processing at each step to produce
   high-quality taxonomies.

   The agent follows these main steps:

   1. Document summarization
   2. Minibatch creation
   3. Initial taxonomy generation
   4. Iterative taxonomy refinement
   5. Final taxonomy review

   .. rubric:: Examples

   Basic usage of the taxonomy agent::

       config = TaxonomyAgentConfig(
           state_schema=TaxonomyGenerationState,
           visualize=True,
           name="TaxonomyAgent"
       )
       agent = TaxonomyAgent(config)
       result = agent.run(input_data={"documents": [...]})

Classes
-------

.. autoapisummary::

   haive.agents.document_modifiers.tnt.agent.TaxonomyAgent
   haive.agents.document_modifiers.tnt.agent.TaxonomyAgentConfig

Module Contents
---------------

.. py:class:: TaxonomyAgent(config)

   Bases: :py:obj:`haive.core.engine.agent.agent.Agent`\ [\ :py:obj:`TaxonomyAgentConfig`\ ]

   Agent that generates a taxonomy from a conversation history.

   Initialize the taxonomy agent.

   .. py:method:: generate_taxonomy(state, config)

      Generates an initial taxonomy from the first document minibatch.

      :param state: The current state of the taxonomy process.
      :type state: TaxonomyGenerationState
      :param config: Configuration for the taxonomy generation.
      :type config: RunnableConfig
      :returns: Updated state with the initial taxonomy.
      :rtype: TaxonomyGenerationState

   .. py:method:: get_content(state)

      Extracts document content for processing.

   .. py:method:: get_minibatches(state, config)

      Splits documents into minibatches for iterative taxonomy generation.

      :param state: The current state containing documents.
      :type state: TaxonomyGenerationState
      :param config: Configuration object specifying batch size.
      :type config: RunnableConfig
      :returns: Dictionary with a ``minibatches`` key containing grouped document indices.
      :rtype: dict

   .. py:method:: invoke_taxonomy_chain(chain_config, state, config, mb_indices)

      Invokes the taxonomy LLM chain to generate or refine taxonomies.

      :param chain_config: LLM pipeline for taxonomy generation.
      :type chain_config: Runnable
      :param state: Current taxonomy state.
      :type state: TaxonomyGenerationState
      :param config: Configurable parameters.
      :type config: RunnableConfig
      :param mb_indices: Indices of documents to process in this iteration.
      :type mb_indices: List[int]
      :returns: Updated state with new taxonomy clusters.
      :rtype: TaxonomyGenerationState

   .. py:method:: reduce_summaries(combined)

      Reduces summarized documents into a structured format.

   .. py:method:: review_taxonomy(state, config)

      Evaluates the final taxonomy after all updates.

      :param state: The current state with completed taxonomies.
      :type state: TaxonomyGenerationState
      :param config: Configuration settings.
      :type config: RunnableConfig
      :returns: Updated state with the reviewed taxonomy.
      :rtype: TaxonomyGenerationState

   .. py:method:: setup_workflow()

      Sets up the taxonomy generation workflow in LangGraph.

   .. py:method:: update_taxonomy(state, config)

      Iteratively refines the taxonomy using new minibatches of data.

      :param state: The current state containing previous taxonomies.
      :type state: TaxonomyGenerationState
      :param config: Configuration settings.
      :type config: RunnableConfig
      :returns: Updated state with revised taxonomy clusters.
      :rtype: TaxonomyGenerationState

.. py:class:: TaxonomyAgentConfig

   Bases: :py:obj:`haive.core.engine.agent.agent.AgentConfig`

   Agent configuration for generating a taxonomy from conversation history.
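The minibatching step (``get_minibatches``) is documented to return a dictionary whose ``minibatches`` key holds grouped document indices. As a rough illustration only — the helper name and the consecutive fixed-size batching strategy below are assumptions, not the haive implementation — that grouping can be sketched as:

```python
def make_minibatches(num_documents: int, batch_size: int) -> dict:
    """Hypothetical sketch: group consecutive document indices into
    fixed-size minibatches (the real get_minibatches reads documents
    and batch size from the state and RunnableConfig)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    indices = list(range(num_documents))
    # Slice the index list into chunks of at most batch_size elements.
    minibatches = [
        indices[start:start + batch_size]
        for start in range(0, num_documents, batch_size)
    ]
    return {"minibatches": minibatches}


print(make_minibatches(num_documents=7, batch_size=3)["minibatches"])
# [[0, 1, 2], [3, 4, 5], [6]]
```

Each minibatch of indices would then be fed to the taxonomy chain in turn: the first batch seeds the initial taxonomy and later batches drive the iterative refinement step.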