haive.core.engine.vectorstore.VectorStoreConfig¶
- class haive.core.engine.vectorstore.VectorStoreConfig(*, id=<factory>, name=<factory>, engine_type=EngineType.VECTOR_STORE, description=None, input_schema=None, output_schema=None, version='1.0.0', metadata=<factory>, embedding_model=<factory>, vector_store_provider=VectorStoreProvider.FAISS, documents=<factory>, vector_store_path='vector_store', docstore_path='docstore', k=4, score_threshold=None, search_type='similarity', vector_store_kwargs=<factory>)[source]¶
Configuration model for a vector store engine.
VectorStoreConfig provides a consistent interface for creating and using vector stores with embeddings. It encapsulates all the configuration needed to create and interact with various vector store backends, abstracting away provider-specific implementation details.
This class enables:
1. Creating vector stores with various providers (FAISS, Chroma, Pinecone, etc.)
2. Managing documents and embeddings for vector storage
3. Performing similarity searches with configurable parameters
4. Creating retrievers that can be used in retrieval chains
- Parameters:
id (str)
name (str)
engine_type (EngineType)
description (str | None)
input_schema (type[BaseModel] | None)
output_schema (type[BaseModel] | None)
version (str)
embedding_model (BaseEmbeddingConfig)
vector_store_provider (VectorStoreProvider)
documents (list[Document])
vector_store_path (str)
docstore_path (str)
k (int)
score_threshold (float | None)
search_type (str)
- engine_type¶
The type of engine (always VECTOR_STORE).
- Type:
EngineType
- embedding_model¶
Configuration for the embedding model.
- Type:
BaseEmbeddingConfig
- vector_store_provider¶
The vector store provider to use.
- Type:
VectorStoreProvider
- documents¶
Documents to store in the vector store.
- Type:
List[Document]
Examples
>>> from haive.core.engine.vectorstore import VectorStoreConfig, VectorStoreProvider
>>> from haive.core.models.embeddings.base import HuggingFaceEmbeddingConfig
>>> from langchain_core.documents import Document
>>>
>>> # Create configuration
>>> config = VectorStoreConfig(
...     name="product_search",
...     documents=[Document(page_content="iPhone 13: The latest smartphone from Apple")],
...     vector_store_provider=VectorStoreProvider.FAISS,
...     embedding_model=HuggingFaceEmbeddingConfig(
...         model="sentence-transformers/all-MiniLM-L6-v2"
...     ),
...     k=5
... )
>>>
>>> # Create vector store
>>> vectorstore = config.create_vectorstore()
>>>
>>> # Perform similarity search
>>> results = config.similarity_search("smartphone", k=3)
>>>
>>> # Create a retriever
>>> retriever = config.create_retriever(search_type="mmr")
- classmethod create_vs_config_from_documents(documents, embedding_model=None, **kwargs)[source]¶
Create a VectorStoreConfig from a list of documents.
- Parameters:
documents (list[Document]) – List of documents to include
embedding_model (BaseEmbeddingConfig | None) – Optional embedding model configuration
**kwargs – Additional parameters for the config
- Returns:
Configured VectorStoreConfig
- Return type:
VectorStoreConfig
- classmethod create_vs_from_documents(documents, embedding_model=None, **kwargs)[source]¶
Create a VectorStore from a list of documents.
- Parameters:
documents (list[Document]) – List of documents to include
embedding_model (BaseEmbeddingConfig | None) – Optional embedding model configuration
**kwargs – Additional parameters for the config
- Returns:
Instantiated VectorStore
- Return type:
VectorStore
- classmethod validate_engine_type(v)[source]¶
Validate the engine type.
- Parameters:
v – [TODO: Add description]
- Returns:
[TODO: Add return description]
- add_document(document)[source]¶
Add a single document to the vector store config.
- Parameters:
document (Document) – Document to add
- Return type:
None
- add_documents(documents)[source]¶
Add multiple documents to the vector store config.
- Parameters:
documents (list[Document]) – List of documents to add
- Return type:
None
- create_retriever(search_type=None, search_kwargs=None, **kwargs)[source]¶
Create a retriever from the vector store.
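The class example passes search_type="mmr" to this method. As a rough illustration of what maximal marginal relevance (MMR) selection means in general, here is a minimal pure-Python sketch. It is not haive's or any provider's implementation; the function names, the `lambda_mult` parameter, and the toy vectors are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, k=4, lambda_mult=0.5):
    """Greedily pick up to k document indices, trading relevance against redundancy.

    lambda_mult=1.0 ranks by relevance only; lower values favor diversity.
    """
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document (0.0 if none yet).
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: document 1 is a near-duplicate of document 0.
docs = [[1.0, 0.0], [0.99, 0.14], [0.6, 0.8]]
by_relevance = mmr([1.0, 0.0], docs, k=2, lambda_mult=1.0)
diverse = mmr([1.0, 0.0], docs, k=2, lambda_mult=0.3)
```

With relevance only, the near-duplicate is chosen second; with a lower `lambda_mult`, the less similar third document is preferred instead.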
- create_runnable(runnable_config=None)[source]¶
Create a vector store instance with configuration applied.
- Parameters:
runnable_config (RunnableConfig | None) – Optional runtime configuration
- Returns:
Instantiated vector store
- Return type:
VectorStore
- create_vectorstore(async_mode=False)[source]¶
Create a vector store instance from this configuration.
Instantiates a vector store of the configured provider type, using the documents and embedding model specified in the configuration. This method handles the details of creating the appropriate vector store class, initializing it with the correct parameters, and populating it with documents.
The method supports both synchronous and asynchronous initialization paths, and includes special handling for empty document collections.
- Parameters:
async_mode (bool) – Whether to use async methods for vector store creation. Default is False. If True, the method will use asynchronous variants of the vector store creation methods if available.
- Returns:
An instantiated vector store of the configured provider type, populated with the configured documents and using the specified embedding model.
- Return type:
VectorStore
- Raises:
ValueError – If an empty vector store cannot be created with the specified provider.
Examples
>>> config = VectorStoreConfig(
...     name="product_catalog",
...     vector_store_provider=VectorStoreProvider.FAISS,
...     documents=[Document(page_content="Product description...")]
... )
>>> vectorstore = config.create_vectorstore()
>>>
>>> # With async mode
>>> async def create_async():
...     return await config.create_vectorstore(async_mode=True)
- get_output_fields()[source]¶
Return output field definitions as field_name -> (type, default) pairs.
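To make the field_name -> (type, default) shape concrete, the sketch below builds a small schema class from such a mapping using only the standard library. The field names and the `OutputSchema` class are hypothetical and chosen for illustration; the actual fields returned by this method are not listed here.

```python
from dataclasses import field, make_dataclass

# Hypothetical mapping in the field_name -> (type, default) shape.
# These names are illustrative, not the fields haive actually returns.
output_fields = {
    "documents": (list, None),
    "query": (str, ""),
}

# Turn each (type, default) pair into a dataclass field with that default.
OutputSchema = make_dataclass(
    "OutputSchema",
    [
        (name, tp, field(default=default))
        for name, (tp, default) in output_fields.items()
    ],
)

result = OutputSchema(query="smartphone")
```

Fields omitted at construction time fall back to the default from the mapping, so `result.documents` is `None` here.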
- get_vectorstore(embedding=None, async_mode=False)[source]¶
Get the vector store with optional embedding override.
- Parameters:
embedding – Optional embedding model override
async_mode (bool) – Whether to use async methods
- Returns:
Instantiated vector store
- Return type:
VectorStore
- similarity_search(query, k=None, score_threshold=None, filter=None, search_type=None, runnable_config=None)[source]¶
Perform similarity search with configurable parameters.
- Parameters:
query (str) – Query string
k (int | None) – Number of documents to retrieve (overrides default)
score_threshold (float | None) – Score threshold for filtering results
filter (dict[str, Any] | None) – Optional filter for the search
search_type (str | None) – Search type (similarity, mmr, etc.)
runnable_config (RunnableConfig | None) – Optional runtime configuration
- Returns:
List of retrieved documents
- Return type:
list[Document]
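The interaction of k and score_threshold can be pictured with a small stand-alone sketch: score every document, optionally drop those below the threshold, then keep the k best. This is only an illustration under the assumption that a higher score means a closer match (some providers return distances instead); the helper names and toy vectors are made up for this example and are not part of haive.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=4, score_threshold=None):
    """Return (index, score) pairs for the k highest-scoring documents.

    Assumes higher score = more similar. Documents scoring below
    score_threshold (when given) are discarded before truncating to k.
    """
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    if score_threshold is not None:
        scored = [(i, s) for i, s in scored if s >= score_threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


docs = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
hits = top_k([1.0, 0.0], docs, k=2)
filtered = top_k([1.0, 0.0], docs, k=3, score_threshold=0.5)
```

Note that with a threshold the result can contain fewer than k entries: `filtered` asks for three documents but the orthogonal third vector is dropped.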
- embedding_model: BaseEmbeddingConfig¶
- engine_type: EngineType¶
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- vector_store_provider: VectorStoreProvider¶