MilvusLite Vector Store
This example demonstrates how to use MilvusLiteStore for vector storage and semantic search in AgentScope. It includes four test scenarios covering CRUD operations, metadata filtering, document chunking, and distance metrics.
Quick Start
Install agentscope first, and then the MilvusLite dependency:
# On macOS/Linux (brackets escaped for shells such as zsh)
pip install pymilvus\[milvus_lite\]
# On Windows
pip install pymilvus[milvus_lite]
Run the example script, which demonstrates adding documents and searching with and without filters in a MilvusLite vector store:
python milvuslite_store.py
Note: The script creates .db files in the current directory. You can delete them after testing.
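The cleanup mentioned in the note can also be scripted. The following is a minimal sketch using only the standard library; `remove_db_file` is a hypothetical helper, not part of AgentScope, and it assumes the store is not in use when the file is removed.

```python
from pathlib import Path


def remove_db_file(uri: str) -> bool:
    """Delete the local database file at `uri`; return True if a file was removed."""
    path = Path(uri)
    if path.is_file():
        path.unlink()
        return True
    # Nothing to delete (the path is missing or is not a regular file).
    return False


# "./milvus_test.db" is the uri used in the Usage section of this README.
remove_db_file("./milvus_test.db")
```

Passing the path explicitly, rather than deleting every .db file in the directory, avoids removing unrelated databases.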
Usage
Initialize Store
from agentscope.rag import MilvusLiteStore
store = MilvusLiteStore(
    uri="./milvus_test.db",
    collection_name="test_collection",
    dimensions=768,  # Match your embedding model
    distance="COSINE",  # COSINE, L2, or IP
)
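Because `dimensions` must match the embedding model's output, a length check before adding documents can catch mismatches early. This is a minimal sketch; `check_dimensions` is a hypothetical helper, not part of AgentScope, and the vector is a placeholder rather than a real embedding.

```python
def check_dimensions(embedding: list[float], expected: int) -> None:
    """Raise ValueError if the embedding length differs from the store's dimensions."""
    if len(embedding) != expected:
        raise ValueError(
            f"Embedding has {len(embedding)} dimensions, expected {expected}"
        )


# A 768-dimensional placeholder vector passes the check silently.
check_dimensions([0.0] * 768, expected=768)
```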
Add Documents
from agentscope.rag import Document, DocMetadata
from agentscope.message import TextBlock
doc = Document(
    metadata=DocMetadata(
        content=TextBlock(type="text", text="Your document text"),
        doc_id="doc_1",
        chunk_id=0,
        total_chunks=1,
    ),
    embedding=[0.1, 0.2, ...],  # Your embedding vector
)
await store.add([doc])
Search
results = await store.search(
    query_embedding=[0.15, 0.25, ...],
    limit=5,
    score_threshold=0.9,  # Optional
    filter='doc_id like "prefix%"',  # Optional
)
Delete
await store.delete(filter_expr='doc_id == "doc_1"')
Distance Metrics
| Metric | Description | Best For |
|---|---|---|
| COSINE | Cosine similarity | Text embeddings (recommended) |
| L2 | Euclidean distance | Spatial data |
| IP | Inner Product | Recommendation systems |
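To make the difference between the three metrics concrete, the sketch below computes each one for a pair of small vectors using their textbook definitions. It is illustrative only: the function names are hypothetical, and the scores a store returns may be scaled or ordered differently from these raw values.

```python
import math


def inner_product(a: list[float], b: list[float]) -> float:
    """IP: sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: list[float], b: list[float]) -> float:
    """L2: straight-line (Euclidean) distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """COSINE: inner product of the vectors after normalizing their lengths."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


a = [1.0, 0.0]
b = [2.0, 0.0]  # Same direction as a, twice the length

print(cosine_similarity(a, b))  # 1.0
print(l2_distance(a, b))        # 1.0
print(inner_product(a, b))      # 2.0
```

The two vectors point in the same direction, so cosine similarity treats them as identical, while L2 and IP are both affected by the difference in magnitude.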
Filter Expressions
# Exact match
filter='doc_id == "doc_1"'
# Pattern matching
filter='doc_id like "prefix%"'
# Numeric and logical operators
filter='chunk_id >= 0 and total_chunks > 1'
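When filters are assembled from variables, small helpers can keep the quoting consistent. The sketch below builds expression strings in the same syntax as the examples above; `eq` and `all_of` are hypothetical helpers, not part of AgentScope or pymilvus, and they assume string values contain no double quotes.

```python
def eq(field: str, value: object) -> str:
    """Build an equality expression, wrapping string values in double quotes."""
    if isinstance(value, str):
        return f'{field} == "{value}"'
    return f"{field} == {value}"


def all_of(*exprs: str) -> str:
    """Join expressions with `and`, parenthesizing each one."""
    return " and ".join(f"({expr})" for expr in exprs)


expr = all_of(eq("doc_id", "doc_1"), "chunk_id >= 0")
print(expr)  # (doc_id == "doc_1") and (chunk_id >= 0)
```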
Advanced Usage
Access Underlying Client
client = store.get_client()
stats = client.get_collection_stats(collection_name="test_collection")
Document Metadata
content: Text content (TextBlock)
doc_id: Unique document identifier
chunk_id: Chunk position (0-indexed)
total_chunks: Total chunks in document
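The sketch below shows how these fields relate when one document is split into several chunks. It uses plain dictionaries and a fixed-size character split as stand-ins, so `chunk_metadata` and the `text` key are illustrative assumptions rather than AgentScope's own chunking logic.

```python
def chunk_metadata(doc_id: str, text: str, chunk_size: int) -> list[dict]:
    """Split text into fixed-size pieces and describe each one's position."""
    pieces = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return [
        {
            "doc_id": doc_id,              # Shared by every chunk of the document
            "chunk_id": idx,               # 0-indexed position of this chunk
            "total_chunks": len(pieces),   # Same value on every chunk
            "text": piece,
        }
        for idx, piece in enumerate(pieces)
    ]


for meta in chunk_metadata("doc_1", "abcdefghij", chunk_size=4):
    print(meta)
```

Each resulting dictionary carries the values that would go into the `doc_id`, `chunk_id`, and `total_chunks` fields of one chunk's metadata.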
FAQ
What embedding dimension should I use? Match your embedding model's output dimension (e.g., 768 for BERT, 1536 for OpenAI ada-002).
Can I change the distance metric after creation? No, create a new collection with the desired metric.
How do I delete the database? Delete the .db file specified in the uri parameter.
Is this suitable for production? MilvusLite works well for development and small-scale applications. For production at scale, consider Milvus standalone or cluster mode.