TablestoreVectorStore
Tablestore is a fully managed NoSQL cloud database service that enables storage of a massive amount of structured and semi-structured data.
This notebook shows how to use functionality related to the Tablestore
vector database.
To use Tablestore, you must first create an instance. See the official Tablestore instructions for creating an instance.
Setup
%pip install --upgrade --quiet langchain-community tablestore
Initialization
import getpass
import os
os.environ["end_point"] = getpass.getpass("Tablestore end_point:")
os.environ["instance_name"] = getpass.getpass("Tablestore instance_name:")
os.environ["access_key_id"] = getpass.getpass("Tablestore access_key_id:")
os.environ["access_key_secret"] = getpass.getpass("Tablestore access_key_secret:")
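Pressing Enter at one of the prompts above stores an empty string, which only surfaces later as a connection error. As a minimal sketch, the check below reports which of the four settings are unset or blank before the store is built. `missing_settings` and `REQUIRED_SETTINGS` are hypothetical names introduced here for illustration; they are not part of LangChain or the Tablestore SDK.

```python
import os
from typing import Mapping, Sequence

# Names of the environment variables set in the cell above.
REQUIRED_SETTINGS = ("end_point", "instance_name", "access_key_id", "access_key_secret")


def missing_settings(
    env: Mapping[str, str], names: Sequence[str] = REQUIRED_SETTINGS
) -> list[str]:
    """Return the names that are absent from env or contain only whitespace."""
    return [name for name in names if not env.get(name, "").strip()]


# Hypothetical usage: report problems early with a readable message.
missing = missing_settings(os.environ)
if missing:
    print("Missing Tablestore settings:", ", ".join(missing))
```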
Create vector store.
import tablestore
from langchain_community.embeddings import FakeEmbeddings
from langchain_community.vectorstores import TablestoreVectorStore
from langchain_core.documents import Document
# FakeEmbeddings returns random vectors, so search rankings in this notebook
# are not meaningful. Use a real embedding model in practice.
test_embedding_dimension_size = 4
embeddings = FakeEmbeddings(size=test_embedding_dimension_size)

store = TablestoreVectorStore(
    embedding=embeddings,
    endpoint=os.getenv("end_point"),
    instance_name=os.getenv("instance_name"),
    access_key_id=os.getenv("access_key_id"),
    access_key_secret=os.getenv("access_key_secret"),
    # Must match the dimension of the vectors produced by the embedding model.
    vector_dimension=test_embedding_dimension_size,
    # metadata_mappings declares the non-vector metadata fields that can be used in filters.
    metadata_mappings=[
        tablestore.FieldSchema(
            "type", tablestore.FieldType.KEYWORD, index=True, enable_sort_and_agg=True
        ),
        tablestore.FieldSchema(
            "time", tablestore.FieldType.LONG, index=True, enable_sort_and_agg=True
        ),
    ],
)
Manage vector store
Create table and index.
store.create_table_if_not_exist()
store.create_search_index_if_not_exist()
Add documents.
store.add_documents(
    [
        Document(
            id="1", page_content="1 hello world", metadata={"type": "pc", "time": 2000}
        ),
        Document(
            id="2", page_content="abc world", metadata={"type": "pc", "time": 2009}
        ),
        Document(
            id="3", page_content="3 text world", metadata={"type": "sky", "time": 2010}
        ),
        Document(
            id="4", page_content="hi world", metadata={"type": "sky", "time": 2030}
        ),
        Document(
            id="5", page_content="hi world", metadata={"type": "sky", "time": 2030}
        ),
    ]
)
['1', '2', '3', '4', '5']
Delete document.
store.delete(["3"])
True
Get documents by ID. IDs that no longer exist, such as the deleted "3", come back as None.
store.get_by_ids(["1", "3", "5"])
[Document(id='1', metadata={'embedding': '[1.3296732307905934, 0.0037521341868022385, 0.9821875819319514, 2.5644103644492393]', 'time': 2000, 'type': 'pc'}, page_content='1 hello world'),
None,
Document(id='5', metadata={'embedding': '[1.4558082172139821, -1.6441137122167426, -0.13113098640337423, -1.889685473174525]', 'time': 2030, 'type': 'sky'}, page_content='hi world')]
Query vector store
Similarity search.
store.similarity_search(query="hello world", k=2)
[Document(id='1', metadata={'embedding': [1.3296732307905934, 0.0037521341868022385, 0.9821875819319514, 2.5644103644492393], 'time': 2000, 'type': 'pc'}, page_content='1 hello world'),
Document(id='4', metadata={'embedding': [-0.3310144199800685, 0.29250046478723635, -0.0646862290377582, -0.23664360156781225], 'time': 2030, 'type': 'sky'}, page_content='hi world')]
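Conceptually, a similarity search embeds the query and returns the k stored vectors closest to it. The following is a minimal in-memory sketch of that idea using cosine similarity. It is not how Tablestore executes the search (the service uses its own index and metric), and `cosine_similarity`, `top_k_cosine`, and the sample vectors are made up for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_cosine(
    query: list[float], vectors: dict[str, list[float]], k: int
) -> list[str]:
    """Return the ids of the k vectors most similar to query, best first."""
    ranked = sorted(
        vectors,
        key=lambda doc_id: cosine_similarity(query, vectors[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Made-up 2-dimensional vectors, not real embeddings.
sample_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
nearest = top_k_cosine([1.0, 0.1], sample_vectors, k=2)
print(nearest)
```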
Similarity search with filters.
store.similarity_search(
    query="hello world",
    k=10,
    tablestore_filter_query=tablestore.BoolQuery(
        must_queries=[tablestore.TermQuery(field_name="type", column_value="sky")],
        should_queries=[tablestore.RangeQuery(field_name="time", range_from=2020)],
        must_not_queries=[tablestore.TermQuery(field_name="type", column_value="pc")],
    ),
)
[Document(id='5', metadata={'embedding': [1.4558082172139821, -1.6441137122167426, -0.13113098640337423, -1.889685473174525], 'time': 2030, 'type': 'sky'}, page_content='hi world'),
Document(id='4', metadata={'embedding': [-0.3310144199800685, 0.29250046478723635, -0.0646862290377582, -0.23664360156781225], 'time': 2030, 'type': 'sky'}, page_content='hi world')]
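The filter above combines three clause types. As a rough mental model, the sketch below reproduces the selection in plain Python over the metadata of the four documents left after the delete. It assumes Elasticsearch-style semantics: every `must_queries` clause has to match, no `must_not_queries` clause may match, and `should_queries` clauses are optional once a `must_queries` clause is present. That last point is an assumption; check the Tablestore documentation on `minimum_should_match` before relying on it. The `matches_filter` helper is hypothetical and runs locally without calling Tablestore.

```python
# Metadata of the documents left in the store after deleting id "3".
remaining = {
    "1": {"type": "pc", "time": 2000},
    "2": {"type": "pc", "time": 2009},
    "4": {"type": "sky", "time": 2030},
    "5": {"type": "sky", "time": 2030},
}


def matches_filter(metadata: dict) -> bool:
    """Local stand-in for the BoolQuery used in the example above."""
    must = metadata["type"] == "sky"  # TermQuery on type == "sky"
    must_not = metadata["type"] == "pc"  # TermQuery on type == "pc"
    # The RangeQuery on "time" is a should clause. Under the assumption stated
    # above it does not exclude documents, so it is not checked here.
    return must and not must_not


selected = sorted(doc_id for doc_id, meta in remaining.items() if matches_filter(meta))
print(selected)
```

The selected IDs agree with the two documents returned by the filtered search; the order in the real result depends on the similarity score, not on the ID.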
Usage for retrieval-augmented generation
For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:
API reference
For detailed documentation of all TablestoreVectorStore features and configurations, head to the API reference:
https://python.langchain.com/api_reference/community/vectorstores/langchain_community.vectorstores.tablestore.TablestoreVectorStore.html
Related
- Vector store conceptual guide
- Vector store how-to guides