Atlas

Atlas is a platform by Nomic for interacting with unstructured datasets, from small collections up to internet scale.

This notebook shows how to use the AtlasDB vectorstore.

!pip install spacy
!python3 -m spacy download en_core_web_sm
!pip install nomic
import time
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import SpacyTextSplitter
from langchain.vectorstores import AtlasDB
from langchain.document_loaders import TextLoader
ATLAS_TEST_API_KEY = '7xDPkYXSYDc1_ErdTPIcoAR9RNd8YDlkS3nVNXcVoIMZ6'
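A key written directly into a notebook is easy to leak when the notebook is shared. As a minimal sketch, the key could instead be read from the environment. The environment variable name `ATLAS_TEST_API_KEY` and the helper `load_atlas_api_key` are assumptions made for illustration; they are not part of the Atlas or LangChain API.

```python
import os


def load_atlas_api_key(env_var: str = "ATLAS_TEST_API_KEY") -> str:
    """Return the Atlas API key stored in an environment variable.

    The variable name is a hypothetical choice for this sketch.
    Raises RuntimeError if the variable is unset or blank.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Atlas API key."
        )
    return key
```

With the variable exported in the shell that launches the notebook, `ATLAS_TEST_API_KEY = load_atlas_api_key()` could stand in for the literal assignment.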
loader = TextLoader('../../../state_of_the_union.txt')
documents = loader.load()
text_splitter = SpacyTextSplitter(separator='|')
texts = []
for doc in text_splitter.split_documents(documents):
    texts.extend(doc.page_content.split('|'))
                 
texts = [e.strip() for e in texts]
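The loop above appears to rely on the splitter joining sentences with `|` inside each chunk, so that splitting on `|` again yields one text per sentence. The sketch below reproduces only that flattening step on plain strings, without spaCy. The helper name `flatten_chunks` and the sample chunks are made up for illustration, and unlike the original loop it also drops empty fragments.

```python
from typing import Iterable, List


def flatten_chunks(chunks: Iterable[str], separator: str = "|") -> List[str]:
    """Split each chunk on the separator and return stripped, non-empty pieces."""
    texts: List[str] = []
    for chunk in chunks:
        texts.extend(chunk.split(separator))
    stripped = [t.strip() for t in texts]
    # Dropping empty strings is an addition of this sketch, not in the original loop.
    return [t for t in stripped if t]


# Hypothetical chunks shaped like '|'-joined sentences.
sample_chunks = ["First sentence. | Second sentence.", "Third sentence. |"]
print(flatten_chunks(sample_chunks))
# ['First sentence.', 'Second sentence.', 'Third sentence.']
```

Keeping one sentence per text means each point on the resulting Atlas map corresponds to a single sentence rather than a multi-sentence chunk.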
db = AtlasDB.from_texts(texts=texts,
                        name='test_index_'+str(time.time()), # unique name for your vector store
                        description='test_index', # a description for your vector store
                        api_key=ATLAS_TEST_API_KEY,
                        index_kwargs={'build_topic_model': True})
db.project.wait_for_project_lock()
db.project
test_index_1677255228.136989
A description for your project
508 datums inserted.
1 index built.
Projections
  • test_index_1677255228.136989_index. Status Completed. view online

Projection ID: db996d77-8981-48a0-897a-ff2c22bbf541
