Zep is an open source platform for productionizing LLM apps. Go from a prototype built in LangChain or LlamaIndex, or a custom app, to production in minutes without rewriting code.
- Fast! Zep operates independently of your chat loop, ensuring a snappy user experience.
- Chat History Memory, Archival, and Enrichment: populate your prompts with relevant chat history, summaries, named entities, intent data, and more.
- Vector Search over Chat History and Documents: automatic embedding of documents, chat histories, and summaries. Use Zep's similarity search or native MMR re-ranked search to find the most relevant results.
- Manage Users and their Chat Sessions: Users and their Chat Sessions are first-class citizens in Zep, allowing you to easily manage user interactions with your bots or agents.
- Records Retention and Privacy Compliance: comply with corporate and regulatory mandates for records retention while meeting privacy regulations such as CCPA and GDPR. Fulfill Right To Be Forgotten requests with a single API call.
Installation and Setup
Install the Zep service. See the Zep Quick Start Guide.
Install the Zep Python SDK:
pip install zep_python
Zep's Memory API persists your app's chat history and metadata to a Session, enriches the memory, automatically generates summaries, and enables vector similarity search over historical chat messages and summaries.
There are two approaches to populating your prompt with chat history:
- Retrieve the most recent N messages (and potentially a summary) from a Session and use them to construct your prompt.
- Search over the Session's chat history for messages that are relevant and use them to construct your prompt.
Both approaches can be useful: the first gives the LLM context on the most recent interactions with the user, while the second lets you look further back in the chat history and retrieve messages relevant to the current conversation in a token-efficient manner.
from langchain.memory import ZepMemory
See a RAG App Example here.
Zep's Memory Retriever is a LangChain Retriever that enables you to retrieve messages from a Zep Session and use them to construct your prompt.
The Retriever supports searching over both individual messages and summaries of conversations. The latter is useful for providing rich, but succinct context to the LLM as to relevant past conversations.
Zep's Memory Retriever supports both similarity search and Maximum Marginal Relevance (MMR) re-ranking. MMR search is useful for ensuring that the retrieved messages are diverse and not too similar to each other.
See a usage example.
from langchain.retrievers import ZepRetriever
Zep's Document VectorStore API enables you to store and retrieve documents using vector similarity search. Zep doesn't require you to understand distance functions, types of embeddings, or indexing best practices. You just pass in your chunked documents, and Zep handles the rest.
Zep supports both similarity search and Maximum Marginal Relevance (MMR) reranking. MMR search is useful for ensuring that the retrieved documents are diverse and not too similar to each other.
from langchain_community.vectorstores.zep import ZepVectorStore
See a usage example.