

Momento Cache is the world's first truly serverless caching service, offering instant elasticity, scale-to-zero capability, and blazing-fast performance.

Momento Vector Index stands out as the most productive, easiest-to-use, fully serverless vector index.

For both services, install the SDK, obtain an API key, add a few lines to your code, and you're ready to go. Together, they provide a comprehensive solution for your LLM data needs.

This page covers how to use the Momento ecosystem within LangChain.

Installation and Setup

  • Sign up for a free account here to get an API key
  • Install the Momento Python SDK with pip install momento
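
Once you have an API key, it can help to fail fast when the key is missing instead of discovering it on the first request. The following is a minimal sketch of such a check. The helper name `require_api_key` is hypothetical, and the environment variable name `MOMENTO_API_KEY` is an assumption; use whatever name your deployment sets.

```python
import os


def require_api_key(env_var: str = "MOMENTO_API_KEY") -> str:
    """Return the API key stored in env_var, or raise a clear error if it is unset.

    Both this helper and the default variable name are illustrative assumptions.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Momento API key."
        )
    return api_key
```

Calling `require_api_key()` at application startup surfaces a missing key before any client is constructed.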


Cache

Use Momento as a serverless, distributed, low-latency cache for LLM prompts and responses. The standard cache is the primary use case for Momento users in any environment.

To integrate Momento Cache into your application:

from langchain.cache import MomentoCache
API Reference:MomentoCache

Then, set it up with the following code:

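from datetime import timedelta
from momento import CacheClient, Configurations, CredentialProvider
from langchain.globals import set_llm_cache

# Instantiate the Momento client.
# Assumption: the API key is read from the MOMENTO_API_KEY environment variable
# and cached entries expire after one day; adjust both to your setup.
cache_client = CacheClient(
    Configurations.Laptop.v1(),
    CredentialProvider.from_environment_variable("MOMENTO_API_KEY"),
    default_ttl=timedelta(days=1),
)

# Choose a Momento cache name
cache_name = "langchain"

# Instantiate the LLM cache
set_llm_cache(MomentoCache(cache_client, cache_name))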
API Reference:set_llm_cache
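
To make the idea of caching prompts and responses concrete, here is a small in-memory sketch of a cache keyed by prompt and model configuration, with entries that expire after a time-to-live. It is not how Momento or LangChain implement caching; the class name `TTLPromptCache` is hypothetical, and the `lookup`/`update` method shape only loosely mirrors the LangChain cache interface.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLPromptCache:
    """In-memory stand-in for a prompt/response cache with a time-to-live."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[str]:
        """Return the cached response, or None on a miss or an expired entry."""
        key = (prompt, llm_string)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry
            return None
        return response

    def update(self, prompt: str, llm_string: str, response: str) -> None:
        """Store a response under the (prompt, model configuration) key."""
        self._entries[(prompt, llm_string)] = (self._clock() + self._ttl, response)
```

Keying on both the prompt and the model configuration keeps a response generated by one model from being served for another. A hosted cache adds what this sketch lacks: state shared across processes and machines.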


Memory

Momento can be used as a distributed memory store for LLMs.

See this notebook for a walkthrough of how to use Momento as a memory store for chat message history.

from langchain.memory import MomentoChatMessageHistory
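
As a rough illustration of what a chat message history store does, the sketch below keeps messages per session in process memory. It is a conceptual stand-in only: the class `SessionHistoryStore` and its methods are hypothetical and are not part of Momento or LangChain. A distributed store plays the same role while sharing the history across application instances.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


class SessionHistoryStore:
    """In-memory stand-in for a chat history store keyed by session ID."""

    def __init__(self) -> None:
        self._sessions: Dict[str, List[Message]] = {}

    def add_message(self, session_id: str, role: str, content: str) -> None:
        """Append a message to the given session, creating it if needed."""
        self._sessions.setdefault(session_id, []).append((role, content))

    def messages(self, session_id: str) -> List[Message]:
        """Return a copy of the session's messages in insertion order."""
        return list(self._sessions.get(session_id, []))

    def clear(self, session_id: str) -> None:
        """Forget everything stored for the given session."""
        self._sessions.pop(session_id, None)
```

Keeping each conversation under its own session ID lets several conversations share one store without mixing their histories.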

Vector Store

Momento Vector Index (MVI) can be used as a vector store.

See this notebook for a walkthrough of how to use MVI as a vector store.

from langchain_community.vectorstores import MomentoVectorIndex
API Reference:MomentoVectorIndex
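
For readers new to vector stores, the following sketch shows the core operation one provides: ranking stored embeddings by similarity to a query embedding. It is a brute-force, in-memory illustration and says nothing about how MVI is implemented; the function names `cosine_similarity` and `top_k` are hypothetical, and cosine similarity is chosen here only as one common metric.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    items: Sequence[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k (item_id, score) pairs most similar to the query."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in items]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A linear scan like this costs time proportional to the number of stored vectors, which is one reason to use a dedicated index for larger collections.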
