Retrievers#

pydantic model langchain.retrievers.ArxivRetriever[source]#

A thin wrapper around ArxivAPIWrapper. It exposes the wrapper's load() method as get_relevant_documents() and accepts all ArxivAPIWrapper arguments unchanged.

async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

pydantic model langchain.retrievers.AzureCognitiveSearchRetriever[source]#

Wrapper around Azure Cognitive Search.

field aiosession: Optional[aiohttp.client.ClientSession] = None#

Optional aiohttp ClientSession, used to reuse connections for better performance.

field api_key: str = ''#

API Key. Both Admin and Query keys work, but for reading data it’s recommended to use a Query key.

field api_version: str = '2020-06-30'#

API version

field content_key: str = 'content'#

Key in a retrieved result to set as the Document page_content.

field index_name: str = ''#

Name of Index inside Azure Cognitive Search service

field service_name: str = ''#

Name of Azure Cognitive Search service
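
As a rough illustration of how service_name, index_name, api_version and api_key fit together, the sketch below assembles a search request URL and headers. It assumes the public Azure Cognitive Search REST layout (https://<service>.search.windows.net/indexes/<index>/docs) and the api-key header; the helper names build_search_url and build_headers are hypothetical and are not part of the retriever.

```python
from urllib.parse import quote


def build_search_url(service_name: str, index_name: str, query: str,
                     api_version: str = "2020-06-30") -> str:
    """Assemble a document-search URL (assumed REST layout, hypothetical helper)."""
    base_url = f"https://{service_name}.search.windows.net/"
    endpoint_path = f"indexes/{index_name}/docs?api-version={api_version}"
    # Percent-encode the query so spaces and symbols are safe in the URL.
    return f"{base_url}{endpoint_path}&search={quote(query)}"


def build_headers(api_key: str) -> dict:
    """Headers carrying the Query or Admin key (assumed header name: api-key)."""
    return {"Content-Type": "application/json", "api-key": api_key}


if __name__ == "__main__":
    print(build_search_url("my-service", "my-index", "hello world"))
```

A Query key is sufficient here because the request only reads data.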

async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

pydantic model langchain.retrievers.ChatGPTPluginRetriever[source]#
field aiosession: Optional[aiohttp.client.ClientSession] = None#
field bearer_token: str [Required]#
field filter: Optional[dict] = None#
field top_k: int = 3#
field url: str [Required]#
async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents
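
The fields above (url, bearer_token, top_k, filter) suggest the shape of the request sent to the plugin server. The sketch below builds such a request without sending it. The /query path, the "queries" body layout and the helper name build_plugin_request are assumptions based on the ChatGPT retrieval plugin's public API, not something this page confirms.

```python
import json
from typing import Optional, Tuple


def build_plugin_request(url: str, bearer_token: str, query: str,
                         top_k: int = 3,
                         filter_: Optional[dict] = None) -> Tuple[str, dict, str]:
    """Return (endpoint, headers, JSON body) for a plugin query (assumed layout)."""
    endpoint = f"{url.rstrip('/')}/query"  # assumed endpoint path
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {bearer_token}",
    }
    # One query object per request; filter_ may be None when no filter is set.
    body = {"queries": [{"query": query, "filter": filter_, "top_k": top_k}]}
    return endpoint, headers, json.dumps(body)


if __name__ == "__main__":
    endpoint, headers, body = build_plugin_request(
        "https://plugin.example.com", "my-token", "what is a retriever?")
    print(endpoint)
    print(body)
```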

pydantic model langchain.retrievers.ContextualCompressionRetriever[source]#

Retriever that wraps a base retriever and compresses the results.

field base_compressor: langchain.retrievers.document_compressors.base.BaseDocumentCompressor [Required]#

Compressor for compressing retrieved documents.

field base_retriever: langchain.schema.BaseRetriever [Required]#

Base Retriever to use for getting relevant documents.

async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

Sequence of relevant documents
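
The retrieve-then-compress flow described for this class can be sketched with plain Python stand-ins. Doc, make_keyword_retriever, sentence_compressor and compressed_retrieve below are hypothetical toy helpers, not the langchain classes; they only illustrate that the base retriever runs first and the compressor then trims its results with respect to the query.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Doc:
    """Toy stand-in for a document: text plus metadata."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def make_keyword_retriever(corpus: List[Doc]) -> Callable[[str], List[Doc]]:
    """Toy base retriever: return documents containing any query word."""
    def retrieve(query: str) -> List[Doc]:
        words = query.lower().split()
        return [d for d in corpus
                if any(w in d.page_content.lower() for w in words)]
    return retrieve


def sentence_compressor(docs: List[Doc], query: str) -> List[Doc]:
    """Toy compressor: keep only sentences that mention a query word."""
    words = query.lower().split()
    compressed = []
    for doc in docs:
        sentences = [s.strip() for s in doc.page_content.split(".") if s.strip()]
        kept = [s for s in sentences if any(w in s.lower() for w in words)]
        if kept:  # drop documents with nothing relevant left
            compressed.append(Doc(". ".join(kept) + ".", doc.metadata))
    return compressed


def compressed_retrieve(query: str,
                        base_retriever: Callable[[str], List[Doc]],
                        base_compressor: Callable[[List[Doc], str], List[Doc]]) -> List[Doc]:
    """Fetch with the base retriever, then compress the results."""
    return base_compressor(base_retriever(query), query)


if __name__ == "__main__":
    corpus = [Doc("Cats purr. Dogs bark."), Doc("Birds sing.")]
    for d in compressed_retrieve("dogs", make_keyword_retriever(corpus), sentence_compressor):
        print(d.page_content)
```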

class langchain.retrievers.DataberryRetriever(datastore_url: str, top_k: Optional[int] = None, api_key: Optional[str] = None)[source]#
async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

api_key: Optional[str]#
datastore_url: str#
get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

top_k: Optional[int]#
class langchain.retrievers.ElasticSearchBM25Retriever(client: Any, index_name: str)[source]#

Wrapper around Elasticsearch using BM25 as a retrieval method.

To connect to an Elasticsearch instance that requires login credentials, including Elastic Cloud, use the Elasticsearch URL format https://username:password@es_host:9243. For example, to connect to Elastic Cloud, create the Elasticsearch URL with the required authentication details and pass it to the ElasticSearchBM25Retriever.create classmethod as the named parameter elasticsearch_url.

You can obtain your Elastic Cloud URL and login credentials by logging in to the Elastic Cloud console at https://cloud.elastic.co, navigating to the “Deployments” page, and selecting your deployment.

To obtain your Elastic Cloud password for the default “elastic” user:

  1. Log in to the Elastic Cloud console at https://cloud.elastic.co

  2. Go to “Security” > “Users”

  3. Locate the “elastic” user and click “Edit”

  4. Click “Reset password”

  5. Follow the prompts to reset the password

The format for Elastic Cloud URLs is https://username:password@cluster_id.region_id.gcp.cloud.es.io:9243.
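
A small sketch of assembling a URL in the format shown above. The helper name build_elasticsearch_url is hypothetical. Percent-encoding the credentials is an added precaution, assumed here so that characters such as "@" or ":" in a password do not break the URL.

```python
from urllib.parse import quote


def build_elasticsearch_url(username: str, password: str, host: str,
                            port: int = 9243) -> str:
    """Build https://username:password@host:port with encoded credentials."""
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"https://{user}:{secret}@{host}:{port}"


if __name__ == "__main__":
    # Placeholder host in the cluster_id.region_id.gcp.cloud.es.io pattern.
    print(build_elasticsearch_url(
        "elastic", "s3cret", "cluster_id.region_id.gcp.cloud.es.io"))
```

The resulting string is what would be passed as elasticsearch_url.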

add_texts(texts: Iterable[str], refresh_indices: bool = True) List[str][source]#

Index more texts and add them to the retriever.

Parameters
  • texts – Iterable of strings to add to the retriever.

  • refresh_indices – whether to refresh the Elasticsearch indices

Returns

List of ids from adding the texts into the retriever.

async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

classmethod create(elasticsearch_url: str, index_name: str, k1: float = 2.0, b: float = 0.75) langchain.retrievers.elastic_search_bm25.ElasticSearchBM25Retriever[source]#
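
The k1 and b arguments of create are the usual BM25 tuning parameters. As a rough guide to what they control, the sketch below implements the textbook Okapi BM25 per-term score: k1 governs term-frequency saturation and b governs document-length normalization. This is the standard formula, not Elasticsearch's code, and the exact variant Elasticsearch applies may differ. The function names are hypothetical.

```python
import math


def bm25_idf(num_docs: int, doc_freq: int) -> float:
    """Inverse document frequency in a common smoothed, non-negative form."""
    return math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_term_score(tf: float, doc_len: float, avg_doc_len: float, idf: float,
                    k1: float = 2.0, b: float = 0.75) -> float:
    """Score contribution of one query term for one document."""
    # b = 0 disables length normalization; b = 1 applies it fully.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # Higher k1 lets repeated occurrences of a term keep adding to the score.
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * length_norm)


if __name__ == "__main__":
    idf = bm25_idf(num_docs=1000, doc_freq=10)
    print(bm25_term_score(tf=3, doc_len=120, avg_doc_len=100, idf=idf))
```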
get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

pydantic model langchain.retrievers.KNNRetriever[source]#
field embeddings: langchain.embeddings.base.Embeddings [Required]#
field index: Any = None#
field k: int = 4#
field relevancy_threshold: Optional[float] = None#
field texts: List[str] [Required]#
async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

classmethod from_texts(texts: List[str], embeddings: langchain.embeddings.base.Embeddings, **kwargs: Any) langchain.retrievers.knn.KNNRetriever[source]#
get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents
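
To make the roles of k and relevancy_threshold concrete, here is a minimal pure-Python sketch of nearest-neighbour selection over embedding vectors. It assumes cosine similarity on raw scores; the library may normalize scores differently before applying the threshold, so treat cosine and knn_indices as hypothetical illustrations.

```python
import math
from typing import List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_indices(query_vec: Sequence[float],
                doc_vecs: Sequence[Sequence[float]],
                k: int = 4,
                relevancy_threshold: Optional[float] = None) -> List[int]:
    """Indices of the k most similar vectors, optionally filtered by score."""
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:k]
    if relevancy_threshold is not None:
        top = [(s, i) for s, i in top if s >= relevancy_threshold]
    return [i for _, i in top]


if __name__ == "__main__":
    vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(knn_indices([1.0, 0.0], vectors, k=2))
```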

class langchain.retrievers.MetalRetriever(client: Any, params: Optional[dict] = None)[source]#
async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

pydantic model langchain.retrievers.PineconeHybridSearchRetriever[source]#
field alpha: float = 0.5#
field embeddings: langchain.embeddings.base.Embeddings [Required]#
field index: Any = None#
field sparse_encoder: Any = None#
field top_k: int = 4#
add_texts(texts: List[str], ids: Optional[List[str]] = None, metadatas: Optional[List[dict]] = None) None[source]#
async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents
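
The alpha field is undocumented above. In hybrid search it is commonly used as a convex weight between the dense (embedding) vector and the sparse (keyword) vector. The sketch below assumes that convention: alpha = 1 is purely dense and alpha = 0 is purely sparse. The function name and the dict layout of the sparse vector are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def hybrid_convex_scale(dense: List[float], sparse: Dict[str, list],
                        alpha: float) -> Tuple[List[float], Dict[str, list]]:
    """Weight dense values by alpha and sparse values by (1 - alpha)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


if __name__ == "__main__":
    dense_vec = [0.2, 0.4]
    sparse_vec = {"indices": [7, 42], "values": [1.0, 0.5]}
    print(hybrid_convex_scale(dense_vec, sparse_vec, alpha=0.5))
```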

pydantic model langchain.retrievers.RemoteLangChainRetriever[source]#
field headers: Optional[dict] = None#
field input_key: str = 'message'#
field metadata_key: str = 'metadata'#
field page_content_key: str = 'page_content'#
field response_key: str = 'response'#
field url: str [Required]#
async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents
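
The key fields above describe how a request and a JSON response map onto documents. The sketch below shows one plausible reading, using the default key names from the field list. build_remote_request and parse_remote_response are hypothetical helpers, and the payload shape (a list of objects under response_key) is an assumption.

```python
from typing import Any, Dict, List, Tuple


def build_remote_request(query: str, input_key: str = "message") -> Dict[str, str]:
    """JSON body sent to the remote endpoint: the query under input_key."""
    return {input_key: query}


def parse_remote_response(payload: Dict[str, Any],
                          response_key: str = "response",
                          page_content_key: str = "page_content",
                          metadata_key: str = "metadata") -> List[Tuple[str, dict]]:
    """Extract (page_content, metadata) pairs from an assumed payload shape."""
    return [(item[page_content_key], item[metadata_key])
            for item in payload[response_key]]


if __name__ == "__main__":
    example_payload = {
        "response": [
            {"page_content": "first hit", "metadata": {"source": "a.txt"}},
            {"page_content": "second hit", "metadata": {"source": "b.txt"}},
        ]
    }
    print(build_remote_request("what is a retriever?"))
    print(parse_remote_response(example_payload))
```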

pydantic model langchain.retrievers.SVMRetriever[source]#
field embeddings: langchain.embeddings.base.Embeddings [Required]#
field index: Any = None#
field k: int = 4#
field relevancy_threshold: Optional[float] = None#
field texts: List[str] [Required]#
async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

classmethod from_texts(texts: List[str], embeddings: langchain.embeddings.base.Embeddings, **kwargs: Any) langchain.retrievers.svm.SVMRetriever[source]#
get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

pydantic model langchain.retrievers.SelfQueryRetriever[source]#

Retriever that wraps around a vector store and uses an LLM to generate the vector store queries.

field llm_chain: langchain.chains.llm.LLMChain [Required]#

The LLMChain for generating the vector store queries.

field search_kwargs: dict [Optional]#

Keyword arguments to pass in to the vector store search.

field search_type: str = 'similarity'#

The search type to perform on the vector store.

field structured_query_translator: langchain.chains.query_constructor.ir.Visitor [Required]#

Translator for turning internal query language into vectorstore search params.

field vectorstore: langchain.vectorstores.base.VectorStore [Required]#

The underlying vector store from which documents will be retrieved.

field verbose: bool = False#
async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

classmethod from_llm(llm: langchain.base_language.BaseLanguageModel, vectorstore: langchain.vectorstores.base.VectorStore, document_contents: str, metadata_field_info: List[langchain.chains.query_constructor.schema.AttributeInfo], structured_query_translator: Optional[langchain.chains.query_constructor.ir.Visitor] = None, chain_kwargs: Optional[Dict] = None, enable_limit: bool = False, **kwargs: Any) langchain.retrievers.self_query.base.SelfQueryRetriever[source]#
get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

pydantic model langchain.retrievers.TFIDFRetriever[source]#
field docs: List[langchain.schema.Document] [Required]#
field k: int = 4#
field tfidf_array: Any = None#
field vectorizer: Any = None#
async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

classmethod from_documents(documents: Iterable[langchain.schema.Document], *, tfidf_params: Optional[Dict[str, Any]] = None, **kwargs: Any) langchain.retrievers.tfidf.TFIDFRetriever[source]#
classmethod from_texts(texts: Iterable[str], metadatas: Optional[Iterable[dict]] = None, tfidf_params: Optional[Dict[str, Any]] = None, **kwargs: Any) langchain.retrievers.tfidf.TFIDFRetriever[source]#
get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

pydantic model langchain.retrievers.TimeWeightedVectorStoreRetriever[source]#

Retriever combining embedding similarity with recency.

field decay_rate: float = 0.01#

The exponential decay factor used as (1.0-decay_rate)**(hrs_passed).
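
As a worked example of the expression above, the sketch below evaluates (1.0 - decay_rate) ** hrs_passed directly. The function name recency_score is hypothetical, and how this value is combined with the similarity score is not shown here.

```python
def recency_score(hrs_passed: float, decay_rate: float = 0.01) -> float:
    """Recency weight: 1.0 for a just-accessed document, decaying toward 0."""
    return (1.0 - decay_rate) ** hrs_passed


if __name__ == "__main__":
    for hours in (0, 1, 24, 168):
        print(hours, round(recency_score(hours), 4))
```

With the default decay_rate of 0.01, a document last touched 24 hours ago keeps a weight of about 0.79. A larger decay_rate makes older documents fade faster, and a decay_rate of 0 disables the recency effect entirely.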

field default_salience: Optional[float] = None#

The salience to assign memories not retrieved from the vector store.

None assigns no salience to documents not fetched from the vector store.

field k: int = 4#

The maximum number of documents to retrieve in a given call.

field memory_stream: List[langchain.schema.Document] [Optional]#

The memory_stream of documents to search through.

field other_score_keys: List[str] = []#

Other keys in the metadata to factor into the score, e.g. ‘importance’.

field search_kwargs: dict [Optional]#

Keyword arguments to pass to the vectorstore similarity search.

field vectorstore: langchain.vectorstores.base.VectorStore [Required]#

The vectorstore to store documents and determine salience.

async aadd_documents(documents: List[langchain.schema.Document], **kwargs: Any) List[str][source]#

Add documents to vectorstore.

add_documents(documents: List[langchain.schema.Document], **kwargs: Any) List[str][source]#

Add documents to vectorstore.

async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Return documents that are relevant to the query.

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Return documents that are relevant to the query.

get_salient_docs(query: str) Dict[int, Tuple[langchain.schema.Document, float]][source]#

Return documents that are salient to the query.

class langchain.retrievers.VespaRetriever(app: Vespa, body: Dict, content_field: str, metadata_fields: Optional[Sequence[str]] = None)[source]#
async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

classmethod from_params(url: str, content_field: str, *, k: Optional[int] = None, metadata_fields: Union[Sequence[str], Literal['*']] = (), sources: Optional[Union[Sequence[str], Literal['*']]] = None, _filter: Optional[str] = None, yql: Optional[str] = None, **kwargs: Any) langchain.retrievers.vespa_retriever.VespaRetriever[source]#

Instantiate retriever from params.

Parameters
  • url (str) – Vespa app URL.

  • content_field (str) – Field in results to return as Document page_content.

  • k (Optional[int]) – Number of Documents to return. Defaults to None.

  • metadata_fields (Sequence[str] or "*") – Fields in results to include in document metadata. Defaults to empty tuple ().

  • sources (Sequence[str] or "*" or None) – Sources to retrieve from. Defaults to None.

  • _filter (Optional[str]) – Document filter condition expressed in YQL. Defaults to None.

  • yql (Optional[str]) – Full YQL query to be used. Should not be specified if _filter or sources are specified. Defaults to None.

  • kwargs (Any) – Keyword arguments added to query body.
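
The rule that yql should not be combined with _filter or sources can be sketched as a small query-body builder. Everything below is an assumption for illustration: the helper name build_vespa_body, the generated YQL template, and the use of the "hits" key for k. It is not the retriever's actual implementation.

```python
from typing import Any, Dict, Optional, Sequence, Union


def build_vespa_body(k: Optional[int] = None,
                     sources: Optional[Union[Sequence[str], str]] = None,
                     _filter: Optional[str] = None,
                     yql: Optional[str] = None,
                     **kwargs: Any) -> Dict[str, Any]:
    """Build a query body, enforcing that yql excludes _filter and sources."""
    body: Dict[str, Any] = dict(kwargs)  # extra keyword arguments go into the body
    if k is not None:
        body["hits"] = k  # assumed key for the number of results
    if yql is not None:
        if sources is not None or _filter is not None:
            raise ValueError("yql should not be specified together with _filter or sources")
        body["yql"] = yql
    else:
        # None or "*" means all sources; otherwise join the given source names.
        source_list = "*" if sources is None or sources == "*" else ", ".join(sources)
        body["yql"] = f"select * from sources {source_list} where userQuery()"
        if _filter:
            body["yql"] += f" and {_filter}"
    return body


if __name__ == "__main__":
    print(build_vespa_body(k=5, sources=["articles"], _filter="year > 2000"))
```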

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents_with_filter(query: str, *, _filter: Optional[str] = None) List[langchain.schema.Document][source]#
class langchain.retrievers.WeaviateHybridSearchRetriever(client: Any, index_name: str, text_key: str, alpha: float = 0.5, k: int = 4, attributes: Optional[List[str]] = None, create_schema_if_missing: bool = True)[source]#
class Config[source]#

Configuration for this pydantic object.

arbitrary_types_allowed = True#
extra = 'forbid'#
add_documents(docs: List[langchain.schema.Document], **kwargs: Any) List[str][source]#

Upload documents to Weaviate.

async aget_relevant_documents(query: str, where_filter: Optional[Dict[str, object]] = None) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents(query: str, where_filter: Optional[Dict[str, object]] = None) List[langchain.schema.Document][source]#

Look up similar documents in Weaviate.

pydantic model langchain.retrievers.WikipediaRetriever[source]#

A thin wrapper around WikipediaAPIWrapper. It exposes the wrapper's load() method as get_relevant_documents() and accepts all WikipediaAPIWrapper arguments unchanged.

async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

class langchain.retrievers.ZepRetriever(session_id: str, url: str, top_k: Optional[int] = None)[source]#

A Retriever implementation for the Zep long-term memory store. Search your user’s long-term chat history with Zep.

Note: You will need to provide the user’s session_id to use this retriever.

More on Zep: Zep provides long-term conversation storage for LLM apps. The server stores, summarizes, embeds, indexes, and enriches conversational AI chat histories, and exposes them via simple, low-latency APIs.

For server installation instructions, see: https://getzep.github.io/deployment/quickstart/

async aget_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents

get_relevant_documents(query: str) List[langchain.schema.Document][source]#

Get documents relevant for a query.

Parameters

query – string to find relevant documents for

Returns

List of relevant documents