Retrievers#
- pydantic model langchain.retrievers.ArxivRetriever[source]#
Effectively a wrapper for ArxivAPIWrapper: it exposes load() as get_relevant_documents() and accepts all ArxivAPIWrapper arguments unchanged.
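A minimal usage sketch. load_max_docs is assumed here to be one of the ArxivAPIWrapper arguments that passes through unchanged; the query can be a free-text topic or an arXiv ID.

```python
from langchain.retrievers import ArxivRetriever

# load_max_docs is an ArxivAPIWrapper argument passed through unchanged
# (assumed here for illustration).
retriever = ArxivRetriever(load_max_docs=2)
docs = retriever.get_relevant_documents("1605.08386")
```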
- pydantic model langchain.retrievers.AzureCognitiveSearchRetriever[source]#
Wrapper around Azure Cognitive Search.
- field aiosession: Optional[aiohttp.client.ClientSession] = None#
ClientSession, in case we want to reuse the connection for better performance.
- field api_key: str = ''#
API Key. Both Admin and Query keys work, but for reading data it’s recommended to use a Query key.
- field api_version: str = '2020-06-30'#
API version.
- field content_key: str = 'content'#
Key in a retrieved result to set as the Document page_content.
- field index_name: str = ''#
Name of the index inside the Azure Cognitive Search service
- field service_name: str = ''#
Name of the Azure Cognitive Search service
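A minimal sketch using only the fields documented above; the service name, index name, and key are placeholders.

```python
from langchain.retrievers import AzureCognitiveSearchRetriever

# Placeholder service name, index name, and query key.
retriever = AzureCognitiveSearchRetriever(
    service_name="my-search-service",
    index_name="my-index",
    api_key="<query-key>",
)
docs = retriever.get_relevant_documents("what is azure cognitive search")
```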
- pydantic model langchain.retrievers.ChatGPTPluginRetriever[source]#
- field aiosession: Optional[aiohttp.client.ClientSession] = None#
- field bearer_token: str [Required]#
- field filter: Optional[dict] = None#
- field top_k: int = 3#
- field url: str [Required]#
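A minimal sketch, assuming a chatgpt-retrieval-plugin server is already running; the URL and bearer token are placeholders.

```python
from langchain.retrievers import ChatGPTPluginRetriever

# Placeholder URL and bearer token for a running retrieval-plugin server.
retriever = ChatGPTPluginRetriever(
    url="http://localhost:8000",
    bearer_token="<bearer-token>",
    top_k=3,
)
docs = retriever.get_relevant_documents("alice's phone number")
```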
- pydantic model langchain.retrievers.ContextualCompressionRetriever[source]#
Retriever that wraps a base retriever and compresses the results.
- field base_compressor: langchain.retrievers.document_compressors.base.BaseDocumentCompressor [Required]#
Compressor for compressing retrieved documents.
- field base_retriever: langchain.schema.BaseRetriever [Required]#
Base Retriever to use for getting relevant documents.
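A minimal sketch. LLMChainExtractor (from langchain.retrievers.document_compressors) is one available BaseDocumentCompressor; `vectorstore` is assumed to be an existing vector store whose as_retriever() method supplies the base retriever.

```python
from langchain.llms import OpenAI
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

# `vectorstore` is an assumed, pre-built vector store.
compressor = LLMChainExtractor.from_llm(OpenAI(temperature=0))
retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vectorstore.as_retriever(),
)
docs = retriever.get_relevant_documents("what did the president say about jobs")
```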
- class langchain.retrievers.DataberryRetriever(datastore_url: str, top_k: Optional[int] = None, api_key: Optional[str] = None)[source]#
- async aget_relevant_documents(query: str) List[langchain.schema.Document] [source]#
Get documents relevant for a query.
- Parameters
query – string to find relevant documents for
- Returns
List of relevant documents
- api_key: Optional[str]#
- datastore_url: str#
- get_relevant_documents(query: str) List[langchain.schema.Document] [source]#
Get documents relevant for a query.
- Parameters
query – string to find relevant documents for
- Returns
List of relevant documents
- top_k: Optional[int]#
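A minimal sketch using the documented constructor; the datastore URL is a placeholder.

```python
from langchain.retrievers import DataberryRetriever

# Placeholder datastore URL; an api_key can also be passed for private datastores.
retriever = DataberryRetriever(
    datastore_url="https://api.databerry.ai/query/<datastore-id>",
    top_k=3,
)
docs = retriever.get_relevant_documents("What is Databerry?")
```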
- class langchain.retrievers.ElasticSearchBM25Retriever(client: Any, index_name: str)[source]#
Wrapper around Elasticsearch using BM25 as a retrieval method.
To connect to an Elasticsearch instance that requires login credentials, including Elastic Cloud, use the Elasticsearch URL format https://username:password@es_host:9243. For example, to connect to Elastic Cloud, build the Elasticsearch URL with the required authentication details and pass it to the create() classmethod as the named parameter elasticsearch_url.
You can obtain your Elastic Cloud URL and login credentials by logging in to the Elastic Cloud console at https://cloud.elastic.co, selecting your deployment, and navigating to the “Deployments” page.
To obtain your Elastic Cloud password for the default “elastic” user:
1. Log in to the Elastic Cloud console at https://cloud.elastic.co
2. Go to “Security” > “Users”
3. Locate the “elastic” user and click “Edit”
4. Click “Reset password”
5. Follow the prompts to reset the password
The format for Elastic Cloud URLs is https://username:password@cluster_id.region_id.gcp.cloud.es.io:9243.
- add_texts(texts: Iterable[str], refresh_indices: bool = True) List[str] [source]#
Add more texts to the retriever by indexing them in Elasticsearch.
- Parameters
texts – Iterable of strings to add to the retriever.
refresh_indices – whether to refresh the Elasticsearch indices after adding the texts
- Returns
List of ids from adding the texts into the retriever.
- async aget_relevant_documents(query: str) List[langchain.schema.Document] [source]#
Get documents relevant for a query.
- Parameters
query – string to find relevant documents for
- Returns
List of relevant documents
- classmethod create(elasticsearch_url: str, index_name: str, k1: float = 2.0, b: float = 0.75) langchain.retrievers.elastic_search_bm25.ElasticSearchBM25Retriever [source]#
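A minimal sketch using the create() classmethod documented above, against a local unauthenticated Elasticsearch instance (an assumption; for Elastic Cloud, embed credentials in the URL as described).

```python
from langchain.retrievers import ElasticSearchBM25Retriever

# Assumes a local Elasticsearch instance; create() builds the index.
retriever = ElasticSearchBM25Retriever.create(
    elasticsearch_url="http://localhost:9200",
    index_name="langchain-bm25-index",
)
retriever.add_texts(["foo", "foo bar", "hello world"])
docs = retriever.get_relevant_documents("foo")
```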
- pydantic model langchain.retrievers.KNNRetriever[source]#
- field embeddings: langchain.embeddings.base.Embeddings [Required]#
- field index: Any = None#
- field k: int = 4#
- field relevancy_threshold: Optional[float] = None#
- field texts: List[str] [Required]#
- async aget_relevant_documents(query: str) List[langchain.schema.Document] [source]#
Get documents relevant for a query.
- Parameters
query – string to find relevant documents for
- Returns
List of relevant documents
- classmethod from_texts(texts: List[str], embeddings: langchain.embeddings.base.Embeddings, **kwargs: Any) langchain.retrievers.knn.KNNRetriever [source]#
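A minimal sketch with the from_texts() classmethod documented above; OpenAIEmbeddings is one choice of Embeddings implementation.

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import KNNRetriever

retriever = KNNRetriever.from_texts(
    ["foo", "bar", "world", "hello", "foo bar"],
    OpenAIEmbeddings(),
)
docs = retriever.get_relevant_documents("foo")
```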
- class langchain.retrievers.MetalRetriever(client: Any, params: Optional[dict] = None)[source]#
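A minimal sketch. The metal_sdk import and Metal constructor arguments below are assumptions about the Metal SDK; only client and params come from the documented signature.

```python
from metal_sdk.metal import Metal  # assumed Metal SDK import
from langchain.retrievers import MetalRetriever

# Placeholder credentials for the assumed Metal client.
metal = Metal("API_KEY", "CLIENT_ID", "INDEX_ID")
retriever = MetalRetriever(client=metal, params={"limit": 2})
docs = retriever.get_relevant_documents("What is Metal?")
```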
- pydantic model langchain.retrievers.PineconeHybridSearchRetriever[source]#
- field alpha: float = 0.5#
- field embeddings: langchain.embeddings.base.Embeddings [Required]#
- field index: Any = None#
- field sparse_encoder: Any = None#
- field top_k: int = 4#
- add_texts(texts: List[str], ids: Optional[List[str]] = None, metadatas: Optional[List[dict]] = None) None [source]#
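A minimal sketch. The BM25Encoder sparse encoder from the pinecone-text package is an assumption, as are the Pinecone credentials and index name.

```python
import pinecone
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import PineconeHybridSearchRetriever
from pinecone_text.sparse import BM25Encoder  # assumed sparse-encoder package

pinecone.init(api_key="<api-key>", environment="<environment>")
index = pinecone.Index("langchain-hybrid-index")  # assumed existing index

retriever = PineconeHybridSearchRetriever(
    embeddings=OpenAIEmbeddings(),
    sparse_encoder=BM25Encoder().default(),
    index=index,
)
retriever.add_texts(["foo", "bar", "world", "hello"])
docs = retriever.get_relevant_documents("foo")
```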
- pydantic model langchain.retrievers.RemoteLangChainRetriever[source]#
- field headers: Optional[dict] = None#
- field input_key: str = 'message'#
- field metadata_key: str = 'metadata'#
- field page_content_key: str = 'page_content'#
- field response_key: str = 'response'#
- field url: str [Required]#
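A minimal sketch, assuming a remote endpoint that accepts the JSON shape implied by the input_key and response_key fields above; the URL is a placeholder.

```python
from langchain.retrievers import RemoteLangChainRetriever

# Placeholder URL for a remote retrieval endpoint speaking the
# {"message": ...} request / {"response": [...]} reply schema
# implied by the fields above.
retriever = RemoteLangChainRetriever(url="http://localhost:8000/retrieve")
docs = retriever.get_relevant_documents("What is a remote retriever?")
```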
- pydantic model langchain.retrievers.SVMRetriever[source]#
- field embeddings: langchain.embeddings.base.Embeddings [Required]#
- field index: Any = None#
- field k: int = 4#
- field relevancy_threshold: Optional[float] = None#
- field texts: List[str] [Required]#
- async aget_relevant_documents(query: str) List[langchain.schema.Document] [source]#
Get documents relevant for a query.
- Parameters
query – string to find relevant documents for
- Returns
List of relevant documents
- classmethod from_texts(texts: List[str], embeddings: langchain.embeddings.base.Embeddings, **kwargs: Any) langchain.retrievers.svm.SVMRetriever [source]#
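A minimal sketch with the from_texts() classmethod documented above, mirroring the KNNRetriever example.

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import SVMRetriever

retriever = SVMRetriever.from_texts(
    ["foo", "bar", "world", "hello", "foo bar"],
    OpenAIEmbeddings(),
)
docs = retriever.get_relevant_documents("foo")
```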
- pydantic model langchain.retrievers.SelfQueryRetriever[source]#
Retriever that wraps around a vector store and uses an LLM to generate the vector store queries.
- field llm_chain: langchain.chains.llm.LLMChain [Required]#
The LLMChain for generating the vector store queries.
- field search_kwargs: dict [Optional]#
Keyword arguments to pass in to the vector store search.
- field search_type: str = 'similarity'#
The search type to perform on the vector store.
- field structured_query_translator: langchain.chains.query_constructor.ir.Visitor [Required]#
Translator for turning internal query language into vectorstore search params.
- field vectorstore: langchain.vectorstores.base.VectorStore [Required]#
The underlying vector store from which documents will be retrieved.
- field verbose: bool = False#
- async aget_relevant_documents(query: str) List[langchain.schema.Document] [source]#
Get documents relevant for a query.
- Parameters
query – string to find relevant documents for
- Returns
List of relevant documents
- classmethod from_llm(llm: langchain.base_language.BaseLanguageModel, vectorstore: langchain.vectorstores.base.VectorStore, document_contents: str, metadata_field_info: List[langchain.chains.query_constructor.schema.AttributeInfo], structured_query_translator: Optional[langchain.chains.query_constructor.ir.Visitor] = None, chain_kwargs: Optional[Dict] = None, enable_limit: bool = False, **kwargs: Any) langchain.retrievers.self_query.base.SelfQueryRetriever [source]#
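A minimal sketch of from_llm(). `vectorstore` is an assumed, pre-built VectorStore whose documents carry a "year" metadata field; the AttributeInfo import path follows the signature above.

```python
from langchain.chains.query_constructor.schema import AttributeInfo
from langchain.llms import OpenAI
from langchain.retrievers import SelfQueryRetriever

# `vectorstore` is an assumed, pre-built vector store whose documents
# carry a "year" metadata field.
metadata_field_info = [
    AttributeInfo(name="year", description="Year the movie was released", type="integer"),
]
retriever = SelfQueryRetriever.from_llm(
    llm=OpenAI(temperature=0),
    vectorstore=vectorstore,
    document_contents="Brief summary of a movie",
    metadata_field_info=metadata_field_info,
)
docs = retriever.get_relevant_documents("movies released after 1990")
```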
- pydantic model langchain.retrievers.TFIDFRetriever[source]#
- field docs: List[langchain.schema.Document] [Required]#
- field k: int = 4#
- field tfidf_array: Any = None#
- field vectorizer: Any = None#
- async aget_relevant_documents(query: str) List[langchain.schema.Document] [source]#
Get documents relevant for a query.
- Parameters
query – string to find relevant documents for
- Returns
List of relevant documents
- classmethod from_documents(documents: Iterable[langchain.schema.Document], *, tfidf_params: Optional[Dict[str, Any]] = None, **kwargs: Any) langchain.retrievers.tfidf.TFIDFRetriever [source]#
- classmethod from_texts(texts: Iterable[str], metadatas: Optional[Iterable[dict]] = None, tfidf_params: Optional[Dict[str, Any]] = None, **kwargs: Any) langchain.retrievers.tfidf.TFIDFRetriever [source]#
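A minimal sketch of from_texts(); the underlying TF-IDF vectorizer requires scikit-learn.

```python
from langchain.retrievers import TFIDFRetriever

# Requires scikit-learn for the underlying TF-IDF vectorizer.
retriever = TFIDFRetriever.from_texts(["foo", "bar", "world", "hello", "foo bar"])
docs = retriever.get_relevant_documents("foo")
```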
- pydantic model langchain.retrievers.TimeWeightedVectorStoreRetriever[source]#
Retriever combining embedding similarity with recency.
- field decay_rate: float = 0.01#
The exponential decay factor used as (1.0-decay_rate)**(hrs_passed).
- field default_salience: Optional[float] = None#
The salience to assign memories not retrieved from the vector store.
None assigns no salience to documents not fetched from the vector store.
- field k: int = 4#
The maximum number of documents to retrieve in a given call.
- field memory_stream: List[langchain.schema.Document] [Optional]#
The memory_stream of documents to search through.
- field other_score_keys: List[str] = []#
Other keys in the metadata to factor into the score, e.g. ‘importance’.
- field search_kwargs: dict [Optional]#
Keyword arguments to pass to the vectorstore similarity search.
- field vectorstore: langchain.vectorstores.base.VectorStore [Required]#
The vectorstore to store documents and determine salience.
- async aadd_documents(documents: List[langchain.schema.Document], **kwargs: Any) List[str] [source]#
Add documents to vectorstore.
- add_documents(documents: List[langchain.schema.Document], **kwargs: Any) List[str] [source]#
Add documents to vectorstore.
- async aget_relevant_documents(query: str) List[langchain.schema.Document] [source]#
Return documents that are relevant to the query.
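A minimal sketch. The empty-FAISS construction and the 1536-dimension figure (OpenAI ada-002 embeddings) are assumptions about those packages, not part of this retriever's API.

```python
import faiss
from langchain.docstore import InMemoryDocstore
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import TimeWeightedVectorStoreRetriever
from langchain.schema import Document
from langchain.vectorstores import FAISS

# Empty FAISS store; 1536 dims assumes OpenAI ada-002 embeddings.
embeddings = OpenAIEmbeddings()
index = faiss.IndexFlatL2(1536)
vectorstore = FAISS(embeddings.embed_query, index, InMemoryDocstore({}), {})

retriever = TimeWeightedVectorStoreRetriever(
    vectorstore=vectorstore, decay_rate=0.01, k=4
)
retriever.add_documents([Document(page_content="hello world")])
docs = retriever.get_relevant_documents("hello")
```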
- class langchain.retrievers.VespaRetriever(app: Vespa, body: Dict, content_field: str, metadata_fields: Optional[Sequence[str]] = None)[source]#
- async aget_relevant_documents(query: str) List[langchain.schema.Document] [source]#
Get documents relevant for a query.
- Parameters
query – string to find relevant documents for
- Returns
List of relevant documents
- classmethod from_params(url: str, content_field: str, *, k: Optional[int] = None, metadata_fields: Union[Sequence[str], Literal['*']] = (), sources: Optional[Union[Sequence[str], Literal['*']]] = None, _filter: Optional[str] = None, yql: Optional[str] = None, **kwargs: Any) langchain.retrievers.vespa_retriever.VespaRetriever [source]#
Instantiate retriever from params.
- Parameters
url (str) – Vespa app URL.
content_field (str) – Field in results to return as Document page_content.
k (Optional[int]) – Number of Documents to return. Defaults to None.
metadata_fields (Sequence[str] or "*") – Fields in results to include in document metadata. Defaults to empty tuple ().
sources (Sequence[str] or "*" or None) – Sources to retrieve from. Defaults to None.
_filter (Optional[str]) – Document filter condition expressed in YQL. Defaults to None.
yql (Optional[str]) – Full YQL query to be used. Should not be specified if _filter or sources are specified. Defaults to None.
kwargs (Any) – Keyword arguments added to query body.
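A minimal sketch of from_params(). The URL points at Vespa's public documentation search app, and the field names are assumptions about that app's schema.

```python
from langchain.retrievers import VespaRetriever

# URL and field names assume Vespa's public documentation search app.
retriever = VespaRetriever.from_params(
    url="https://doc-search.vespa.oath.cloud",
    content_field="content",
    k=4,
    metadata_fields=["path", "title"],
    sources="*",
)
docs = retriever.get_relevant_documents("what is vespa?")
```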
- class langchain.retrievers.WeaviateHybridSearchRetriever(client: Any, index_name: str, text_key: str, alpha: float = 0.5, k: int = 4, attributes: Optional[List[str]] = None, create_schema_if_missing: bool = True)[source]#
- class Config[source]#
Configuration for this pydantic object.
- arbitrary_types_allowed = True#
- extra = 'forbid'#
- add_documents(docs: List[langchain.schema.Document], **kwargs: Any) List[str] [source]#
Upload documents to Weaviate.
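A minimal sketch using the documented constructor; the weaviate.Client construction and local URL are assumptions about the weaviate-client package.

```python
import weaviate
from langchain.retrievers import WeaviateHybridSearchRetriever
from langchain.schema import Document

client = weaviate.Client(url="http://localhost:8080")  # assumed local instance
retriever = WeaviateHybridSearchRetriever(
    client=client,
    index_name="LangChain",
    text_key="text",
)
retriever.add_documents([Document(page_content="foo", metadata={})])
docs = retriever.get_relevant_documents("foo")
```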
- pydantic model langchain.retrievers.WikipediaRetriever[source]#
Effectively a wrapper for WikipediaAPIWrapper: it exposes load() as get_relevant_documents() and accepts all WikipediaAPIWrapper arguments unchanged.
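A minimal sketch; top_k_results and lang are assumed here to be among the WikipediaAPIWrapper arguments that pass through unchanged.

```python
from langchain.retrievers import WikipediaRetriever

# top_k_results and lang are WikipediaAPIWrapper arguments passed through
# unchanged (assumed here for illustration).
retriever = WikipediaRetriever(top_k_results=2, lang="en")
docs = retriever.get_relevant_documents("large language model")
```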
- class langchain.retrievers.ZepRetriever(session_id: str, url: str, top_k: Optional[int] = None)[source]#
A Retriever implementation for the Zep long-term memory store. Search your user’s long-term chat history with Zep.
Note: You will need to provide the user’s session_id to use this retriever.
More on Zep: Zep provides long-term conversation storage for LLM apps. The server stores, summarizes, embeds, indexes, and enriches conversational AI chat histories, and exposes them via simple, low-latency APIs.
For server installation instructions, see: https://getzep.github.io/deployment/quickstart/
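A minimal sketch of the documented constructor, assuming a Zep server running locally and an existing chat session.

```python
from langchain.retrievers import ZepRetriever

# Assumes a local Zep server and an existing chat session.
retriever = ZepRetriever(
    session_id="user-session-123",
    url="http://localhost:8000",
    top_k=5,
)
docs = retriever.get_relevant_documents("travel plans")
```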