🦜🔗 LangChain 0.0.194

Weaviate Hybrid Search

Weaviate is an open-source vector database.

Hybrid search combines multiple search algorithms to improve the accuracy and relevance of search results. It draws on the strengths of both keyword-based search and vector search.

Hybrid search in Weaviate uses both sparse and dense vectors to represent the meaning and context of search queries and documents: sparse vectors capture exact keyword matches, while dense vectors capture semantic similarity.
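To make the idea concrete, here is a toy sketch of blending a keyword score with a vector-similarity score. It is not Weaviate's fusion algorithm: the function names, the min-max normalization, and the sample scores are all assumptions made for illustration. The `alpha` weight follows the convention where 0 means keyword-only and 1 means vector-only.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two signals are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(
    keyword_scores: Dict[str, float],
    vector_scores: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Blend both score sets; alpha=0 is keyword-only, alpha=1 is vector-only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    # A document missing from one signal contributes 0 for that signal.
    fused = {
        doc_id: (1 - alpha) * kw.get(doc_id, 0.0) + alpha * vec.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    # Highest score first; ties are broken by id so the order is stable.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores for three documents (illustrative values only).
keyword_scores = {"doc_a": 2.0, "doc_b": 6.0}
vector_scores = {"doc_a": 0.9, "doc_b": 0.3, "doc_c": 0.6}
ranking = hybrid_rank(keyword_scores, vector_scores, alpha=0.5)
print(ranking)
```

Moving `alpha` toward 0 favors documents with strong keyword overlap (`doc_b` here), while moving it toward 1 favors documents that are semantically close to the query (`doc_a`).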

This notebook shows how to use Weaviate hybrid search as a LangChain retriever.

Set up the retriever:

#!pip install weaviate-client
import os

import weaviate

# Connection details are read from environment variables.
WEAVIATE_URL = os.getenv("WEAVIATE_URL")
client = weaviate.Client(
    url=WEAVIATE_URL,
    auth_client_secret=weaviate.AuthApiKey(api_key=os.getenv("WEAVIATE_API_KEY")),
    additional_headers={
        # Forwarded to Weaviate so its OpenAI vectorizer module can embed text.
        "X-Openai-Api-Key": os.getenv("OPENAI_API_KEY"),
    },
)

# client.schema.delete_all()
from langchain.retrievers.weaviate_hybrid_search import WeaviateHybridSearchRetriever
from langchain.schema import Document
retriever = WeaviateHybridSearchRetriever(
    client, index_name="LangChain", text_key="text"
)
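The setup reads three environment variables with `os.getenv`, which silently returns `None` when a variable is unset. A small pre-flight check can surface that earlier. The sketch below is an optional addition, not part of the original notebook; `require_env` is a hypothetical helper name and the sample values are placeholders.

```python
import os
from typing import Dict, Iterable, Mapping


def require_env(
    names: Iterable[str], env: Mapping[str, str] = os.environ
) -> Dict[str, str]:
    """Return the requested variables, or raise listing every missing one."""
    values = {name: env.get(name, "") for name in names}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return values


# The three variables read by the setup code.
REQUIRED = ("WEAVIATE_URL", "WEAVIATE_API_KEY", "OPENAI_API_KEY")

# Demonstration with a stand-in mapping instead of the real environment.
fake_env = {
    "WEAVIATE_URL": "https://example.invalid",
    "OPENAI_API_KEY": "sk-placeholder",
}
try:
    require_env(REQUIRED, env=fake_env)
except RuntimeError as err:
    print(err)  # Missing environment variables: WEAVIATE_API_KEY
```

In a real session you would call `require_env(REQUIRED)` with the default environment before constructing the client.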

Add some data:

docs = [
    Document(
        metadata={
            "title": "Embracing The Future: AI Unveiled",
            "author": "Dr. Rebecca Simmons",
        },
        page_content="A comprehensive analysis of the evolution of artificial intelligence, from its inception to its future prospects. Dr. Simmons covers ethical considerations, potentials, and threats posed by AI.",
    ),
    Document(
        metadata={
            "title": "Symbiosis: Harmonizing Humans and AI",
            "author": "Prof. Jonathan K. Sterling",
        },
        page_content="Prof. Sterling explores the potential for harmonious coexistence between humans and artificial intelligence. The book discusses how AI can be integrated into society in a beneficial and non-disruptive manner.",
    ),
    Document(
        metadata={"title": "AI: The Ethical Quandary", "author": "Dr. Rebecca Simmons"},
        page_content="In her second book, Dr. Simmons delves deeper into the ethical considerations surrounding AI development and deployment. It is an eye-opening examination of the dilemmas faced by developers, policymakers, and society at large.",
    ),
    Document(
        metadata={
            "title": "Conscious Constructs: The Search for AI Sentience",
            "author": "Dr. Samuel Cortez",
        },
        page_content="Dr. Cortez takes readers on a journey exploring the controversial topic of AI consciousness. The book provides compelling arguments for and against the possibility of true AI sentience.",
    ),
    Document(
        metadata={
            "title": "Invisible Routines: Hidden AI in Everyday Life",
            "author": "Prof. Jonathan K. Sterling",
        },
        page_content="In his follow-up to 'Symbiosis', Prof. Sterling takes a look at the subtle, unnoticed presence and influence of AI in our everyday lives. It reveals how AI has become woven into our routines, often without our explicit realization.",
    ),
]
retriever.add_documents(docs)
['eda16d7d-437d-4613-84ae-c2e38705ec7a',
 '04b501bf-192b-4e72-be77-2fbbe7e67ebf',
 '18a1acdb-23b7-4482-ab04-a6c2ed51de77',
 '88e82cc3-c020-4b5a-b3c6-ca7cf3fc6a04',
 'f6abd9d5-32ed-46c4-bd08-f8d0f7c9fc95']
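The list above holds one UUID per stored document. As a rough mental model, each document can be thought of as a single object whose properties are the text (under `text_key`) plus the metadata keys, which is what lets a filter target `author` later on. The sketch below illustrates that flattening under this assumption; `Doc`, `to_properties`, and `prepare_objects` are hypothetical names, not the retriever's actual internals, and the sample text is abridged.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Doc:
    """Stand-in for langchain.schema.Document (hypothetical, for illustration)."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def to_properties(doc: Doc, text_key: str = "text") -> Dict[str, Any]:
    """Flatten a document into one property dict: the text plus metadata keys."""
    if text_key in doc.metadata:
        raise ValueError(f"metadata key {text_key!r} collides with the text key")
    return {text_key: doc.page_content, **doc.metadata}


def prepare_objects(
    docs: List[Doc], text_key: str = "text"
) -> List[Tuple[str, Dict[str, Any]]]:
    """Pair every document with a freshly generated UUID string."""
    return [(str(uuid.uuid4()), to_properties(doc, text_key)) for doc in docs]


sample = Doc(
    page_content="Prof. Sterling explores the potential for harmonious coexistence.",
    metadata={
        "title": "Symbiosis: Harmonizing Humans and AI",
        "author": "Prof. Jonathan K. Sterling",
    },
)
for object_id, properties in prepare_objects([sample]):
    print(object_id, sorted(properties))
```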

Do a hybrid search:

retriever.get_relevant_documents("the ethical implications of AI")
[Document(page_content='In her second book, Dr. Simmons delves deeper into the ethical considerations surrounding AI development and deployment. It is an eye-opening examination of the dilemmas faced by developers, policymakers, and society at large.', metadata={}),
 Document(page_content='A comprehensive analysis of the evolution of artificial intelligence, from its inception to its future prospects. Dr. Simmons covers ethical considerations, potentials, and threats posed by AI.', metadata={}),
 Document(page_content="In his follow-up to 'Symbiosis', Prof. Sterling takes a look at the subtle, unnoticed presence and influence of AI in our everyday lives. It reveals how AI has become woven into our routines, often without our explicit realization.", metadata={}),
 Document(page_content='Prof. Sterling explores the potential for harmonious coexistence between humans and artificial intelligence. The book discusses how AI can be integrated into society in a beneficial and non-disruptive manner.', metadata={})]

Do a hybrid search with a where filter:

retriever.get_relevant_documents(
    "AI integration in society",
    where_filter={
        "path": ["author"],
        "operator": "Equal",
        "valueString": "Prof. Jonathan K. Sterling",
    },
)
[Document(page_content='Prof. Sterling explores the potential for harmonious coexistence between humans and artificial intelligence. The book discusses how AI can be integrated into society in a beneficial and non-disruptive manner.', metadata={}),
 Document(page_content="In his follow-up to 'Symbiosis', Prof. Sterling takes a look at the subtle, unnoticed presence and influence of AI in our everyday lives. It reveals how AI has become woven into our routines, often without our explicit realization.", metadata={})]
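The `where_filter` argument is a plain dictionary, so filters can be assembled with small helpers. The sketch below rebuilds the filter used above and shows how two conditions might be combined with Weaviate's `And` operator and its `operands` list. The helper names `equal_text` and `all_of` are hypothetical, and the title condition is only an illustration.

```python
from typing import Any, Dict


def equal_text(prop: str, value: str) -> Dict[str, Any]:
    """Build an exact-match condition on a text property."""
    return {"path": [prop], "operator": "Equal", "valueString": value}


def all_of(*operands: Dict[str, Any]) -> Dict[str, Any]:
    """Combine conditions so that every one of them must hold."""
    if len(operands) < 2:
        raise ValueError("all_of needs at least two conditions")
    return {"operator": "And", "operands": list(operands)}


# Same filter as in the search above.
by_author = equal_text("author", "Prof. Jonathan K. Sterling")

# Illustrative combination: a given author and a given title.
author_and_title = all_of(
    by_author,
    equal_text("title", "Symbiosis: Harmonizing Humans and AI"),
)
print(author_and_title)
```

Either dictionary could then be passed as `where_filter=...` to `retriever.get_relevant_documents`.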

By Harrison Chase

© Copyright 2023, Harrison Chase.

Last updated on Jun 08, 2023.