
Vectara

Overview

Vectara is the trusted AI Assistant and Agent platform, focused on enterprise readiness for mission-critical applications. Vectara's serverless RAG-as-a-service provides all the components of RAG behind an easy-to-use API, including:

  1. A way to extract text from files (PDF, PPT, DOCX, etc.)
  2. ML-based chunking that provides state-of-the-art performance.
  3. The Boomerang embeddings model.
  4. Its own internal vector database where text chunks and embedding vectors are stored.
  5. A query service that automatically encodes the query into an embedding and retrieves the most relevant text segments, including support for Hybrid Search as well as multiple reranking options such as the multilingual relevance reranker, MMR, and the UDF reranker.
  6. An LLM for creating a generative summary, based on the retrieved documents (context), including citations.

For more information, see the Vectara documentation.

This notebook shows how to use Vectara's Chat functionality, which provides automatic storage of conversation history and ensures that follow-up questions take that history into account.

Setup

To use Vectara with LangChain, you first need to install the langchain-vectara partner package.

!uv pip install -U pip && uv pip install -qU langchain-vectara

Getting Started

To get started, use the following steps:

  1. If you don't already have one, Sign up for your free Vectara trial.
  2. Within your account you can create one or more corpora. Each corpus represents an area that stores text data ingested from input documents. To create a corpus, use the "Create Corpus" button, then provide a name and a description for your corpus. Optionally you can define filtering attributes and apply some advanced options. If you click on your created corpus, you can see its name and corpus key right at the top.
  3. Next you'll need to create API keys to access the corpus. Click on the "Access Control" tab in the corpus view and then the "Create API Key" button. Give your key a name, and choose whether you want query-only or query+index for your key. Click "Create" and you now have an active API key. Keep this key confidential.

To use LangChain with Vectara, you'll need these two values: corpus_key and api_key. You can provide the API key to LangChain in two ways:

Instantiation

  1. Include the VECTARA_API_KEY variable in your environment.

    For example, you can set this variable using os.environ and getpass as follows:

import os
import getpass

os.environ["VECTARA_API_KEY"] = getpass.getpass("Vectara API Key:")
  2. Pass the API key to the Vectara vectorstore constructor:
vectara = Vectara(
    vectara_api_key=vectara_api_key
)

In this notebook we assume they are provided in the environment.

import os

os.environ["VECTARA_API_KEY"] = "<VECTARA_API_KEY>"
os.environ["VECTARA_CORPUS_KEY"] = "<VECTARA_CORPUS_KEY>"

from langchain_vectara import Vectara
from langchain_vectara.vectorstores import (
    CorpusConfig,
    GenerationConfig,
    MmrReranker,
    SearchConfig,
    VectaraQueryConfig,
)

Vectara Chat Explained

In most uses of LangChain to create chatbots, one must integrate a special memory component that maintains the history of chat sessions and then uses that history to ensure the chatbot is aware of conversation history.

With Vectara Chat, all of that is performed automatically in the Vectara backend. You can look at the Chat documentation to learn more about the internals of how this is implemented, but with LangChain all you have to do is turn that feature on in the Vectara vectorstore.

Let's see an example. First, we load the State of the Union (SOTU) document (remember: text extraction and chunking all occur automatically on the Vectara platform):

from langchain_community.document_loaders import TextLoader

loader = TextLoader("../document_loaders/example_data/state_of_the_union.txt")
documents = loader.load()

corpus_key = os.getenv("VECTARA_CORPUS_KEY")
vectara = Vectara.from_documents(documents, embedding=None, corpus_key=corpus_key)
API Reference: TextLoader
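
The from_documents call both indexes the documents into your Vectara corpus and returns a Vectara vectorstore instance. Since langchain-vectara implements the standard LangChain VectorStore interface, you should also be able to index additional raw text later with add_texts; a minimal sketch (depending on the package version, the corpus_key keyword may or may not be required):

# A sketch using the standard VectorStore add_texts method; Vectara performs
# chunking and embedding server-side, so no local embedding model is needed.
vectara.add_texts(["Some additional text to index into the corpus."], corpus_key=corpus_key)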

And now we create a Chat Runnable using the as_chat method:

# Configure summary generation
generation_config = GenerationConfig(
    max_used_search_results=7,
    response_language="eng",
    generation_preset_name="vectara-summary-ext-24-05-med-omni",
    enable_factual_consistency_score=True,
)
# Configure retrieval: query this corpus and rerank results with MMR
search_config = SearchConfig(
    corpora=[CorpusConfig(corpus_key=corpus_key, limit=25)],
    reranker=MmrReranker(diversity_bias=0.2),
)

config = VectaraQueryConfig(
    search=search_config,
    generation=generation_config,
)


bot = vectara.as_chat(config)

Invocation

Here's an example of asking a question with no chat history:

bot.invoke("What did the president say about Ketanji Brown Jackson?")["answer"]
'The president stated that nominating someone to serve on the United States Supreme Court is one of the most serious constitutional responsibilities. He nominated Circuit Court of Appeals Judge Ketanji Brown Jackson, describing her as one of the nation’s top legal minds who will continue Justice Breyer’s legacy of excellence and noting her experience as a former top litigator in private practice [1].'
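
The chat Runnable returns a dictionary, and the examples here read only its answer key. The exact set of keys may vary with the langchain-vectara version, so a simple exploratory sketch is to inspect the response directly (note that this call adds another turn to the stored conversation):

response = bot.invoke("What did the president say about Ketanji Brown Jackson?")
print(response.keys())  # see which fields are available besides "answer"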

Here's an example of asking a question with some chat history:

bot.invoke("Did he mention who she suceeded?")["answer"]
'Yes, the president mentioned that Ketanji Brown Jackson succeeded Justice Breyer [1].'
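
Because Vectara stores the conversation history in its backend, a multi-turn interaction is simply repeated invoke calls on the same chat Runnable; a minimal sketch reusing the two questions above:

# Each turn is a plain invoke; Vectara's backend carries the history forward
for question in [
    "What did the president say about Ketanji Brown Jackson?",
    "Did he mention who she succeeded?",
]:
    print(bot.invoke(question)["answer"])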

Chat with streaming

Of course, the chatbot interface also supports streaming. Instead of the invoke method, you simply use stream:

output = {}
curr_key = None
for chunk in bot.stream("What did he say about COVID-19?"):
    for key in chunk:
        # Accumulate each streamed field across chunks
        if key not in output:
            output[key] = chunk[key]
        else:
            output[key] += chunk[key]
        # Print answer tokens as they arrive
        if key == "answer":
            print(chunk[key], end="", flush=True)
        curr_key = key
The president acknowledged the significant impact of COVID-19 on the nation, expressing understanding of the public's fatigue and frustration. He emphasized the need to view COVID-19 not as a partisan issue but as a serious disease, urging unity among Americans. The president highlighted the progress made, noting that severe cases have decreased significantly, and mentioned new CDC guidelines allowing most Americans to be mask-free. He also pointed out the efforts to vaccinate the nation and provide economic relief, and the ongoing commitment to vaccinate the world [2], [3], [5].
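
If you only need the final text, you can collect just the answer chunks into a single string while streaming. A minimal sketch based on the chunk structure shown above (the question is illustrative):

# Accumulate only the "answer" chunks into one string
answer = "".join(
    chunk["answer"]
    for chunk in bot.stream("What did he say about inflation?")
    if "answer" in chunk
)
print(answer)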

Chaining

For additional capabilities, you can use chaining.

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0)

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that explains things to a five-year-old. Vectara is providing the answer.",
        ),
        ("human", "{vectara_response}"),
    ]
)


def get_vectara_response(question: dict) -> str:
    """
    Calls the Vectara chat Runnable and returns the answer string. This
    encapsulates the Vectara call.
    """
    try:
        response = bot.invoke(question["question"])
        return response["answer"]
    except Exception:
        return "I'm sorry, I couldn't get an answer from Vectara."


# Create the chain
chain = get_vectara_response | prompt | llm | StrOutputParser()


# Invoke the chain
result = chain.invoke({"question": "What did he say about COVID-19?"})
print(result)
So, the president talked about how the COVID-19 sickness has affected a lot of people in the country. He said that it's important for everyone to work together to fight the sickness, no matter what political party they are in. The president also mentioned that they are working hard to give vaccines to people to help protect them from getting sick. They are also giving money and help to people who need it, like food, housing, and cheaper health insurance. The president also said that they are sending vaccines to many other countries to help people all around the world stay healthy.
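
Note that LCEL coerces the plain get_vectara_response function into a Runnable when it is piped into the chain. If you prefer to make the wrapping explicit, you can use RunnableLambda, which behaves identically:

from langchain_core.runnables import RunnableLambda

# Equivalent chain with the function wrapped explicitly as a Runnable
chain = RunnableLambda(get_vectara_response) | prompt | llm | StrOutputParser()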

API reference

You can look at the Chat documentation for the details.

