Google Generative AI Embeddings

Connect to Google's generative AI embeddings service using the GoogleGenerativeAIEmbeddings class, found in the langchain-google-genai package.

Installation

%pip install --upgrade --quiet  langchain-google-genai

Credentials

import getpass
import os

if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Provide your Google API key here")

Usage

from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector = embeddings.embed_query("hello, world!")
vector[:5]
[0.05636945, 0.0048285457, -0.0762591, -0.023642512, 0.05329321]

Batch

You can also embed multiple strings at once for a processing speedup:

vectors = embeddings.embed_documents(
    [
        "Today is Monday",
        "Today is Tuesday",
        "Today is April Fools day",
    ]
)
len(vectors), len(vectors[0])
(3, 768)

Task type

GoogleGenerativeAIEmbeddings optionally supports a task_type, which currently must be one of:

  • task_type_unspecified
  • retrieval_query
  • retrieval_document
  • semantic_similarity
  • classification
  • clustering

By default, we use retrieval_document in the embed_documents method and retrieval_query in the embed_query method. If you provide a task type, we will use that for all methods.
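The defaulting rule described above can be sketched as a small pure function. This only illustrates the stated behavior and is not the library's implementation; `resolve_task_type` and `DEFAULT_TASK_TYPES` are hypothetical names introduced for this example.

```python
from typing import Optional

# Hypothetical helper: mirrors the rule stated in the text, not the library's code.
DEFAULT_TASK_TYPES = {
    "embed_documents": "retrieval_document",
    "embed_query": "retrieval_query",
}


def resolve_task_type(method: str, configured: Optional[str] = None) -> str:
    """Return the task type that would apply to a call of `method`."""
    if configured is not None:
        # An explicitly provided task type applies to every method.
        return configured
    # Otherwise fall back to the per-method default.
    return DEFAULT_TASK_TYPES[method]


print(resolve_task_type("embed_query"))
print(resolve_task_type("embed_documents", configured="clustering"))
```

With no task type configured, each method gets its own default; once one is configured, both methods share it.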

%pip install --upgrade --quiet  matplotlib scikit-learn
Note: you may need to restart the kernel to use updated packages.
query_embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001", task_type="retrieval_query"
)
doc_embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001", task_type="retrieval_document"
)

# query, query_2, and answer_1 are strings assumed to be defined earlier in the notebook.
# All of these will be embedded with the 'retrieval_query' task set
query_vecs = [query_embeddings.embed_query(q) for q in [query, query_2, answer_1]]

# All of these will be embedded with the 'retrieval_document' task set
doc_vecs = [doc_embeddings.embed_query(q) for q in [query, query_2, answer_1]]

In retrieval, relative distance matters. In the image above, you can see the difference in similarity scores between the "relevant doc" and the "similar query", with a stronger delta between the similar query and the relevant doc in the latter case.
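One way to put a number on this relative distance is cosine similarity. The sketch below is a minimal, dependency-free version. The choice of metric is an assumption (the text does not say which one the plot used), `cosine_similarity` and `similarity_delta` are hypothetical helper names, and the 2-D vectors are toy values rather than real embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def similarity_delta(
    query_vec: Sequence[float],
    similar_vec: Sequence[float],
    relevant_vec: Sequence[float],
) -> float:
    """Similarity to the relevant doc minus similarity to the similar query."""
    return cosine_similarity(query_vec, relevant_vec) - cosine_similarity(
        query_vec, similar_vec
    )


# Toy vectors made up for illustration; real embeddings would come from
# embed_query / embed_documents.
query_vec = [1.0, 0.0]
similar_vec = [0.0, 1.0]
relevant_vec = [1.0, 0.0]
print(similarity_delta(query_vec, similar_vec, relevant_vec))
```

A larger delta means the relevant document stands out more clearly from the merely similar text, which is what a retriever needs.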

Additional Configuration

You can pass the following parameters to GoogleGenerativeAIEmbeddings to customize the SDK's behavior:

  • client_options: Client Options to pass to the Google API Client, such as a custom client_options["api_endpoint"]
  • transport: The transport method to use, such as rest, grpc, or grpc_asyncio.
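As a rough sketch, the two options above could be gathered into constructor keyword arguments like this. `embedding_kwargs` is a hypothetical helper, and the endpoint string is a placeholder, not a real Google endpoint. The constructor call itself is left commented out because it needs the package installed and a valid API key.

```python
from typing import Any, Dict, Optional


def embedding_kwargs(
    transport: str = "rest", api_endpoint: Optional[str] = None
) -> Dict[str, Any]:
    """Build keyword arguments for the embeddings constructor (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "model": "models/embedding-001",
        "transport": transport,
    }
    if api_endpoint is not None:
        # client_options is forwarded to the Google API client.
        kwargs["client_options"] = {"api_endpoint": api_endpoint}
    return kwargs


# Placeholder endpoint for illustration only.
kwargs = embedding_kwargs(transport="rest", api_endpoint="example-endpoint.invalid")
print(kwargs)

# With langchain-google-genai installed and GOOGLE_API_KEY set:
# from langchain_google_genai import GoogleGenerativeAIEmbeddings
# embeddings = GoogleGenerativeAIEmbeddings(**kwargs)
```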
