
OpenAI

Let's load the OpenAIEmbeddings class.

Setup

First, install langchain-openai and set the required environment variable, OPENAI_API_KEY.

%pip install -qU langchain-openai
import getpass
import os

# Prompt for the API key without echoing it to the terminal.
os.environ["OPENAI_API_KEY"] = getpass.getpass()
from langchain_openai import OpenAIEmbeddings

# Create the embedding client and a sample text to embed.
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
text = "This is a test document."
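
When a notebook is re-run, it can be convenient to prompt for the key only if it is not already set. The following is a minimal sketch of that idea using only the standard library. The helper name `ensure_api_key` and its `prompt_fn` parameter are hypothetical and are not part of langchain-openai.

```python
import getpass
import os


def ensure_api_key(var_name="OPENAI_API_KEY", prompt_fn=getpass.getpass):
    """Set the environment variable from a hidden prompt only if it is missing.

    `ensure_api_key` and `prompt_fn` are illustrative names, not library API.
    """
    if not os.environ.get(var_name):
        # Only prompt when the variable is unset or empty.
        os.environ[var_name] = prompt_fn(f"Enter {var_name}: ")
    return os.environ[var_name]
```

Calling `ensure_api_key()` before constructing `OpenAIEmbeddings` would leave an existing key untouched and prompt otherwise.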

Usage

Embed query

query_result = embeddings.embed_query(text)
Warning: model not found. Using cl100k_base encoding.
query_result[:5]
[-0.014380056377383358,
-0.027191711627651764,
-0.020042716111860304,
0.057301379620345545,
-0.022267658631828974]

Embed documents

doc_result = embeddings.embed_documents([text])
Warning: model not found. Using cl100k_base encoding.
doc_result[0][:5]
[-0.014380056377383358,
-0.027191711627651764,
-0.020042716111860304,
0.057301379620345545,
-0.022267658631828974]
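
The query and document outputs above show the same leading values for the same text. One common way to compare two embeddings is cosine similarity. The sketch below is a plain-Python illustration, not something provided by langchain-openai. The function name `cosine_similarity` is an assumption for this example, and the sample vector reuses only the five printed components rather than a full embedding.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Illustrative values only: the first five components printed above,
# not a complete 3072-dimensional embedding.
sample = [
    -0.014380056377383358,
    -0.027191711627651764,
    -0.020042716111860304,
    0.057301379620345545,
    -0.022267658631828974,
]
print(cosine_similarity(sample, sample))  # very close to 1 for identical vectors
```

With real results, `cosine_similarity(query_result, doc_result[0])` would compare the query embedding against the first document embedding.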

Specify dimensions

With the text-embedding-3 class of models, you can specify the size of the embeddings you want returned. For example, by default text-embedding-3-large returns embeddings of dimension 3072:

len(doc_result[0])
3072

Passing dimensions=1024 reduces the size of the embeddings to 1024:

embeddings_1024 = OpenAIEmbeddings(model="text-embedding-3-large", dimensions=1024)
len(embeddings_1024.embed_documents([text])[0])
Warning: model not found. Using cl100k_base encoding.
1024
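
If full-size embeddings have already been stored, one possible way to shorten them afterwards is to keep the leading components and rescale the result to unit length. The sketch below shows that idea in plain Python. The helper name `truncate_and_normalize` is hypothetical, and it is an assumption that the output approximates what the `dimensions` parameter returns. The two are not guaranteed to match exactly, so requesting the size up front as shown above is the safer route.

```python
import math


def truncate_and_normalize(vector, dimensions):
    """Keep the first `dimensions` components and rescale to unit L2 length.

    Hypothetical helper for illustration; not part of langchain-openai.
    """
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be rescaled; return it unchanged.
        return list(head)
    return [x / norm for x in head]
```

For example, `truncate_and_normalize(doc_result[0], 1024)` would produce a 1024-component vector from the 3072-component result.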
