
Text Embeddings Inference

Hugging Face Text Embeddings Inference (TEI) is a toolkit for deploying and serving open-source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5.

To use it within LangChain, first install langchain-huggingface (which provides the embeddings client used below) along with huggingface-hub.

%pip install --upgrade huggingface-hub langchain-huggingface

Then expose an embedding model using TEI. For instance, using Docker, you can serve BAAI/bge-large-en-v1.5 as follows:

model=BAAI/bge-large-en-v1.5
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

# use the ghcr.io/huggingface/text-embeddings-inference image tag that matches your hardware (CPU and CUDA variants are published)
docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:latest --model-id $model
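Before wiring the server into LangChain, you can sanity-check it directly: TEI exposes a POST /embed route that accepts a JSON body with an "inputs" field (a string or a list of strings). A minimal sketch of that payload, assuming the port mapping above (the curl line in the comment is the equivalent command-line check):

```python
import json

# Equivalent command-line check against the running container:
#   curl 127.0.0.1:8080/embed -X POST -H 'Content-Type: application/json' -d '{"inputs": ["What is deep learning?"]}'
# The response is a JSON array of embedding vectors, one per input string.
payload = {"inputs": ["What is deep learning?"]}
body = json.dumps(payload)
```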

Finally, instantiate the client and embed your texts.

from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings

embeddings = HuggingFaceEndpointEmbeddings(model="http://localhost:8080")
text = "What is deep learning?"

query_result = embeddings.embed_query(text)
query_result[:3]  # first three dimensions of the embedding
# [0.018113142, 0.00302585, -0.049911194]

doc_result = embeddings.embed_documents([text])  # returns one vector per document
doc_result[0][:3]
# [0.018113142, 0.00302585, -0.049911194]
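Once you have query and document vectors, a typical next step is ranking documents by similarity to the query. A minimal sketch of cosine similarity over such vectors; the three-dimensional vectors here are placeholders standing in for real embed_query/embed_documents output, not actual TEI embeddings:

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); embeddings come back as plain Python
    # lists of floats, so no numpy is required for a quick check.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Placeholder vectors; in practice these would be
# embeddings.embed_query(...) and embeddings.embed_documents(...).
query_vec = [0.1, 0.2, 0.3]
doc_vecs = [[0.1, 0.2, 0.3], [0.3, -0.2, 0.1]]

scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
best = max(range(len(scores)), key=lambda i: scores[i])
```

Here best is the index of the document most similar to the query; identical vectors score 1.0.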
