Image captions

This notebook shows how to use the ImageCaptionLoader to generate a queryable index of image captions.

By default, the loader uses the pre-trained Salesforce BLIP image captioning model.
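
For orientation, here is a rough sketch of captioning a single image with BLIP directly through the transformers library. It is not the loader's actual code. The checkpoint name `Salesforce/blip-image-captioning-base`, the `"an image of"` prompt prefix, and the helper name `caption_image` are assumptions made for illustration, and the function also needs `torch`, `pillow`, and `requests` installed.

```python
# Assumed checkpoint name; swap in another BLIP captioning checkpoint if needed.
BLIP_MODEL = "Salesforce/blip-image-captioning-base"


def caption_image(image_url, model_name=BLIP_MODEL):
    """Return a BLIP-generated caption for the image at image_url (hypothetical helper)."""
    # Heavy dependencies are imported lazily so that defining the function stays cheap.
    import io

    import requests
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    # Downloads the processor and model weights on first use.
    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)

    # Fetch the image bytes and decode them as an RGB image.
    response = requests.get(image_url, timeout=30)
    response.raise_for_status()
    image = Image.open(io.BytesIO(response.content)).convert("RGB")

    # Conditional captioning: the text prompt is an assumed prefix for the caption.
    inputs = processor(image, "an image of", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=50)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Calling `caption_image(url)` on any reachable image URL should return a short caption string. The model is reloaded on every call here, which keeps the sketch short but would be wasteful for many images.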

#!pip install transformers
from langchain.document_loaders import ImageCaptionLoader

Prepare a list of image URLs from Wikimedia

list_image_urls = [

Create the loader

loader = ImageCaptionLoader(path_images=list_image_urls)
list_docs = loader.load()
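
Each loaded document carries a caption as its text plus some metadata about the source image. The helper below is a small sketch for listing them side by side. The function name `caption_table` is hypothetical, the `"image_path"` metadata key is an assumption about what the loader records, and the stand-in documents use made-up values so the snippet runs without downloading a model.

```python
from types import SimpleNamespace


def caption_table(docs):
    """Return (image_path, caption) pairs for a list of loaded documents."""
    rows = []
    for doc in docs:
        # "image_path" is an assumed metadata key; fall back if it is absent.
        source = doc.metadata.get("image_path", "<unknown>")
        rows.append((source, doc.page_content))
    return rows


# Stand-in documents with made-up values, shaped like the loader's output.
fake_docs = [
    SimpleNamespace(
        page_content="an image of a frog on a flower",
        metadata={"image_path": "https://example.org/frog.jpg"},
    ),
    SimpleNamespace(page_content="an image of a painting", metadata={}),
]

for path, caption in caption_table(fake_docs):
    print(f"{path}: {caption}")
```

With the real loader, the equivalent call would be `caption_table(list_docs)`.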

# Preview the first image
import requests
from PIL import Image

Image.open(requests.get(list_image_urls[0], stream=True).raw).convert("RGB")

Create the index

from langchain.indexes import VectorstoreIndexCreator

index = VectorstoreIndexCreator().from_loaders([loader])
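
To make the idea of a queryable caption index more concrete, here is a toy stand-in that ranks captions by word overlap with a query. It is only an illustration: VectorstoreIndexCreator relies on an embedding model and a vector store rather than word counts, and the captions and the function name `rank_captions` below are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_captions(query, captions):
    """Return (score, caption) pairs sorted from most to least similar."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(c))), c) for c in captions]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up captions standing in for the loader's output.
captions = [
    "an image of a frog sitting on a flower",
    "an image of a painting of a starry night sky",
    "an image of a jet flying over a city",
]

best_score, best_caption = rank_captions("What's the painting about?", captions)[0]
print(best_caption)
```

Word overlap misses synonyms and paraphrases, which is the main reason a real index uses learned embeddings instead.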


query = "What's the painting about?"
index.query(query)
query = "What kind of images are there?"
index.query(query)