YoutubeLoader

class langchain_community.document_loaders.youtube.YoutubeLoader(video_id: str, add_video_info: bool = False, language: str | Sequence[str] = 'en', translation: str | None = None, transcript_format: TranscriptFormat = TranscriptFormat.TEXT, continue_on_failure: bool = False, chunk_size_seconds: int = 120)

Load YouTube video transcripts.

Initialize with YouTube video ID.

Methods

__init__(video_id[, add_video_info, ...])

Initialize with YouTube video ID.

alazy_load()

A lazy loader for Documents.

aload()

Load data into Document objects.

extract_video_id(youtube_url)

Extract video ID from common YouTube URLs.

from_youtube_url(youtube_url, **kwargs)

Given a YouTube URL, construct a loader.

lazy_load()

A lazy loader for Documents.

load()

Load YouTube transcripts into Document objects.

load_and_split([text_splitter])

Load Documents and split into chunks.

Parameters:
  • video_id (str)
  • add_video_info (bool)
  • language (Union[str, Sequence[str]])
  • translation (Optional[str])
  • transcript_format (TranscriptFormat)
  • continue_on_failure (bool)
  • chunk_size_seconds (int)

__init__(video_id: str, add_video_info: bool = False, language: str | Sequence[str] = 'en', translation: str | None = None, transcript_format: TranscriptFormat = TranscriptFormat.TEXT, continue_on_failure: bool = False, chunk_size_seconds: int = 120)

Initialize with YouTube video ID.

Parameters:
  • video_id (str)
  • add_video_info (bool)
  • language (str | Sequence[str])
  • translation (str | None)
  • transcript_format (TranscriptFormat)
  • continue_on_failure (bool)
  • chunk_size_seconds (int)
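
This page lists chunk_size_seconds next to transcript_format but does not describe how the value is applied. As a rough illustration only, assuming a chunked transcript format groups timed caption pieces into fixed windows of chunk_size_seconds, the grouping could look like the sketch below. Piece and chunk_by_window are hypothetical names used for illustration; they are not part of the library.

```python
from dataclasses import dataclass


@dataclass
class Piece:
    """Hypothetical stand-in for one timed caption line."""
    start: float  # offset from the beginning of the video, in seconds
    text: str


def chunk_by_window(pieces, window_seconds=120):
    """Group caption pieces into fixed windows of `window_seconds`.

    Returns a list of (window_start_seconds, joined_text) tuples in time order.
    This is an assumed behaviour, not the loader's actual algorithm.
    """
    windows = {}
    for piece in pieces:
        index = int(piece.start // window_seconds)  # which window the piece falls in
        windows.setdefault(index, []).append(piece.text)
    return [
        (index * window_seconds, " ".join(texts))
        for index, texts in sorted(windows.items())
    ]


pieces = [Piece(0, "a"), Piece(50, "b"), Piece(130, "c"), Piece(250, "d")]
chunks = chunk_by_window(pieces, window_seconds=120)
```

With the default of 120, pieces starting at 0 s and 50 s share the first window, while a piece starting at 130 s falls into the second.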

async alazy_load() → AsyncIterator[Document]

A lazy loader for Documents.

Return type:

AsyncIterator[Document]

async aload() → list[Document]

Load data into Document objects.

Return type:

list[Document]
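
The two async methods differ only in how results are delivered: alazy_load() yields items one at a time, while aload() returns them all in a list. The sketch below shows that consumption pattern with a hypothetical stand-in class (FakeLoader, yielding plain strings instead of Document objects); it makes no claim about how the real loader fetches transcripts.

```python
import asyncio
from typing import AsyncIterator


class FakeLoader:
    """Hypothetical stand-in that mirrors the two async method names on this page."""

    def __init__(self, texts):
        self._texts = list(texts)

    async def alazy_load(self) -> AsyncIterator[str]:
        for text in self._texts:
            await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
            yield text

    async def aload(self) -> list:
        # Eager variant: drain the lazy iterator into a list.
        return [item async for item in self.alazy_load()]


async def main():
    loader = FakeLoader(["first", "second"])
    streamed = []
    async for item in loader.alazy_load():  # process items as they arrive
        streamed.append(item)
    collected = await loader.aload()  # or wait for everything at once
    return streamed, collected


streamed, collected = asyncio.run(main())
```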

static extract_video_id(youtube_url: str) → str

Extract video ID from common YouTube URLs.

Parameters:

youtube_url (str)

Return type:

str
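
To make "common YouTube URLs" concrete, here is a minimal sketch of how an ID might be pulled out of watch, short-link, and embed URLs using only the standard library. It is an illustration under assumptions, not the library's implementation: sketch_extract_video_id is a hypothetical name, the 11-character length check is an assumption about ID shape, and unlike the documented method it returns None when nothing matches. The ID abcdefghijk is a placeholder.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def sketch_extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a few common URL shapes, or None."""
    parsed = urlparse(url)
    if parsed.scheme not in {"http", "https"}:
        return None
    if parsed.path == "/watch":
        # e.g. https://www.youtube.com/watch?v=<id>
        values = parse_qs(parsed.query).get("v")
        candidate = values[0] if values else None
    else:
        # e.g. https://youtu.be/<id> or https://www.youtube.com/embed/<id>
        candidate = parsed.path.rstrip("/").split("/")[-1] or None
    if candidate and len(candidate) == 11:  # assumed ID length
        return candidate
    return None


watch_id = sketch_extract_video_id("https://www.youtube.com/watch?v=abcdefghijk")
short_id = sketch_extract_video_id("https://youtu.be/abcdefghijk")
embed_id = sketch_extract_video_id("https://www.youtube.com/embed/abcdefghijk")
```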

classmethod from_youtube_url(youtube_url: str, **kwargs: Any) → YoutubeLoader

Given a YouTube URL, construct a loader. See the YoutubeLoader constructor for the accepted keyword arguments.

Parameters:
  • youtube_url (str)
  • kwargs (Any)

Return type:

YoutubeLoader
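
A typical call might look like the sketch below, which uses only the names and keyword arguments shown on this page. load_transcript is a hypothetical wrapper, and the import is done inside the function so the module can be imported even when the package is not installed.

```python
from typing import Sequence


def load_transcript(youtube_url: str, languages: Sequence[str] = ("en",)):
    """Build a loader from a URL and return its Documents (hypothetical helper)."""
    # Imported lazily so this module can be imported without the dependency.
    from langchain_community.document_loaders.youtube import YoutubeLoader

    loader = YoutubeLoader.from_youtube_url(
        youtube_url,
        # Keyword arguments are forwarded to the YoutubeLoader constructor.
        add_video_info=False,
        language=list(languages),
        continue_on_failure=False,
    )
    return loader.load()
```

Calling load_transcript with a real video URL needs network access and, presumably, whatever optional transcript dependency the loader relies on, so the function is left uncalled here.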

lazy_load() → Iterator[Document]

A lazy loader for Documents.

Return type:

Iterator[Document]

load() → List[Document]

Load YouTube transcripts into Document objects.

Return type:

List[Document]

load_and_split(text_splitter: TextSplitter | None = None) → list[Document]

Load Documents and split into chunks. Chunks are returned as Documents.

Do not override this method; it should be considered deprecated.

Parameters:

text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.

Returns:

List of Documents.

Return type:

list[Document]
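
The "load, then split, and return chunks as Documents" idea can be sketched without the library. The example below deliberately swaps in plain fixed-width character slicing, which is much simpler than RecursiveCharacterTextSplitter, and uses hypothetical stand-ins (Doc, split_docs) rather than the real Document and TextSplitter classes.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a Document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_docs(docs, chunk_size=20):
    """Slice each document's text into fixed-width pieces.

    Every piece becomes its own Doc and carries a copy of the parent's metadata.
    """
    chunks = []
    for doc in docs:
        text = doc.page_content
        for start in range(0, len(text), chunk_size):
            chunks.append(Doc(text[start:start + chunk_size], dict(doc.metadata)))
    return chunks


docs = [Doc("a" * 45, {"source": "x"})]
chunks = split_docs(docs, chunk_size=20)
```

Copying the metadata onto each chunk keeps every piece traceable to the document it came from.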
