AssemblyAIAudioTranscriptLoader
- class langchain_community.document_loaders.assemblyai.AssemblyAIAudioTranscriptLoader(file_path: str | Path, *, transcript_format: TranscriptFormat = TranscriptFormat.TEXT, config: assemblyai.TranscriptionConfig | None = None, api_key: str | None = None)
Load AssemblyAI audio transcripts.
It uses the AssemblyAI API to transcribe audio files and loads the transcribed text into one or more Documents, depending on the specified format.
To use, you should have the assemblyai Python package installed and the environment variable ASSEMBLYAI_API_KEY set with your API key. Alternatively, the API key can be passed as an argument.
Audio files can be specified via a URL or a local file path.
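A minimal sketch of the setup described above, assuming the assemblyai and langchain-community packages are installed. The helper names resolve_api_key and transcribe_to_documents are hypothetical; only the loader class and its file_path and api_key arguments come from this reference.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else ASSEMBLYAI_API_KEY."""
    key = explicit_key or os.environ.get("ASSEMBLYAI_API_KEY")
    if not key:
        raise ValueError("Set ASSEMBLYAI_API_KEY or pass an API key explicitly.")
    return key


def transcribe_to_documents(file_path: str, api_key: Optional[str] = None):
    """Transcribe a URL or local audio file and return the loaded Documents."""
    # Imported lazily so this module can be imported without the packages installed.
    from langchain_community.document_loaders.assemblyai import (
        AssemblyAIAudioTranscriptLoader,
    )

    loader = AssemblyAIAudioTranscriptLoader(
        file_path=file_path,
        api_key=resolve_api_key(api_key),
    )
    # load() transcribes the audio and returns a list of Document objects.
    return loader.load()
```

Calling transcribe_to_documents with a URL or a local path (for example a placeholder such as "./meeting.mp3") would return the transcript as a list of Documents; the call needs network access and a valid key.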
Initializes the AssemblyAI AudioTranscriptLoader.
- Parameters:
file_path (Union[str, Path]) – A URL or a local file path.
transcript_format (TranscriptFormat) – Transcript format to use. See class TranscriptFormat for more info.
config (Optional[assemblyai.TranscriptionConfig]) – Transcription options and features. If None is given, the Transcriber’s default configuration will be used.
api_key (Optional[str]) – AssemblyAI API key.
Methods
__init__(file_path, *[, transcript_format, ...]) – Initializes the AssemblyAI AudioTranscriptLoader.
alazy_load() – A lazy loader for Documents.
aload() – Load data into Document objects.
lazy_load() – Transcribes the audio file and loads the transcript into documents.
load() – Load data into Document objects.
load_and_split([text_splitter]) – Load Documents and split into chunks.
- __init__(file_path: str | Path, *, transcript_format: TranscriptFormat = TranscriptFormat.TEXT, config: assemblyai.TranscriptionConfig | None = None, api_key: str | None = None)
Initializes the AssemblyAI AudioTranscriptLoader.
- Parameters:
file_path (Union[str, Path]) – A URL or a local file path.
transcript_format (TranscriptFormat) – Transcript format to use. See class TranscriptFormat for more info.
config (Optional[assemblyai.TranscriptionConfig]) – Transcription options and features. If None is given, the Transcriber’s default configuration will be used.
api_key (Optional[str]) – AssemblyAI API key.
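The parameters above can be combined as in the following sketch. It assumes the assemblyai SDK and langchain-community are installed. The helper name build_sentence_loader is hypothetical, and TranscriptFormat.SENTENCES and the speaker_labels option are not listed on this page, so check them against the TranscriptFormat class and the AssemblyAI SDK before relying on them.

```python
from pathlib import Path
from typing import Union


def build_sentence_loader(file_path: Union[str, Path], speaker_labels: bool = True):
    """Hypothetical helper: configure a loader with a non-default format and config."""
    # Lazy imports keep this module importable when the packages are absent.
    import assemblyai as aai
    from langchain_community.document_loaders.assemblyai import (
        AssemblyAIAudioTranscriptLoader,
        TranscriptFormat,
    )

    # TranscriptionConfig carries the transcription options and features.
    # speaker_labels is assumed here as one such option.
    config = aai.TranscriptionConfig(speaker_labels=speaker_labels)

    return AssemblyAIAudioTranscriptLoader(
        file_path=file_path,
        # Assumed member: a format other than the default TranscriptFormat.TEXT.
        transcript_format=TranscriptFormat.SENTENCES,
        config=config,
        # api_key is omitted, so ASSEMBLYAI_API_KEY must be set in the environment.
    )
```

Leaving config as None instead would fall back to the Transcriber's default configuration, as described in the parameter list.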
- async alazy_load() → AsyncIterator[Document]
A lazy loader for Documents.
- Return type:
AsyncIterator[Document]
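Because alazy_load() returns an AsyncIterator, it is consumed with async for. The sketch below shows that pattern; collect_documents and FakeLoader are hypothetical names, and FakeLoader only stands in for a real loader so the example runs without network access.

```python
import asyncio
from typing import Any, List


async def collect_documents(loader: Any) -> List[Any]:
    """Drain loader.alazy_load() into a list, one item at a time."""
    documents = []
    async for document in loader.alazy_load():
        documents.append(document)
    return documents


class FakeLoader:
    """Stand-in for a real loader; yields plain strings instead of Documents."""

    async def alazy_load(self):
        for text in ("first chunk", "second chunk"):
            yield text


# asyncio.run drives the coroutine from synchronous code.
collected = asyncio.run(collect_documents(FakeLoader()))
```

With a real AssemblyAIAudioTranscriptLoader in place of FakeLoader, the collected items would be Document objects.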
- lazy_load() → Iterator[Document]
Transcribes the audio file and loads the transcript into documents.
It uses the AssemblyAI API to transcribe the audio file and blocks until the transcription is finished.
- Return type:
Iterator[Document]
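Since lazy_load() returns an iterator, a caller can take only the first few Documents. The following is a sketch of that pattern; preview_documents is a hypothetical helper that works with any object exposing lazy_load().

```python
from itertools import islice
from typing import Any, List


def preview_documents(loader: Any, limit: int = 3) -> List[Any]:
    """Return at most `limit` items from loader.lazy_load()."""
    # With the AssemblyAI loader, iteration blocks until transcription is finished.
    return list(islice(loader.lazy_load(), limit))
```

Note that limiting the number of Documents does not shorten the transcription itself, because the loader blocks until the whole transcript is available.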
- load_and_split(text_splitter: TextSplitter | None = None) → list[Document]
Load Documents and split into chunks. Chunks are returned as Documents.
Do not override this method. It should be considered deprecated.
- Parameters:
text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.
- Returns:
List of Documents.
- Return type:
list[Document]
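Given the deprecation note, one alternative is to load first and split in a separate step. This is a sketch of that alternative, not necessarily how load_and_split is implemented; load_then_split is a hypothetical helper and assumes the splitter exposes split_documents(), as LangChain text splitters do.

```python
from typing import Any, List


def load_then_split(loader: Any, text_splitter: Any) -> List[Any]:
    """Load all Documents, then split them into chunk Documents."""
    documents = loader.load()
    # split_documents returns the chunks as new Document objects.
    return text_splitter.split_documents(documents)
```

With real objects, text_splitter could be a RecursiveCharacterTextSplitter from the langchain_text_splitters package, configured with chunk_size and chunk_overlap values of your choosing.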