TesseractBlobParser
- class langchain_community.document_loaders.parsers.images.TesseractBlobParser(*, langs: Iterable[str] = ('eng',))
Parser for extracting text from images using the Tesseract OCR library.
Initialize the TesseractBlobParser.
- Parameters:
langs (list[str]) – The languages to use for OCR.
Methods
__init__(*[, langs]) – Initialize the TesseractBlobParser.
lazy_parse(blob) – Lazily parse a blob and yield Documents containing the parsed content.
parse(blob) – Eagerly parse the blob into a document or documents.
- __init__(*, langs: Iterable[str] = ('eng',))
Initialize the TesseractBlobParser.
- Parameters:
langs (list[str]) – The languages to use for OCR.
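As an illustration of what langs stands for: the Tesseract command line accepts several languages joined with "+" (for example eng+fra). The sketch below assumes the parser forwards langs to Tesseract in that form; the helper name to_tesseract_lang is hypothetical and not part of the library.

```python
from typing import Iterable


def to_tesseract_lang(langs: Iterable[str] = ("eng",)) -> str:
    """Join language codes into the '+'-separated form Tesseract accepts."""
    # Hypothetical helper: assumes the parser hands langs to Tesseract this way.
    return "+".join(langs)


print(to_tesseract_lang(("eng", "fra")))  # eng+fra
```

Each code must correspond to a language data file installed alongside Tesseract, otherwise OCR is likely to fail at parse time.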
- lazy_parse(blob: Blob) → Iterator[Document]
Lazily parse a blob and yield Documents containing the parsed content.
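A minimal usage sketch, assuming langchain-community, pytesseract, Pillow, and the Tesseract binary are installed. The wrapper ocr_image is a hypothetical name introduced here for illustration; the Blob import path is the one exported by langchain_core.document_loaders.

```python
from typing import Iterable, List


def ocr_image(path: str, langs: Iterable[str] = ("eng",)) -> List[str]:
    """Return the OCR text of each Document parsed from the image at `path`."""
    # Imported inside the function so this module loads even when the
    # optional OCR dependencies are not installed.
    from langchain_community.document_loaders.parsers.images import (
        TesseractBlobParser,
    )
    from langchain_core.document_loaders import Blob

    parser = TesseractBlobParser(langs=langs)
    blob = Blob.from_path(path)
    # lazy_parse returns an iterator; Documents are produced as it is consumed.
    return [doc.page_content for doc in parser.lazy_parse(blob)]
```

Calling ocr_image("scan.png", langs=("eng", "fra")), where scan.png is a placeholder file name, would return the recognized text as a list of strings. Use parse(blob) instead when the Documents are wanted eagerly rather than through an iterator.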