GrobidParser
- class langchain_community.document_loaders.parsers.grobid.GrobidParser(segment_sentences: bool, grobid_server: str = 'http://localhost:8070/api/processFulltextDocument')
Load article PDF files using Grobid.
Methods
__init__(segment_sentences[, grobid_server])
lazy_parse(blob): Lazy parsing interface.
parse(blob): Eagerly parse the blob into a document or documents.
process_xml(file_path, xml_data, ...): Process the XML file from Grobid.
- Parameters:
segment_sentences (bool)
grobid_server (str)
- __init__(segment_sentences: bool, grobid_server: str = 'http://localhost:8070/api/processFulltextDocument') -> None
- Parameters:
segment_sentences (bool)
grobid_server (str)
- Return type:
None
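A minimal usage sketch of the constructor, assuming a Grobid server is already reachable at the default URL from the signature above. The helper name `load_papers` and the constant `GROBID_URL` are hypothetical names introduced for illustration. Pairing the parser with `GenericLoader.from_filesystem` is one common way to drive it, not the only one.

```python
from typing import List

# Default endpoint copied from the signature above; change it if Grobid runs elsewhere.
GROBID_URL = "http://localhost:8070/api/processFulltextDocument"


def load_papers(folder: str, segment_sentences: bool = False) -> List:
    """Parse every PDF under `folder` with GrobidParser and return the documents.

    `load_papers` is a hypothetical helper, not part of LangChain.
    """
    # Imported inside the function so this module can be imported
    # even when langchain_community is not installed.
    from langchain_community.document_loaders.generic import GenericLoader
    from langchain_community.document_loaders.parsers import GrobidParser

    # Assumption: a Grobid server is listening at GROBID_URL.
    parser = GrobidParser(
        segment_sentences=segment_sentences,
        grobid_server=GROBID_URL,
    )
    loader = GenericLoader.from_filesystem(
        folder,
        glob="*",
        suffixes=[".pdf"],
        parser=parser,
    )
    return loader.load()
```

Calling `load_papers("papers/")`, where `papers/` is a placeholder folder name, would return the parsed documents for the PDF files found there.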
- lazy_parse(blob: Blob) -> Iterator[Document]
Lazy parsing interface.
Subclasses are required to implement this method.
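The difference between the lazy and eager interfaces can be sketched without a Grobid server. The classes below (`FakeBlob`, `FakeDocument`, `ToyParser`) are hypothetical stand-ins, not LangChain types. The sketch assumes the eager `parse` simply collects the output of `lazy_parse` into a list, which matches the one-line descriptions in the method summary.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class FakeBlob:
    """Hypothetical stand-in for a Blob: raw text plus a source label."""
    data: str
    source: str = "in-memory"


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class ToyParser:
    """Illustrative parser: yields one document per non-empty line."""

    def lazy_parse(self, blob: FakeBlob) -> Iterator[FakeDocument]:
        # Generator: no work happens until the caller starts iterating.
        for number, line in enumerate(blob.data.splitlines(), start=1):
            if line.strip():
                yield FakeDocument(
                    line.strip(),
                    {"source": blob.source, "line": number},
                )

    def parse(self, blob: FakeBlob) -> List[FakeDocument]:
        # Eager variant: exhaust the lazy iterator into a list.
        return list(self.lazy_parse(blob))


blob = FakeBlob("Abstract\n\nIntroduction\n")
parser = ToyParser()
docs = parser.parse(blob)               # all documents at once
first = next(parser.lazy_parse(blob))   # only the first document is produced
```

The lazy form is useful when a blob yields many documents and the caller wants to process them one at a time instead of holding the full list in memory.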