BS4HTMLParser
- class langchain_community.document_loaders.parsers.html.bs4.BS4HTMLParser(*, features: str = 'lxml', get_text_separator: str = '', **kwargs: Any)
Parse HTML files using Beautiful Soup.
Initialize a bs4 based HTML parser.
Methods
- __init__(*[, features, get_text_separator]): Initialize a bs4 based HTML parser.
- lazy_parse(blob): Load HTML document into document objects.
- parse(blob): Eagerly parse the blob into a document or documents.
- Parameters:
features (str)
get_text_separator (str)
kwargs (Any)
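The parameter descriptions are empty in this reference, so the following is only a sketch of what the names suggest: that `features` selects the Beautiful Soup parser backend and `get_text_separator` is the string placed between text fragments when text is extracted. It calls Beautiful Soup directly rather than `BS4HTMLParser`, and the helper `extract_text` and the constant `SAMPLE_HTML` are hypothetical names introduced for illustration. Note that the default `'lxml'` requires the separate `lxml` package, while `'html.parser'` ships with Python.

```python
SAMPLE_HTML = "<html><body><p>Hello</p><p>World</p></body></html>"


def extract_text(html: str, features: str = "html.parser", separator: str = "") -> str:
    """Return the visible text of an HTML string (illustrative helper)."""
    # Imported lazily so that a missing beautifulsoup4 install only
    # raises ImportError when the helper is actually called.
    from bs4 import BeautifulSoup

    # `features` picks the parser backend, e.g. "html.parser" or "lxml".
    soup = BeautifulSoup(html, features=features)
    # `separator` is inserted between adjacent text fragments.
    return soup.get_text(separator)
```

With an empty separator the two paragraphs run together (`extract_text(SAMPLE_HTML)` gives `"HelloWorld"`); passing `separator=" "` gives `"Hello World"`. If the class forwards `get_text_separator` in this way, a non-empty separator is usually the safer choice for downstream text processing.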
- __init__(*, features: str = 'lxml', get_text_separator: str = '', **kwargs: Any) -> None
Initialize a bs4 based HTML parser.
- Parameters:
features (str)
get_text_separator (str)
kwargs (Any)
- Return type:
None
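A minimal usage sketch, assuming `langchain_community` and `beautifulsoup4` are installed: it constructs the parser with the keyword-only arguments from the signature above and feeds it an in-memory blob through `lazy_parse`. The helper name `parse_html_string` is hypothetical, and importing `Blob` from `langchain_community.document_loaders.blob_loaders` is an assumption about where the blob class is exposed in the installed version.

```python
from typing import Any, List


def parse_html_string(html: str, separator: str = " ") -> List[Any]:
    """Parse an in-memory HTML string into a list of document objects."""
    # Imported inside the function so a missing optional dependency only
    # raises ImportError when the helper is actually called.
    from langchain_community.document_loaders.blob_loaders import Blob
    from langchain_community.document_loaders.parsers.html.bs4 import BS4HTMLParser

    # "html.parser" ships with Python; the default "lxml" needs the lxml package.
    parser = BS4HTMLParser(features="html.parser", get_text_separator=separator)
    blob = Blob.from_data(html, mime_type="text/html")
    # lazy_parse returns an iterator; list() consumes it eagerly.
    return list(parser.lazy_parse(blob))
```

For example, `parse_html_string("<html><body><p>Hello</p></body></html>")[0].page_content` should contain the extracted text. `parser.parse(blob)` is listed above as the eager counterpart of `lazy_parse`, so it can replace the `list(...)` call.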