Beautiful Soup

Beautiful Soup is a Python package for parsing HTML and XML documents, including documents with malformed markup such as unclosed tags (the name comes from "tag soup"). It builds a parse tree for each parsed page, which can then be used to extract data from the HTML. This makes it useful for web scraping.
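As a rough illustration of the parse tree described above, the following sketch parses a small, deliberately malformed HTML string and pulls out a heading and some links. The sample markup and the variable names are made up for this example; it assumes `beautifulsoup4` is installed and uses Python's built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the <li> and <p> tags are never closed.
html = """
<html><body>
<h1>Fruit</h1>
<ul>
  <li><a href="/apple">Apple</a>
  <li><a href="/pear">Pear</a>
</ul>
<p>Unclosed paragraph
</body></html>
"""

# Build the parse tree with the standard-library parser backend.
soup = BeautifulSoup(html, "html.parser")

# Navigate the tree: first <h1> element, then every <a> element.
title = soup.h1.get_text()
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(title)
print(links)
```

Other parser backends (for example `lxml`) may repair the unclosed tags differently, but the heading and link extraction shown here should not depend on that.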

Installation and Setup

pip install beautifulsoup4

Document Transformer

See a usage example.

from langchain_community.document_transformers import BeautifulSoupTransformer
