
PullMdLoader

Loader for converting URLs into Markdown using the pull.md service.

This package implements a document loader for web content. Unlike traditional web scrapers, PullMdLoader can handle web pages built with dynamic JavaScript frameworks like React, Angular, or Vue.js, converting them into Markdown without local rendering.

Overview

Integration details

Class | Package | Local | Serializable | JS Support
PullMdLoader | langchain-pull-md | ✅ | ✅ | ❌

Setup

Installation

pip install langchain-pull-md

Initialization

from langchain_pull_md.markdown_loader import PullMdLoader

# Instantiate the loader with a URL
loader = PullMdLoader(url="https://example.com")

Load

documents = loader.load()
documents[0].metadata
{'source': 'https://example.com',
'page_content': '# Example Domain\nThis domain is used for illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.'}
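
Once the Markdown is available as a string, a common next step is to break it into sections before indexing or summarizing it. The following is a minimal sketch using only the standard library; the helper name split_markdown_sections is hypothetical and is not part of langchain-pull-md, and the sample text is taken from the output above.

```python
import re
from typing import List, Tuple


def split_markdown_sections(markdown: str) -> List[Tuple[str, str]]:
    """Split Markdown into (heading, body) pairs at ATX headings (# to ######).

    Hypothetical helper for illustration. It does not special-case fenced
    code blocks, so a line starting with '#' inside a fence is treated as a heading.
    """
    sections: List[Tuple[str, str]] = []
    heading = ""  # text before the first heading is stored under an empty heading
    body_lines: List[str] = []
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # Close the previous section before starting a new one.
            if heading or body_lines:
                sections.append((heading, "\n".join(body_lines).strip()))
            heading = match.group(2).strip()
            body_lines = []
        else:
            body_lines.append(line)
    if heading or body_lines:
        sections.append((heading, "\n".join(body_lines).strip()))
    return sections


sample = "# Example Domain\nThis domain is used for illustrative examples in documents."
sections = split_markdown_sections(sample)
for title, body in sections:
    print(f"{title!r}: {len(body)} characters")
```

Splitting on headings keeps each chunk aligned with the page's own structure, which tends to work better for retrieval than fixed-size chunks.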

Lazy Load

No lazy loading is implemented. PullMdLoader performs a real-time conversion of the provided URL into Markdown format whenever the load method is called.
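
When working with many URLs, the conversions can still be deferred at the caller's side by wrapping load in a generator, so each page is only converted when its documents are actually consumed. This is a sketch under that assumption, not a feature of the package: iter_documents and make_loader are hypothetical names, and the only loader behavior relied on is the constructor and load method shown above.

```python
from typing import Any, Callable, Iterable, Iterator


def iter_documents(
    urls: Iterable[str],
    make_loader: Callable[[str], Any],
) -> Iterator[Any]:
    """Yield documents one URL at a time.

    make_loader is any callable that returns an object with a load() method,
    for example: lambda url: PullMdLoader(url=url)
    """
    for url in urls:
        loader = make_loader(url)
        # load() runs only when the generator reaches this URL.
        yield from loader.load()
```

A typical call would look like iter_documents(urls, lambda url: PullMdLoader(url=url)). Because the function is a generator, no request is made until the first document is requested.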

API reference:

