Crawler

class langchain.chains.natbot.crawler.Crawler

A crawler for web pages.

Security Note: This is an implementation of a crawler that uses a browser via Playwright.

This crawler can be used to load arbitrary webpages INCLUDING content from the local file system.

Control who can submit crawling requests and what network access the crawler has.

Scope permissions to the minimum necessary for the application.

See https://python.langchain.com/docs/security for more information.
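One way to act on the security note is to validate every URL before it reaches go_to_page. The following is a minimal sketch, not part of the Crawler API: the names ALLOWED_SCHEMES, ALLOWED_HOSTS, and is_url_allowed are hypothetical, and the host list is a placeholder to be replaced with whatever the application actually needs.

```python
from urllib.parse import urlparse

# Hypothetical policy values; adjust them to the application's real needs.
ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_HOSTS = {"example.com", "docs.example.com"}


def is_url_allowed(url: str) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    # Rejects file://, data:, and any URL that carries no scheme at all.
    if parsed.scheme.lower() not in ALLOWED_SCHEMES:
        return False
    # hostname is None for URLs without a network location.
    host = (parsed.hostname or "").lower()
    return host in ALLOWED_HOSTS
```

A caller would check is_url_allowed(url) and refuse the request when it returns False, before invoking go_to_page(url). A check like this complements, but does not replace, restricting the crawler's network and file system access at the operating system or container level.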

Methods

__init__()

click(id)

crawl()

enter()

go_to_page(url)

scroll(direction)

type(id, text)

__init__() → None
Return type:

None

click(id: str | int) → None
Parameters:

id (str | int) –

Return type:

None

crawl() → List[str]
Return type:

List[str]

enter() → None
Return type:

None

go_to_page(url: str) → None
Parameters:

url (str) –

Return type:

None

scroll(direction: str) → None
Parameters:

direction (str) –

Return type:

None

type(id: str | int, text: str) → None
Parameters:
  • id (str | int) –

  • text (str) –

Return type:

None
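The signatures above can be combined into a simple interaction flow. The sketch below is illustrative only: CrawlerLike, submit_text, and FakeCrawler are hypothetical names, and the call order (crawl before type) is an assumption rather than documented behavior. FakeCrawler records calls instead of driving a browser so that the example runs without Playwright; in real use an instance of langchain.chains.natbot.crawler.Crawler would be passed in its place.

```python
from typing import List, Protocol, Tuple, Union


class CrawlerLike(Protocol):
    """Structural type mirroring the method signatures documented above."""

    def go_to_page(self, url: str) -> None: ...
    def crawl(self) -> List[str]: ...
    def click(self, id: Union[str, int]) -> None: ...
    def type(self, id: Union[str, int], text: str) -> None: ...
    def enter(self) -> None: ...
    def scroll(self, direction: str) -> None: ...


def submit_text(
    crawler: CrawlerLike, url: str, element_id: Union[str, int], text: str
) -> List[str]:
    """Open a page, type into one element, press Enter, and re-crawl."""
    crawler.go_to_page(url)
    # Assumption: element ids refer to the most recent crawl, so crawl first.
    crawler.crawl()
    crawler.type(element_id, text)
    crawler.enter()
    return crawler.crawl()


class FakeCrawler:
    """Hypothetical stand-in that records calls; it opens no browser."""

    def __init__(self) -> None:
        self.calls: List[Tuple] = []

    def go_to_page(self, url: str) -> None:
        self.calls.append(("go_to_page", url))

    def crawl(self) -> List[str]:
        self.calls.append(("crawl",))
        # Placeholder value; the real output format is not described here.
        return ["placeholder element"]

    def click(self, id: Union[str, int]) -> None:
        self.calls.append(("click", id))

    def type(self, id: Union[str, int], text: str) -> None:
        self.calls.append(("type", id, text))

    def enter(self) -> None:
        self.calls.append(("enter",))

    def scroll(self, direction: str) -> None:
        self.calls.append(("scroll", direction))


fake = FakeCrawler()
result = submit_text(fake, "https://example.com", 0, "hello")
```

Typing the helper against a Protocol keeps it testable with a fake while still accepting the real Crawler, since both expose the same method names and parameters.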