UnstructuredHtmlEvaluator

class langchain_community.document_loaders.url_playwright.UnstructuredHtmlEvaluator(
remove_selectors: List[str] | None = None,
)

Evaluate the page HTML content using the unstructured library and return the extracted text as a string.

Methods

__init__([remove_selectors])

Initialize UnstructuredHtmlEvaluator.

evaluate(page, browser, response)

Synchronously process the HTML content of the page.

evaluate_async(page, browser, response)

Asynchronously process the HTML content of the page.

Parameters:

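remove_selectors (List[str] | None) – CSS selectors for elements to remove from the page before its text is extracted. Defaults to None, in which case nothing is removed.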
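As a minimal sketch of typical use, the evaluator can be passed to PlaywrightURLLoader through its evaluator argument. The helper name load_with_unstructured, the example URL, and the selectors mentioned below are illustrative assumptions, not part of the library; the sketch also assumes langchain-community, playwright, and unstructured are installed.

```python
from typing import List


def load_with_unstructured(urls: List[str], remove_selectors: List[str]) -> List[str]:
    """Load each URL in a headless browser and return the extracted text.

    Hypothetical helper for illustration; not part of langchain_community.
    """
    # Imported lazily so that defining this function does not require the
    # optional dependencies (langchain-community, playwright, unstructured).
    from langchain_community.document_loaders import PlaywrightURLLoader
    from langchain_community.document_loaders.url_playwright import (
        UnstructuredHtmlEvaluator,
    )

    # Elements matching these CSS selectors are dropped before text extraction.
    evaluator = UnstructuredHtmlEvaluator(remove_selectors=remove_selectors)

    # Pass the selectors to the evaluator only; the loader's own
    # remove_selectors argument is left unset here.
    loader = PlaywrightURLLoader(urls=urls, evaluator=evaluator)

    return [doc.page_content for doc in loader.load()]
```

For example, load_with_unstructured(["https://example.com"], ["header", "footer"]) would return one string per successfully loaded page, with header and footer elements stripped.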

__init__(
remove_selectors: List[str] | None = None,
)

Initialize UnstructuredHtmlEvaluator. Raises ImportError if the unstructured package is not installed.

Parameters:

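remove_selectors (List[str] | None) – CSS selectors for elements to remove from the page before its text is extracted. Defaults to None, in which case nothing is removed.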

evaluate(
page: Page,
browser: Browser,
response: Response,
) → str

Synchronously process the HTML content of the page.

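Parameters:
  • page (Page) – The Playwright page to process.

  • browser (Browser) – The browser instance that owns the page.

  • response (Response) – The response returned by page.goto().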

Return type:

str
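The synchronous call can be sketched directly against Playwright's sync API, roughly mirroring what a loader would do for each URL. The function name extract_text and the selectors are hypothetical, and the sketch assumes Chromium has been installed for Playwright (for example with `playwright install chromium`).

```python
def extract_text(url: str) -> str:
    """Open `url` in headless Chromium and run the evaluator on the page.

    Hypothetical helper for illustration; not part of langchain_community.
    """
    # Lazy imports: these are optional dependencies.
    from langchain_community.document_loaders.url_playwright import (
        UnstructuredHtmlEvaluator,
    )
    from playwright.sync_api import sync_playwright

    # Illustrative selectors; choose ones that match the target site.
    evaluator = UnstructuredHtmlEvaluator(remove_selectors=["nav", "footer"])

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            response = page.goto(url)
            # page.goto() may return None (e.g. for same-document navigation).
            if response is None:
                raise ValueError(f"page.goto() returned None for {url}")
            # evaluate() takes the page, its browser, and the goto() response.
            return evaluator.evaluate(page, browser, response)
        finally:
            browser.close()
```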

async evaluate_async(
page: AsyncPage,
browser: AsyncBrowser,
response: AsyncResponse,
) → str

Asynchronously process the HTML content of the page.

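Parameters:
  • page (AsyncPage) – The Playwright async page to process.

  • browser (AsyncBrowser) – The async browser instance that owns the page.

  • response (AsyncResponse) – The response returned by awaiting page.goto().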

Return type:

str
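The asynchronous variant can be sketched the same way with Playwright's async API. The coroutine name extract_text_async is a hypothetical helper, and the sketch again assumes playwright, unstructured, and langchain-community are installed.

```python
import asyncio


async def extract_text_async(url: str) -> str:
    """Open `url` in headless Chromium and await the evaluator's result.

    Hypothetical helper for illustration; not part of langchain_community.
    """
    # Lazy imports: these are optional dependencies.
    from langchain_community.document_loaders.url_playwright import (
        UnstructuredHtmlEvaluator,
    )
    from playwright.async_api import async_playwright

    evaluator = UnstructuredHtmlEvaluator()  # no selectors removed

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            response = await page.goto(url)
            # page.goto() may resolve to None; fail loudly in that case.
            if response is None:
                raise ValueError(f"page.goto() returned None for {url}")
            # evaluate_async() is a coroutine and must be awaited.
            return await evaluator.evaluate_async(page, browser, response)
        finally:
            await browser.close()
```

From synchronous code, the coroutine can be driven with asyncio.run(extract_text_async("https://example.com")).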