Apify

Apify is a cloud platform for web scraping and data extraction, which provides an ecosystem of more than a thousand ready-made apps called Actors for various scraping, crawling, and extraction use cases.

Apify Actors

This integration enables you to run Actors on the Apify platform and load their results into LangChain to feed your vector indexes with documents and data from the web, e.g. to generate answers from websites with documentation, blogs, or knowledge bases.

Installation and Setup

  • Install the Apify API client for Python with pip install apify-client
  • Get your Apify API token and either set it as an environment variable (APIFY_API_TOKEN) or pass it to the ApifyWrapper as apify_api_token in the constructor.
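The two ways of supplying the token described above can be sketched as a small lookup. `resolve_apify_token` is a hypothetical helper written for illustration, not part of LangChain or the Apify client; it simply prefers an explicitly passed token and otherwise falls back to the `APIFY_API_TOKEN` environment variable.

```python
import os


def resolve_apify_token(apify_api_token=None):
    """Hypothetical helper: prefer an explicit token, else APIFY_API_TOKEN."""
    token = apify_api_token or os.environ.get("APIFY_API_TOKEN")
    if not token:
        raise ValueError(
            "No Apify API token found: set APIFY_API_TOKEN or pass one explicitly."
        )
    return token
```

The returned value could then be passed to the wrapper as `apify_api_token` in its constructor.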

Utility

You can use the ApifyWrapper to run Actors on the Apify platform.

from langchain_community.utilities import ApifyWrapper

For a more detailed walkthrough of this wrapper, see this notebook.
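As a rough sketch of running an Actor through the wrapper, the example below assumes the `apify/website-content-crawler` Actor, an input shaped as `{"startUrls": [{"url": ...}]}`, and result items that carry `text` and `url` fields. None of these come from this page, so check the Actor's own input and output schema before relying on them. `build_run_input` and `crawl_to_loader` are hypothetical names used only for illustration.

```python
def build_run_input(start_url):
    """Build the Actor input; the "startUrls" shape is an assumption."""
    return {"startUrls": [{"url": start_url}]}


def crawl_to_loader(start_url, actor_id="apify/website-content-crawler"):
    """Run an Actor and return a loader over the dataset it produced."""
    # Imported lazily so this sketch can be loaded even where the optional
    # dependencies (langchain-community, apify-client) are not installed.
    from langchain_community.utilities import ApifyWrapper
    from langchain_core.documents import Document

    apify = ApifyWrapper()  # expects APIFY_API_TOKEN in the environment
    return apify.call_actor(
        actor_id=actor_id,
        run_input=build_run_input(start_url),
        # Assumed item fields: "text" for the page body, "url" for its source.
        dataset_mapping_function=lambda item: Document(
            page_content=item.get("text") or "",
            metadata={"source": item.get("url")},
        ),
    )
```

Calling `crawl_to_loader("https://example.com").load()` would start a real Actor run on your Apify account, so it needs a valid token and network access.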

Document loader

You can also use our ApifyDatasetLoader to load data from an existing Apify dataset.

from langchain_community.document_loaders import ApifyDatasetLoader

For a more detailed walkthrough of this loader, see this notebook.
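A minimal sketch of loading an existing dataset might look like the following. It assumes each dataset item is a dictionary with `text` and `url` fields, which depends entirely on the Actor that produced the dataset. `item_to_fields` and `load_dataset` are hypothetical helpers for illustration, and the dataset ID is yours to supply.

```python
def item_to_fields(item):
    """Map one dataset item to (page_content, metadata); field names are assumed."""
    text = item.get("text") or ""
    metadata = {"source": item.get("url", "")}
    return text, metadata


def load_dataset(dataset_id):
    """Load an Apify dataset into a list of LangChain documents."""
    # Imported lazily so this sketch can be loaded even where the optional
    # dependencies (langchain-community, apify-client) are not installed.
    from langchain_community.document_loaders import ApifyDatasetLoader
    from langchain_core.documents import Document

    def to_document(item):
        text, metadata = item_to_fields(item)
        return Document(page_content=text, metadata=metadata)

    loader = ApifyDatasetLoader(
        dataset_id=dataset_id,
        dataset_mapping_function=to_document,
    )
    return loader.load()
```

Keeping the mapping in a plain function makes it easy to adapt when your dataset uses different field names.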

