BlackboardLoader

class langchain_community.document_loaders.blackboard.BlackboardLoader(blackboard_course_url: str, bbrouter: str, load_all_recursively: bool = True, basic_auth: Tuple[str, str] | None = None, cookies: dict | None = None, continue_on_failure: bool = False, show_progress: bool = True)

Load a Blackboard course.

This loader does not work with every Blackboard course: it supports only courses that use the new Blackboard interface. It also requires the BbRouter cookie, which you can obtain by logging into the course and copying the value of the BbRouter cookie from the browser's developer tools.

Example

from langchain_community.document_loaders import BlackboardLoader

loader = BlackboardLoader(
    blackboard_course_url="https://blackboard.example.com/webapps/blackboard/execute/announcement?method=search&context=course_entry&course_id=_123456_1",
    bbrouter="expires:12345...",
)
documents = loader.load()

Initialize with the Blackboard course URL.

The BbRouter cookie is required for most Blackboard courses.

Parameters:

blackboard_course_url (str) – Blackboard course URL.
bbrouter (str) – Value of the BbRouter cookie.
load_all_recursively (bool) – If True, load all documents recursively. Default: True
basic_auth (Tuple[str, str] | None) – Basic auth credentials. Default: None
cookies (dict | None) – Additional cookies to send with requests. Default: None
continue_on_failure (bool) – Whether to continue loading the course if an error occurs while loading a URL, emitting a warning instead of raising an exception. Setting this to True makes the loader more robust, but may also result in missing data. Default: False
show_progress (bool) – Whether to show a progress bar while loading. Default: True

Raises:

ValueError – If the Blackboard course URL is invalid.
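
The optional parameters can be combined as in the following sketch. It assumes the cookie value is kept in an environment variable named BB_ROUTER (a name chosen here purely for illustration), and it defers the import so the module loads even where langchain-community is not installed. `read_bbrouter` and `build_loader` are hypothetical helpers, not part of the library.

```python
import os
from typing import Mapping


def read_bbrouter(env: Mapping[str, str], name: str = "BB_ROUTER") -> str:
    """Return the BbRouter cookie value from an environment-like mapping."""
    value = env.get(name, "").strip()
    if not value:
        raise KeyError(f"{name} is not set; copy the BbRouter cookie from your browser")
    return value


def build_loader(course_url: str, env: Mapping[str, str] = os.environ):
    """Construct a BlackboardLoader with non-default options."""
    # Imported lazily: requires langchain-community (and BeautifulSoup4).
    from langchain_community.document_loaders import BlackboardLoader

    return BlackboardLoader(
        blackboard_course_url=course_url,
        bbrouter=read_bbrouter(env),
        load_all_recursively=True,  # load all documents recursively
        continue_on_failure=True,   # warn instead of raising on a failed URL
        show_progress=False,        # no progress bar, e.g. in batch jobs
    )
```

Calling `build_loader(course_url).load()` would then return the course documents; keeping the cookie out of source code avoids committing a session credential by accident.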

Attributes

web_path

Methods

`__init__`(blackboard_course_url, bbrouter[, ...])	Initialize with the Blackboard course URL.
`alazy_load`()	Async lazy load text from the url(s) in web_path.
`aload`()	Load text from the URLs in web_path asynchronously into Documents (deprecated).
`ascrape_all`(urls[, parser])	Async fetch all urls, then return soups for all results.
`check_bs4`()	Check if BeautifulSoup4 is installed.
`download`(path)	Download a file from a URL.
`fetch_all`(urls)	Fetch all urls concurrently with rate limiting.
`lazy_load`()	Lazy load text from the url(s) in web_path.
`load`()	Load data into Document objects.
`load_and_split`([text_splitter])	Load Documents and split into chunks.
`parse_filename`(url)	Parse the filename from a URL.
`scrape`([parser])	Scrape data from webpage and return it in BeautifulSoup format.
`scrape_all`(urls[, parser])	Fetch all urls, then return soups for all results.

__init__(blackboard_course_url: str, bbrouter: str, load_all_recursively: bool = True, basic_auth: Tuple[str, str] | None = None, cookies: dict | None = None, continue_on_failure: bool = False, show_progress: bool = True)

Initialize with the Blackboard course URL.

The BbRouter cookie is required for most Blackboard courses.

Parameters:

blackboard_course_url (str) – Blackboard course URL.
bbrouter (str) – Value of the BbRouter cookie.
load_all_recursively (bool) – If True, load all documents recursively. Default: True
basic_auth (Tuple[str, str] | None) – Basic auth credentials. Default: None
cookies (dict | None) – Additional cookies to send with requests. Default: None
continue_on_failure (bool) – Whether to continue loading the course if an error occurs while loading a URL, emitting a warning instead of raising an exception. Setting this to True makes the loader more robust, but may also result in missing data. Default: False
show_progress (bool) – Whether to show a progress bar while loading. Default: True

Raises:

ValueError – If the Blackboard course URL is invalid.
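
The trade-off described for continue_on_failure can be illustrated with a small stand-alone sketch. This is not the loader's own code: `load_each` and `fake_fetch` are hypothetical names, and the sketch uses `warnings.warn` where the library may report failures differently (for example through logging).

```python
import warnings
from typing import Callable, Iterable, List


def load_each(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    continue_on_failure: bool = False,
) -> List[str]:
    """Fetch every URL, either failing fast or warning and moving on."""
    results: List[str] = []
    for url in urls:
        try:
            results.append(fetch(url))
        except Exception as exc:
            if not continue_on_failure:
                raise  # default: the first failure aborts the whole load
            # Robust mode: report the problem and skip this URL.
            warnings.warn(f"Error fetching {url}: {exc}")
    return results


def fake_fetch(url: str) -> str:
    """Stand-in for a network call; fails for URLs containing 'bad'."""
    if "bad" in url:
        raise ConnectionError("simulated failure")
    return f"content of {url}"
```

With `continue_on_failure=True`, a failing URL produces a warning and is simply absent from the result, which is the "missing data" risk the parameter description mentions.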

async alazy_load() → AsyncIterator[Document]

Async lazy load text from the URL(s) in web_path.

Return type: AsyncIterator[Document]
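
Because alazy_load returns an async iterator, it is consumed with `async for`. The sketch below shows one way to stop after a fixed number of documents; `collect` is a hypothetical helper, and `fake_alazy_load` is a stand-in generator so that the example runs without network access.

```python
import asyncio
from typing import Any, AsyncIterator, List


async def collect(documents: AsyncIterator[Any], limit: int) -> List[Any]:
    """Gather at most `limit` items from an async iterator, then stop."""
    collected: List[Any] = []
    if limit <= 0:
        return collected
    async for doc in documents:
        collected.append(doc)
        if len(collected) >= limit:
            break
    return collected


async def fake_alazy_load() -> AsyncIterator[str]:
    """Stand-in for loader.alazy_load() so the sketch runs offline."""
    for i in range(5):
        await asyncio.sleep(0)  # yield control, as a real fetch would
        yield f"doc-{i}"


first_two = asyncio.run(collect(fake_alazy_load(), limit=2))
```

With a constructed loader, the same helper would be called as `asyncio.run(collect(loader.alazy_load(), limit=10))`.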

aload() → List[Document]

Deprecated since version 0.3.14: See API reference for updated usage: https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.web_base.WebBaseLoader.html It will not be removed until langchain-community==1.0.

Asynchronously load text from the URLs in web_path into Documents.

Return type: List[Document]

async ascrape_all(urls: List[str], parser: str | None = None) → List[Any]

Async fetch all urls, then return soups for all results.

Parameters:

urls (List[str])
parser (str | None)

Return type:

List[Any]

check_bs4() → None

Check if BeautifulSoup4 is installed.

Raises: ImportError – If BeautifulSoup4 is not installed.
Return type: None

download(path: str) → None

Download a file from a URL.

Parameters: path (str) – Path to the file.
Return type: None

async fetch_all(urls: List[str]) → Any

Fetch all URLs concurrently with rate limiting.

Parameters: urls (List[str])
Return type: Any
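
As a general illustration of fetching concurrently under a limit, the sketch below bounds the number of in-flight requests with `asyncio.Semaphore`. This is one common throttling technique and is not necessarily how fetch_all is implemented; `fetch_all_limited`, `fake_get`, and `max_concurrent` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fetch_all_limited(
    urls: List[str],
    fetch: Callable[[str], Awaitable[str]],
    max_concurrent: int = 2,
) -> List[str]:
    """Run `fetch` for every URL, with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(url: str) -> str:
        async with semaphore:  # wait here while the limit is reached
            return await fetch(url)

    # gather returns results in the same order as `urls`.
    return await asyncio.gather(*(guarded(u) for u in urls))


in_flight = 0
peak = 0


async def fake_get(url: str) -> str:
    """Stand-in for an HTTP request that records peak concurrency."""
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)
    in_flight -= 1
    return url.upper()


pages = asyncio.run(fetch_all_limited(["a", "b", "c", "d"], fake_get))
```

All four tasks are scheduled at once, but the semaphore ensures no more than two run the simulated request at the same time.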

lazy_load() → Iterator[Document]

Lazy load text from the URL(s) in web_path.

Return type: Iterator[Document]
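
An iterator returned by lazy_load can be consumed partially, for example to preview a few documents. The sketch below uses `itertools.islice`; `first_documents` and `FakeLoader` are hypothetical, and how much work a real loader actually defers depends on its implementation.

```python
from itertools import islice
from typing import Any, Iterator, List


def first_documents(loader: Any, n: int) -> List[Any]:
    """Return the first `n` documents without exhausting the iterator."""
    return list(islice(loader.lazy_load(), n))


class FakeLoader:
    """Stand-in exposing lazy_load() so the sketch runs offline."""

    def __init__(self) -> None:
        self.yielded = 0  # counts how many documents were produced

    def lazy_load(self) -> Iterator[str]:
        for i in range(100):
            self.yielded += 1
            yield f"doc-{i}"


fake = FakeLoader()
preview = first_documents(fake, 3)
```

Here only three of the hundred possible items are ever produced, which is the benefit of iterating lazily instead of calling load() and slicing the resulting list.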

load() → List[Document]

Load data into Document objects.

Returns: List of Documents.
Return type: List[Document]

load_and_split(text_splitter: TextSplitter | None = None) → list[Document]

Load Documents and split into chunks. Chunks are returned as Documents.

Do not override this method. It should be considered deprecated.

Parameters: text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.
Returns: List of Documents.
Return type: list[Document]
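
Since this method should be considered deprecated, one alternative is to load and split in two explicit steps. The sketch below assumes a splitter object that exposes `split_documents`, as LangChain's TextSplitter classes do; `load_then_split`, `WordSplitter`, and `FakeLoader` are hypothetical stand-ins so that the example runs without any dependencies.

```python
from typing import Any, List


def load_then_split(loader: Any, splitter: Any) -> List[Any]:
    """Load documents first, then split them as a separate, explicit step."""
    documents = loader.load()
    return splitter.split_documents(documents)


class WordSplitter:
    """Toy splitter: one chunk per word. Real splitters return Documents."""

    def split_documents(self, documents: List[str]) -> List[str]:
        return [word for doc in documents for word in doc.split()]


class FakeLoader:
    """Stand-in exposing load() so the sketch runs offline."""

    def load(self) -> List[str]:
        return ["alpha beta", "gamma"]


chunks = load_then_split(FakeLoader(), WordSplitter())
```

Keeping the two steps separate also makes it easy to inspect or filter the loaded documents before they are split.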

parse_filename(url: str) → str

Parse the filename from a URL.

Parameters: url (str) – URL to parse the filename from.
Returns: The filename.
Return type: str
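
To give a sense of what extracting a filename from a URL involves, the sketch below takes the last path segment and percent-decodes it using only the standard library. It is an illustration under that assumption, not the loader's own parsing logic, and `filename_from_url` is a hypothetical function.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def filename_from_url(url: str) -> str:
    """Return the last path segment of `url`, percent-decoded."""
    # urlparse separates the path from the query string and fragment.
    path = unquote(urlparse(url).path)
    name = PurePosixPath(path).name
    if not name:
        raise ValueError(f"Could not parse a filename from {url!r}")
    return name
```

For example, `filename_from_url("https://blackboard.example.com/files/Lecture%201.pdf?x=1")` yields `"Lecture 1.pdf"`, while a URL with no path segment raises ValueError.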

scrape(parser: str | None = None) → Any

Scrape data from the webpage and return it in BeautifulSoup format.

Parameters: parser (str | None)
Return type: Any

scrape_all(urls: List[str], parser: str | None = None) → List[Any]

Fetch all URLs, then return soups for all results.

Parameters:

urls (List[str])
parser (str | None)

Return type:

List[Any]

Examples using BlackboardLoader

Blackboard