GitHubIssuesLoader#

class langchain_community.document_loaders.github.GitHubIssuesLoader[source]#

Bases: BaseGitHubLoader

Load issues of a GitHub repository.

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

param access_token: str [Required]#

GitHub personal access token. See settings/tokens in your GitHub account.

param assignee: str | None = None#

Filter on assigned user. Pass ‘none’ for no user and ‘*’ for any user.

param creator: str | None = None#

Filter on the user that created the issue.

param direction: Literal['asc', 'desc'] | None = None#

The direction to sort the results by. Can be one of: ‘asc’, ‘desc’.

param github_api_url: str = 'https://api.github.com'#

URL of GitHub API

param include_prs: bool = True#

If True include Pull Requests in results, otherwise ignore them.

param labels: List[str] | None = None#

Label names to filter on. Example: bug,ui,@high.

param mentioned: str | None = None#

Filter on a user that’s mentioned in the issue.

param milestone: int | Literal['*', 'none'] | None = None#

If integer is passed, it should be a milestone’s number field. If the string ‘*’ is passed, issues with any milestone are accepted. If the string ‘none’ is passed, issues without milestones are returned.

param page: int | None = None#

The page number for paginated results. Defaults to 1 in the GitHub API.

param per_page: int | None = None#

Number of items per page. Defaults to 30 in the GitHub API.

param repo: str [Required]#

Name of repository

param since: str | None = None#

Only show issues updated after the given time. This is a timestamp in ISO 8601 format: YYYY-MM-DDTHH:MM:SSZ.
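The timestamp format above can be produced with the standard library. The following is a minimal sketch; the helper name `to_github_timestamp` is hypothetical and not part of the loader, and it assumes naive datetimes are already in UTC.

```python
from datetime import datetime, timedelta, timezone


def to_github_timestamp(moment: datetime) -> str:
    """Format a datetime as YYYY-MM-DDTHH:MM:SSZ (UTC, second precision)."""
    if moment.tzinfo is None:
        # Assumption: a naive datetime is treated as UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A fixed reference instant, and the point seven days before it.
reference = datetime(2024, 3, 15, 12, 30, 0, tzinfo=timezone.utc)
since = to_github_timestamp(reference - timedelta(days=7))
print(since)  # 2024-03-08T12:30:00Z
```

The resulting string can then be passed as the `since` argument.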

param sort: Literal['created', 'updated', 'comments'] | None = None#

What to sort results by. Can be one of: ‘created’, ‘updated’, ‘comments’. Default is ‘created’.

param state: Literal['open', 'closed', 'all'] | None = None#

Filter on issue state. Can be one of: ‘open’, ‘closed’, ‘all’.
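As an illustration of how the parameters above combine, the sketch below builds a loader for open issues labelled "bug", newest first, with pull requests excluded. The helper names `issue_filter_kwargs` and `load_issues` are hypothetical, and the `repo` value is only an example in "owner/name" form. The loader import is deferred into the function so the rest of the sketch runs without the package installed.

```python
from typing import Any, Dict


def issue_filter_kwargs() -> Dict[str, Any]:
    """Example filters: open issues labelled "bug", newest first, no PRs."""
    return {
        "repo": "langchain-ai/langchain",  # example repository, "owner/name" form
        "state": "open",
        "labels": ["bug"],
        "sort": "created",
        "direction": "desc",
        "include_prs": False,
    }


def load_issues(access_token: str):
    """Build the loader and fetch matching issues (requires network access)."""
    # Imported lazily so this sketch can be read and run without the package.
    from langchain_community.document_loaders import GitHubIssuesLoader

    loader = GitHubIssuesLoader(access_token=access_token, **issue_filter_kwargs())
    return loader.load()
```

Calling `load_issues("<your token>")` would return a list of Documents, one per matching issue.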

async alazy_load() AsyncIterator[Document]#

A lazy loader for Documents.

Return type:

AsyncIterator[Document]

async aload() list[Document]#

Load data into Document objects.

Return type:

list[Document]

lazy_load() Iterator[Document][source]#

Get issues of a GitHub repository.

Returns:

An iterator of Documents with the following attributes:

  • page_content

  • metadata
    • url

    • title

    • creator

    • created_at

    • last_update_time

    • closed_time

    • number of comments

    • state

    • labels

    • assignee

    • assignees

    • milestone

    • locked

    • number

    • is_pull_request

Return type:

Iterator[Document]

load() list[Document]#

Load data into Document objects.

Return type:

list[Document]

load_and_split(text_splitter: TextSplitter | None = None) list[Document]#

Load Documents and split into chunks. Chunks are returned as Documents.

Do not override this method. It should be considered to be deprecated!

Parameters:

text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.

Returns:

List of Documents.

Return type:

list[Document]

parse_issue(issue: dict) Document[source]#

Create a Document object from a single GitHub issue.

Parameters:

issue (dict)

Return type:

Document
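To make the mapping from a raw issue to a Document concrete, here is a rough sketch under stated assumptions. It is not the library's implementation: `SimpleDocument` is a hypothetical stand-in for Document, `sketch_parse_issue` is a hypothetical name, the metadata keys are a subset chosen for illustration, and the sample data is made up. The input field names (`html_url`, `user.login`, `pull_request`, and so on) follow the GitHub REST API issue payload.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for the Document class."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def sketch_parse_issue(issue: Dict[str, Any]) -> SimpleDocument:
    """Map one GitHub issue payload to content plus metadata."""
    assignee = issue.get("assignee")
    milestone = issue.get("milestone")
    metadata = {
        "url": issue["html_url"],
        "title": issue["title"],
        "creator": issue["user"]["login"],
        "created_at": issue["created_at"],
        "comments": issue["comments"],
        "state": issue["state"],
        "labels": [label["name"] for label in issue.get("labels", [])],
        "assignee": assignee["login"] if assignee else None,
        "milestone": milestone["title"] if milestone else None,
        "locked": issue["locked"],
        "number": issue["number"],
        # The issues endpoint marks pull requests with a "pull_request" key.
        "is_pull_request": "pull_request" in issue,
    }
    # The body may be null when an issue has no description.
    return SimpleDocument(page_content=issue.get("body") or "", metadata=metadata)


# Illustrative sample data, not a real API response.
sample = {
    "html_url": "https://github.com/octocat/Hello-World/issues/1347",
    "title": "Found a bug",
    "body": "I'm having a problem with this.",
    "user": {"login": "octocat"},
    "created_at": "2011-04-22T13:33:48Z",
    "comments": 0,
    "state": "open",
    "labels": [{"name": "bug"}],
    "assignee": None,
    "milestone": {"title": "v1.0"},
    "locked": False,
    "number": 1347,
}
doc = sketch_parse_issue(sample)
```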

property headers: Dict[str, str]#

property query_params: str#

Create query parameters for GitHub API.
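One plausible way to turn the filter fields into a query string is sketched below. This is an assumption about the general approach, not the library's actual code: the function name `sketch_query_params` is hypothetical, unset (None) filters are dropped, list values such as labels are joined with commas, and values are percent-encoded with `urllib.parse.urlencode`, which the real property may not do.

```python
from typing import Any, Dict
from urllib.parse import urlencode


def sketch_query_params(filters: Dict[str, Any]) -> str:
    """Build a query string, skipping filters that were left unset."""
    cleaned: Dict[str, Any] = {}
    for key, value in filters.items():
        if value is None:
            continue  # unset filters fall back to the API defaults
        if isinstance(value, list):
            value = ",".join(value)  # e.g. labels -> "bug,ui"
        cleaned[key] = value
    return urlencode(cleaned)


params = sketch_query_params(
    {"state": "open", "labels": ["bug", "ui"], "assignee": None, "per_page": 50}
)
print(params)  # state=open&labels=bug%2Cui&per_page=50
```

Appended to the issues endpoint of a repository (`/repos/{owner}/{repo}/issues` in the GitHub REST API), a string like this selects the matching issues.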

property url: str#

Create URL for GitHub API.
