Skip to main content

Airbyte Question Answering

This notebook shows how to do question answering over structured data, in this case using the AirbyteStripeLoader.

Vectorstores often have a hard time answering questions that requires computing, grouping and filtering structured data so the high level idea is to use a pandas dataframe to help with these types of questions.

  1. Load data from Stripe using Airbyte. user the record_handler paramater to return a JSON from the data loader.
import os

import pandas as pd
from langchain.agents import AgentType, create_pandas_dataframe_agent
from langchain_community.document_loaders.airbyte import AirbyteStripeLoader
from langchain_openai import ChatOpenAI

stream_name = "customers"
config = {
"client_secret": os.getenv("STRIPE_CLIENT_SECRET"),
"account_id": os.getenv("STRIPE_ACCOUNT_D"),
"start_date": "2023-01-20T00:00:00Z",

def handle_record(record: dict, _id: str):

loader = AirbyteStripeLoader(
data = loader.load()
  1. Pass the data to pandas dataframe.
df = pd.DataFrame(data)
  1. Pass the dataframe df to the create_pandas_dataframe_agent and invoke
agent = create_pandas_dataframe_agent(
ChatOpenAI(temperature=0, model="gpt-4"),
  1. Run the agent
output ="How many rows are there?")