Tensorlake

Tensorlake is the AI Data Cloud that reliably transforms data from unstructured sources into ingestion-ready formats for AI Applications.

The langchain-tensorlake package integrates Tensorlake with LangChain, so you can build document-processing agents that use Tensorlake's parsing features, such as signature detection.

Tensorlake feature overview

Tensorlake gives you tools to:

  • Extract: Schema-driven structured data extraction to pull out specific fields from documents.
  • Parse: Convert documents to markdown to build RAG/Knowledge Graph systems.
  • Orchestrate: Build programmable workflows for large-scale ingestion and enrichment of Documents, Text, Audio, Video and more.

Learn more at docs.tensorlake.ai


Installation

pip install -U langchain-tensorlake

Examples

Follow a full tutorial on how to detect signatures in unstructured documents using the langchain-tensorlake tool.

Or check out this colab notebook for a quick start.


Quick Start

1. Set up your environment

Configure credentials for Tensorlake and OpenAI by setting the following environment variables:

export TENSORLAKE_API_KEY="your-tensorlake-api-key"
export OPENAI_API_KEY="your-openai-api-key"

Get your Tensorlake API key from the Tensorlake Cloud Console. New users get 100 free credits.
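
Before running the agent, it can help to confirm that both variables are visible to the Python process. The following is a minimal sketch using only the standard library; the helper name `missing_credentials` is hypothetical and is not part of langchain-tensorlake.

```python
import os

# The two variables exported in the step above.
REQUIRED_VARS = ("TENSORLAKE_API_KEY", "OPENAI_API_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Both API keys are set.")
```

If a name is reported as missing, re-run the corresponding `export` line in the same shell session that launches Python.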

2. Import necessary packages

from langchain_tensorlake import document_markdown_tool
from langgraph.prebuilt import create_react_agent
import asyncio
import os

3. Build a Signature Detection Agent

async def main(question):
    # Create the agent with the Tensorlake tool
    agent = create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[document_markdown_tool],
        prompt=(
            """
            I have a document that needs to be parsed. \n\nPlease parse this document and answer the question about it.
            """
        ),
        name="real-estate-agent",
    )

    # Run the agent
    result = await agent.ainvoke({"messages": [{"role": "user", "content": question}]})

    # Print the result
    print(result["messages"][-1].content)

Note: We highly recommend using an OpenAI model as the agent model so that the agent sets the right parsing parameters.

4. Example Usage

# Define the path to the document to be parsed
path = "path/to/your/document.pdf"

# Define the question to be asked and create the agent
question = f"What contextual information can you extract about the signatures in my document found at {path}?"

if __name__ == "__main__":
    asyncio.run(main(question))
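
To ask the same question about several documents, one option is to build the question per path and await the agent once for each. The sketch below is an illustration only: `build_question` and `ask_about_documents` are hypothetical helpers, and `ask` stands for any async callable that takes a question string, such as the `main` function from step 3.

```python
import asyncio


def build_question(path):
    """Build the signature question from the example above for one document path."""
    return (
        "What contextual information can you extract about the "
        f"signatures in my document found at {path}?"
    )


async def ask_about_documents(ask, paths):
    """Await `ask(question)` once per path, in order.

    Returns the list of questions that were sent, which is handy for logging.
    """
    questions = []
    for path in paths:
        question = build_question(path)
        await ask(question)
        questions.append(question)
    return questions
```

With the agent from step 3 defined, this could be run as `asyncio.run(ask_about_documents(main, paths))`, where `paths` is a list of your own document paths. Documents are processed one at a time here, which keeps the printed answers in the same order as the input paths.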

Need help?

Reach out to us on Slack or directly through the package repository on GitHub.