
Baseten provides all the infrastructure you need to deploy and serve ML models performantly, scalably, and cost-efficiently.

As a model inference platform, Baseten is a Provider in the LangChain ecosystem. The Baseten integration currently implements a single Component, LLMs, but more are planned!

Baseten lets you run open source models like Llama 2 or Mistral as well as proprietary or fine-tuned models, all on dedicated GPUs. If you're used to a provider like OpenAI, using Baseten has a few differences:

  • Rather than paying per token, you pay per minute of GPU used.
  • Every model on Baseten uses Truss, our open-source model packaging framework, for maximum customizability.
  • While we have some OpenAI ChatCompletions-compatible models, you can define your own I/O spec with Truss.

You can learn more about Baseten in our docs or read on for LangChain-specific info.

Setup: LangChain + Baseten

You'll need two things to use Baseten models with LangChain:

  • A Baseten account
  • An API key

Export your API key as an environment variable called BASETEN_API_KEY:

```sh
export BASETEN_API_KEY="paste_your_api_key_here"
```
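Before constructing the LLM, you can confirm the variable is actually visible to your Python process. A minimal stdlib check (not part of the integration itself):

```python
import os

# The Baseten LLM reads its credentials from this environment variable,
# so verify it is set before building any chains.
key_set = "BASETEN_API_KEY" in os.environ
if not key_set:
    print("BASETEN_API_KEY is not set; export it before using the integration")
```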

Component guide: LLMs

Baseten integrates with LangChain through the LLM component, which provides a standardized and interoperable interface for models that are deployed on your Baseten workspace.

You can deploy foundation models like Mistral and Llama 2 with one click from the Baseten model library, or deploy your own model with Truss.

In this example, we'll work with Mistral 7B. Deploy Mistral 7B here and follow along with the deployed model's ID, found in the model dashboard.

To use this module, you must:

  • Export your Baseten API key as the environment variable BASETEN_API_KEY
  • Get the model ID for your model from your Baseten dashboard
  • Identify the model deployment ("production" for all model library models)

Learn more about model IDs and deployments.

Production deployment (standard for model library models)

```python
from langchain_community.llms import Baseten

mistral = Baseten(model="MODEL_ID", deployment="production")
mistral.invoke("What is the Mistral wind?")
```

Development deployment

```python
from langchain_community.llms import Baseten

mistral = Baseten(model="MODEL_ID", deployment="development")
mistral.invoke("What is the Mistral wind?")
```

Other published deployment

```python
from langchain_community.llms import Baseten

mistral = Baseten(model="MODEL_ID", deployment="DEPLOYMENT_ID")
mistral.invoke("What is the Mistral wind?")
```
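Because the Baseten LLM implements LangChain's standard interface, it composes with other components such as prompt templates via the pipe operator. A minimal sketch, assuming a deployed model with ID `MODEL_ID` and the `langchain-core`/`langchain-community` packages installed (the guards let the snippet degrade gracefully when they are not):

```python
import os

try:
    from langchain_community.llms import Baseten
    from langchain_core.prompts import PromptTemplate
except ImportError:
    # langchain-community / langchain-core not installed in this environment
    Baseten = None

if Baseten is not None and os.environ.get("BASETEN_API_KEY"):
    # Compose a prompt template with the deployed model; the prompt's output
    # is fed to the LLM when the chain is invoked.
    prompt = PromptTemplate.from_template("Answer in one sentence: {question}")
    mistral = Baseten(model="MODEL_ID", deployment="production")
    chain = prompt | mistral
    print(chain.invoke({"question": "What is the Mistral wind?"}))
```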

Streaming LLM output, chat completions, embeddings models, and more are all supported on the Baseten platform and are coming soon to our LangChain integration. Contact us with any questions about using Baseten with LangChain.