This page covers how to use the
GPT4All wrapper within LangChain. The tutorial is divided into two parts: installation and setup, followed by usage with an example.
Installation and Setup#
Install the Python package with
pip install pyllamacpp
Download a GPT4All model and place it in your desired directory
To use the GPT4All wrapper, you need to provide the path to the pre-trained model file and the model’s configuration.
from langchain.llms import GPT4All # Instantiate the model. Callbacks support token-wise streaming model = GPT4All(model="./models/gpt4all-model.bin", n_ctx=512, n_threads=8) # Generate text response = model("Once upon a time, ")
You can also customize the generation parameters, such as n_predict, temp, top_p, top_k, and others.
To stream the model’s predictions, add in a CallbackManager.
from langchain.llms import GPT4All from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler # There are many CallbackHandlers supported, such as # from langchain.callbacks.streamlit import StreamlitCallbackHandler callbacks = [StreamingStdOutCallbackHandler()] model = GPT4All(model="./models/gpt4all-model.bin", n_ctx=512, n_threads=8) # Generate text. Tokens are streamed through the callback manager. model("Once upon a time, ", callbacks=callbacks)
You can find links to model file downloads in the pyllamacpp repository.
For a more detailed walkthrough of this, see this notebook