Ollama

Ollama allows you to run open-source large language models, such as Llama 2, locally.

Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.

It optimizes setup and configuration details, including GPU usage.

For a complete list of supported models and model variants, see the Ollama model library.

Setup

First, follow these instructions to set up and run a local Ollama instance:

  • Download and install Ollama on one of the supported platforms (including Windows Subsystem for Linux)
  • Fetch an available LLM via ollama pull <name-of-model>
    • View the list of available models in the model library
    • e.g., ollama pull llama3
  • This downloads the default tagged version of the model. Typically, the default points to the latest, smallest parameter-size variant.

On macOS, models are downloaded to ~/.ollama/models

On Linux (or WSL), models are stored at /usr/share/ollama/.ollama/models

  • Specify an exact version of a model with its tag, e.g., ollama pull vicuna:13b-v1.5-16k-q4_0 (see the available tags for the Vicuna model in this instance)
  • To view all pulled models, use ollama list
  • To chat directly with a model from the command line, use ollama run <name-of-model>
  • View the Ollama documentation for more commands, or run ollama help in the terminal to see the available commands.

Usage

You can see a full list of supported parameters on the API reference page.
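For example, generation parameters such as temperature, top_p, and the context window size num_ctx can be passed when constructing the wrapper. A minimal sketch (the values below are illustrative, not recommendations):

from langchain_community.llms import Ollama

llm = Ollama(
    model="llama3",
    temperature=0.2,  # lower temperature for more deterministic output
    num_ctx=4096,     # context window size, in tokens
    top_p=0.9,        # nucleus sampling threshold
)

print(llm.invoke("In one sentence, why is the sky blue?"))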

If you are using a Llama chat model (e.g., ollama pull llama3), you can use the ChatOllama interface.

This interface includes the special tokens for the system message and user input.
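A minimal sketch of the chat interface (assuming a locally pulled llama3 model; the messages are illustrative):

from langchain_community.chat_models import ChatOllama
from langchain_core.messages import HumanMessage, SystemMessage

chat = ChatOllama(model="llama3")

# The chat wrapper handles the model's system / user message formatting.
messages = [
    SystemMessage(content="You are a concise assistant."),
    HumanMessage(content="Why is the sky blue?"),
]

response = chat.invoke(messages)
print(response.content)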

Interacting with Models

Here are a few ways to interact with pulled local models:

directly in the terminal

  • All of your local models are automatically served on localhost:11434
  • Run ollama run <name-of-model> to start interacting via the command line directly

via an API

Send an application/json request to Ollama's API endpoint to interact with a model.

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'

See the Ollama API documentation for all endpoints.
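The same endpoint can also be called from Python, for example with the requests library. A minimal sketch (assuming a local server with llama3 pulled; streaming is disabled so a single JSON object is returned):

import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Why is the sky blue?",
        "stream": False,  # return one JSON object instead of a stream of chunks
    },
)
response.raise_for_status()
print(response.json()["response"])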

via LangChain

Here is a typical, basic example of using an Ollama model in your LangChain application.

from langchain_community.llms import Ollama

llm = Ollama(model="llama3")

llm.invoke("Tell me a joke")

"Here's one:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!\n\nHope that made you smile! Do you want to hear another one?"

To stream tokens, use the .stream(...) method:

query = "Tell me a joke"

for chunks in llm.stream(query):
    print(chunks)


Sure, here's one:

Why don't scientists trust atoms?

Because they make up everything!

I hope you found that amusing! Do you want to hear another one?

To learn more about the LangChain Expression Language (LCEL) and the methods available on an LLM, see the LCEL Interface documentation.
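For illustration, here is a minimal sketch of composing the Ollama wrapper into an LCEL chain with a prompt template and an output parser (the prompt and topic are placeholders):

from langchain_community.llms import Ollama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

llm = Ollama(model="llama3")
prompt = ChatPromptTemplate.from_template("Tell me a short joke about {topic}")

# The | operator pipes the prompt into the model and then into the parser.
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"topic": "atoms"}))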

Multi-modal

Ollama has support for multi-modal LLMs, such as bakllava and llava.

ollama pull bakllava

Be sure to update Ollama to the most recent version to support multi-modal models.

from langchain_community.llms import Ollama

bakllava = Ollama(model="bakllava")

import base64
from io import BytesIO

from IPython.display import HTML, display
from PIL import Image


def convert_to_base64(pil_image):
    """
    Convert PIL images to Base64 encoded strings

    :param pil_image: PIL image
    :return: Re-sized Base64 string
    """
    buffered = BytesIO()
    pil_image.save(buffered, format="JPEG")  # You can change the format if needed
    img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
    return img_str


def plt_img_base64(img_base64):
    """
    Display base64 encoded string as image

    :param img_base64: Base64 string
    """
    # Create an HTML img tag with the base64 string as the source
    image_html = f'<img src="data:image/jpeg;base64,{img_base64}" />'
    # Display the image by rendering the HTML
    display(HTML(image_html))


file_path = "../../../static/img/ollama_example_img.jpg"
pil_image = Image.open(file_path)
image_b64 = convert_to_base64(pil_image)
plt_img_base64(image_b64)

llm_with_image_context = bakllava.bind(images=[image_b64])
llm_with_image_context.invoke("What is the dollar based gross retention rate:")
'90%'
