Friendli

Friendli enhances AI application performance and optimizes cost savings with scalable, efficient deployment options, tailored for high-demand AI workloads.

This tutorial guides you through integrating Friendli with LangChain.

Setup

Ensure the langchain_community and friendli-client are installed.

pip install -U langchain-community friendli-client

import getpass
import os

if "FRIENDLI_TOKEN" not in os.environ:
    os.environ["FRIENDLI_TOKEN"] = getpass.getpass("Friendi Personal Access Token: ")

You can initialize a Friendli chat model with selecting the model you want to use.
The default model is meta-llama-3.1-8b-instruct. You can check the available models at friendli.ai/docs.

from langchain_community.llms.friendli import Friendli

llm = Friendli(model="meta-llama-3.1-8b-instruct", max_tokens=100, temperature=0)

Usage

Frienli supports all methods of LLM including async APIs.

You can use functionality of invoke, batch, generate, and stream.

llm.invoke("Tell me a joke.")

" I need a laugh.\nHere's one: Why couldn't the bicycle stand up by itself?\nBecause it was two-tired!\nI hope that made you laugh! Do you want to hear another one? I have a million of 'em! (Okay, maybe not a million, but I have a few more where that came from!) What kind of joke are you in the mood for? A pun, a play on words, or something else? Let me know and I'll try to come"

llm.batch(["Tell me a joke.", "Tell me a joke."])

[" I need a laugh.\nHere's one: Why couldn't the bicycle stand up by itself?\nBecause it was two-tired!\nI hope that made you laugh! Do you want to hear another one? I have a million of 'em! (Okay, maybe not a million, but I have a few more where that came from!) What kind of joke are you in the mood for? A pun, a play on words, or something else? Let me know and I'll try to come",
 " I need a laugh.\nHere's one: Why couldn't the bicycle stand up by itself?\nBecause it was two-tired!\nI hope that made you laugh! Do you want to hear another one? I have a million of 'em! (Okay, maybe not a million, but I have a few more where that came from!) What kind of joke are you in the mood for? A pun, a play on words, or something else? Let me know and I'll try to come"]

llm.generate(["Tell me a joke.", "Tell me a joke."])

LLMResult(generations=[[Generation(text=" I need a laugh.\nHere's one: Why couldn't the bicycle stand up by itself?\nBecause it was two-tired!\nI hope that made you laugh! Do you want to hear another one? I have a million of 'em! (Okay, maybe not a million, but I have a few more where that came from!) What kind of joke are you in the mood for? A pun, a play on words, or something else? Let me know and I'll try to come")], [Generation(text=" I need a laugh.\nHere's one: Why couldn't the bicycle stand up by itself?\nBecause it was two-tired!\nI hope that made you laugh! Do you want to hear another one? I have a million of 'em! (Okay, maybe not a million, but I have a few more where that came from!) What kind of joke are you in the mood for? A pun, a play on words, or something else? Let me know and I'll try to come")]], llm_output={'model': 'meta-llama-3.1-8b-instruct'}, run=[RunInfo(run_id=UUID('ee97984b-6eab-4d40-a56f-51d6114953de')), RunInfo(run_id=UUID('cbe501ea-a20f-4420-9301-86cdfcf898c0'))], type='LLMResult')

for chunk in llm.stream("Tell me a joke."):
    print(chunk, end="", flush=True)

You can also use all functionality of async APIs: ainvoke, abatch, agenerate, and astream.

await llm.ainvoke("Tell me a joke.")

" I need a laugh.\nHere's one: Why couldn't the bicycle stand up by itself?\nBecause it was two-tired!\nI hope that made you laugh! Do you want to hear another one? I have a million of 'em! (Okay, maybe not a million, but I have a few more where that came from!) What kind of joke are you in the mood for? A pun, a play on words, or something else? Let me know and I'll try to come"

await llm.abatch(["Tell me a joke.", "Tell me a joke."])

[" I need a laugh.\nHere's one: Why couldn't the bicycle stand up by itself?\nBecause it was two-tired!\nI hope that made you laugh! Do you want to hear another one? I have a million of 'em! (Okay, maybe not a million, but I have a few more where that came from!) What kind of joke are you in the mood for? A pun, a play on words, or something else? Let me know and I'll try to come",
 " I need a laugh.\nHere's one: Why couldn't the bicycle stand up by itself?\nBecause it was two-tired!\nI hope that made you laugh! Do you want to hear another one? I have a million of 'em! (Okay, maybe not a million, but I have a few more where that came from!) What kind of joke are you in the mood for? A pun, a play on words, or something else? Let me know and I'll try to come"]

await llm.agenerate(["Tell me a joke.", "Tell me a joke."])

LLMResult(generations=[[Generation(text=" I need a laugh.\nHere's one: Why couldn't the bicycle stand up by itself?\nBecause it was two-tired!\nI hope that made you laugh! Do you want to hear another one? I have a million of 'em! (Okay, maybe not a million, but I have a few more where that came from!) What kind of joke are you in the mood for? A pun, a play on words, or something else? Let me know and I'll try to come")], [Generation(text=" I need a laugh.\nHere's one: Why couldn't the bicycle stand up by itself?\nBecause it was two-tired!\nI hope that made you laugh! Do you want to hear another one? I have a million of 'em! (Okay, maybe not a million, but I have a few more where that came from!) What kind of joke are you in the mood for? A pun, a play on words, or something else? Let me know and I'll try to come")]], llm_output={'model': 'meta-llama-3.1-8b-instruct'}, run=[RunInfo(run_id=UUID('857bd88e-e68a-46d2-8ad3-4a282c199a89')), RunInfo(run_id=UUID('a6ba6e7f-9a7a-4aa1-a2ac-c8fcf48309d3'))], type='LLMResult')

async for chunk in llm.astream("Tell me a joke."):
    print(chunk, end="", flush=True)

LLM conceptual guide
LLM how-to guides

Setup​

Usage​

Related​

Setup

Usage

Related