count_tokens_approximately#

langchain_core.messages.utils.count_tokens_approximately(messages: Iterable[BaseMessage | list[str] | tuple[str, str] | str | dict[str, Any]], *, chars_per_token: float = 4.0, extra_tokens_per_message: float = 3.0, count_name: bool = True) int[source]#

Approximate the total number of tokens in messages.

The token count includes stringified message content, role, and (optionally) name. - For AI messages, the token count also includes stringified tool calls. - For tool messages, the token count also includes the tool call ID.

Parameters:
  • messages (Iterable[BaseMessage | list[str] | tuple[str, str] | str | dict[str, Any]]) – List of messages to count tokens for.

  • chars_per_token (float) – Number of characters per token to use for the approximation. Default is 4 (one token corresponds to ~4 chars for common English text). You can also specify float values for more fine-grained control. See more here: https://platform.openai.com/tokenizer

  • extra_tokens_per_message (float) – Number of extra tokens to add per message. Default is 3 (special tokens, including beginning/end of message). You can also specify float values for more fine-grained control. See more here: openai/openai-cookbook

  • count_name (bool) – Whether to include message names in the count. Enabled by default.

Returns:

Approximate number of tokens in the messages.

Return type:

int

Note

This is a simple approximation that may not match the exact token count used by specific models. For accurate counts, use model-specific tokenizers.

Warning

This function does not currently support counting image tokens.

Added in version 0.3.46.