
Writer Text Splitter

This notebook provides a quick overview for getting started with Writer's text splitter.

Writer's context-aware splitting endpoint splits long documents (up to 4000 words) intelligently. Unlike simple character-based splitting, it preserves semantic meaning and context between chunks, which keeps long-form content coherent after splitting. The langchain-writer package exposes this endpoint as a LangChain text splitter.
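Because the endpoint accepts documents of up to 4000 words, longer inputs need to be divided before they are sent. The following is a minimal sketch of one way to do that, grouping paragraphs into batches that stay under the limit. The helper names `word_count` and `batch_paragraphs` are hypothetical, and counting words by whitespace is an assumption; Writer may count words differently.

```python
MAX_WORDS = 4000  # limit stated in this notebook; the counting rule below is an assumption


def word_count(text: str) -> int:
    """Count whitespace-separated words (an approximation)."""
    return len(text.split())


def batch_paragraphs(text: str, max_words: int = MAX_WORDS) -> list[str]:
    """Group paragraphs into batches of at most max_words words each.

    A single paragraph longer than max_words is kept as its own batch
    and will still exceed the limit.
    """
    batches: list[str] = []
    current: list[str] = []
    current_words = 0
    for paragraph in text.split("\n\n"):
        if not paragraph.strip():
            continue  # skip empty paragraphs
        n = word_count(paragraph)
        # Start a new batch when adding this paragraph would pass the limit.
        if current and current_words + n > max_words:
            batches.append("\n\n".join(current))
            current, current_words = [], 0
        current.append(paragraph)
        current_words += n
    if current:
        batches.append("\n\n".join(current))
    return batches
```

Each returned batch could then be passed to the splitter separately. Batching on paragraph boundaries is a design choice that avoids cutting sentences in half before the context-aware split runs.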

Overview

Integration details

| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |
| --- | --- | --- | --- | --- | --- | --- |
| WriterTextSplitter | langchain-writer | ❌ | ❌ | ❌ | PyPI - Downloads | PyPI - Version |

Setup

The WriterTextSplitter is available in the langchain-writer package:

%pip install --quiet -U langchain-writer

Credentials

Sign up for Writer AI Studio to generate an API key (you can follow this Quickstart). Then, set the WRITER_API_KEY environment variable:

import getpass
import os

if not os.getenv("WRITER_API_KEY"):
    os.environ["WRITER_API_KEY"] = getpass.getpass("Enter your Writer API key: ")

Setting up LangSmith for observability is optional but helpful. To do so, set the LANGSMITH_TRACING and LANGSMITH_API_KEY environment variables:

# os.environ["LANGSMITH_TRACING"] = "true"
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass()

Instantiation

Create an instance of WriterTextSplitter with the strategy parameter set to one of the following:

  • llm_split: Uses a language model for precise semantic splitting
  • fast_split: Uses a heuristic-based approach for quick splitting
  • hybrid_split: Combines both approaches

from langchain_writer.text_splitter import WriterTextSplitter

splitter = WriterTextSplitter(strategy="fast_split")
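When the strategy name comes from configuration or user input, it can help to check it before constructing the splitter. The sketch below validates a name against the three strategies listed above. The `Strategy` alias and `check_strategy` helper are hypothetical and not part of langchain-writer.

```python
from typing import Literal, get_args

# Strategy names as listed in this notebook.
Strategy = Literal["llm_split", "fast_split", "hybrid_split"]
VALID_STRATEGIES = get_args(Strategy)


def check_strategy(name: str) -> str:
    """Return name unchanged if it is a known strategy, else raise ValueError."""
    if name not in VALID_STRATEGIES:
        raise ValueError(
            f"Unknown strategy {name!r}; expected one of {', '.join(VALID_STRATEGIES)}"
        )
    return name
```

The checked value can be passed straight through, for example `WriterTextSplitter(strategy=check_strategy("fast_split"))`.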

Usage

The WriterTextSplitter can be used synchronously or asynchronously.

Synchronous usage

To use the WriterTextSplitter synchronously, call the split_text method with the text you want to split:

text = """Reeeeeeeeeeeeeeeeeeeeeaally long text you want to divide into smaller chunks. For example you can add a poem multiple times:
Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.
"""

chunks = splitter.split_text(text)
chunks

Print the length of the returned list to see how many chunks were created:

print(len(chunks))
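Beyond the chunk count, it is often useful to see how large each chunk is and where it starts. The following sketch assumes `split_text` returns a list of strings, as the `len(chunks)` call above suggests. The `summarize_chunks` helper is hypothetical.

```python
def summarize_chunks(chunks: list[str], preview_chars: int = 40) -> list[dict]:
    """Return index, word count, and a short preview for each chunk."""
    summary = []
    for i, chunk in enumerate(chunks):
        stripped = chunk.strip()
        # Use the first line of the chunk as a preview, truncated to preview_chars.
        first_line = stripped.splitlines()[0] if stripped else ""
        summary.append(
            {
                "index": i,
                "words": len(chunk.split()),
                "preview": first_line[:preview_chars],
            }
        )
    return summary
```

Calling `summarize_chunks(chunks)` on the result above gives one dictionary per chunk, which makes it easier to compare how the different strategies divide the same text.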

Asynchronous usage

To use the WriterTextSplitter asynchronously, call the asplit_text method with the text you want to split:

async_chunks = await splitter.asplit_text(text)
async_chunks

Print the length of the returned list to see how many chunks were created:

print(len(async_chunks))
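The bare `await` above works in a notebook because a notebook already runs an event loop. In a regular Python script, the coroutine has to be driven explicitly, for example with `asyncio.run`. The sketch below also shows splitting several texts concurrently with `asyncio.gather`. The `split_all` helper and the `EchoSplitter` stand-in are hypothetical; the stand-in only exists so the example runs without an API key, and a real `WriterTextSplitter` instance would take its place.

```python
import asyncio


async def split_all(splitter, texts: list[str]) -> list[list[str]]:
    """Split several texts concurrently via the splitter's asplit_text method."""
    return await asyncio.gather(*(splitter.asplit_text(t) for t in texts))


class EchoSplitter:
    """Hypothetical stand-in so the sketch runs without an API key."""

    async def asplit_text(self, text: str) -> list[str]:
        # Splits on blank lines; the real splitter calls Writer's endpoint instead.
        return [part for part in text.split("\n\n") if part.strip()]


if __name__ == "__main__":
    # Replace EchoSplitter() with a WriterTextSplitter instance in real use.
    results = asyncio.run(split_all(EchoSplitter(), ["a\n\nb", "c"]))
    print([len(chunks) for chunks in results])
```

Concurrent requests may be subject to rate limits on the Writer side, so a large batch might need throttling; that is not handled here.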

API reference

For detailed documentation of all WriterTextSplitter features and configurations, head to the API reference.

Additional resources

You can find information about Writer's models (including costs, context windows, and supported input types) and tools in the Writer docs.

