`rl_chain`

RL (Reinforcement Learning) Chain leverages Vowpal Wabbit (VW) models for reinforcement learning with a context, with the goal of modifying the prompt before the LLM call.

[Vowpal Wabbit](https://vowpalwabbit.org/) provides fast, efficient, and flexible online machine learning techniques for reinforcement learning, supervised learning, and more.
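The idea of learning which value to place into a prompt, given some context, can be illustrated without any dependencies. The sketch below is not Vowpal Wabbit and not this module's API: it swaps in a toy epsilon-greedy contextual bandit, and every name in it (`EpsilonGreedyPicker`, `build_prompt`, the template, the meal names) is hypothetical. It only shows the loop of pick, fill the prompt, score, learn.

```python
import random
from collections import defaultdict


class EpsilonGreedyPicker:
    """Toy contextual bandit (hypothetical): tracks a mean score per (context, action)."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.totals = defaultdict(float)  # sum of scores per (context, action)
        self.counts = defaultdict(int)    # number of scores per (context, action)

    def _mean(self, context, action):
        n = self.counts[(context, action)]
        return self.totals[(context, action)] / n if n else 0.0

    def pick(self, context, actions):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(actions)
        return max(actions, key=lambda a: self._mean(context, a))

    def learn(self, context, action, score):
        self.totals[(context, action)] += score
        self.counts[(context, action)] += 1


def build_prompt(picker, template, context, actions):
    """Pick an action for the context and substitute it into the prompt text."""
    action = picker.pick(context, actions)
    return action, template.format(user=context, meal=action)


picker = EpsilonGreedyPicker(epsilon=0.0)  # no exploration, so the demo is deterministic
meals = ["beef enchiladas", "veggie pizza"]
template = "Recommend {meal} to {user} in one sentence."

# Simulated feedback: the vegetarian user rewards only the veggie option.
for meal in meals:
    picker.learn("vegetarian", meal, 1.0 if meal == "veggie pizza" else 0.0)

chosen, prompt = build_prompt(picker, template, "vegetarian", meals)
print(prompt)
```

In a real setup the score would come from a scorer judging the LLM response, and the prompt would be sent to the LLM after the selection is made.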

Classes

`rl_chain.base.AutoSelectionScorer`	Auto selection scorer.
`rl_chain.base.Embedder`(*args, **kwargs)	Abstract class to represent an embedder.
`rl_chain.base.Event`(inputs[, selected])	Abstract class to represent an event.
`rl_chain.base.Policy`(**kwargs)	Abstract class to represent a policy.
`rl_chain.base.RLChain`	Chain that leverages the Vowpal Wabbit (VW) model as a learned policy for reinforcement learning.
`rl_chain.base.Selected`()	Abstract class to represent the selected item.
`rl_chain.base.SelectionScorer`	Abstract class to grade the chosen selection or the response of the LLM.
`rl_chain.base.VwPolicy`(model_repo, vw_cmd, ...)	Vowpal Wabbit policy.
`rl_chain.metrics.MetricsTrackerAverage`(step)	Metrics Tracker Average.
`rl_chain.metrics.MetricsTrackerRollingWindow`(...)	Metrics Tracker Rolling Window.
`rl_chain.model_repository.ModelRepository`(folder)	Model Repository.
`rl_chain.pick_best_chain.PickBest`	Chain that leverages the Vowpal Wabbit (VW) model for reinforcement learning with a context, with the goal of modifying the prompt before the LLM call.
`rl_chain.pick_best_chain.PickBestEvent`(...)	Event class for PickBest chain.
`rl_chain.pick_best_chain.PickBestFeatureEmbedder`(...)	Embed the BasedOn and ToSelectFrom inputs into a format that can be used by the learning policy.
`rl_chain.pick_best_chain.PickBestRandomPolicy`(...)	Random policy for PickBest chain.
`rl_chain.pick_best_chain.PickBestSelected`([...])	Selected class for PickBest chain.
`rl_chain.vw_logger.VwLogger`(path)	Vowpal Wabbit custom logger.
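The two metrics trackers listed above are described only by name. As a rough illustration of the difference between an overall average and a rolling-window average of scores, here is a self-contained sketch. The classes `AverageTracker` and `RollingWindowTracker` and their `record` method are hypothetical stand-ins, not the signatures of `MetricsTrackerAverage` or `MetricsTrackerRollingWindow`.

```python
from collections import deque


class AverageTracker:
    """Hypothetical tracker: mean over every score recorded so far."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def record(self, score):
        self.total += score
        self.count += 1

    @property
    def value(self):
        return self.total / self.count if self.count else 0.0


class RollingWindowTracker:
    """Hypothetical tracker: mean over only the most recent `window_size` scores."""

    def __init__(self, window_size):
        # deque with maxlen silently drops the oldest score once full.
        self.window = deque(maxlen=window_size)

    def record(self, score):
        self.window.append(score)

    @property
    def value(self):
        return sum(self.window) / len(self.window) if self.window else 0.0


# A policy that starts badly and then improves.
scores = [0.0, 0.0, 1.0, 1.0]
overall = AverageTracker()
recent = RollingWindowTracker(window_size=2)
for s in scores:
    overall.record(s)
    recent.record(s)

print(overall.value, recent.value)
```

A rolling window reacts faster to recent changes in a policy's quality, while the overall average smooths over the whole history.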

Functions

`rl_chain.base.BasedOn`(anything)	Wrap a value to indicate that the selection should be based on it.
`rl_chain.base.Embed`(anything[, keep])	Wrap a value to indicate that it should be embedded.
`rl_chain.base.EmbedAndKeep`(anything)	Wrap a value to indicate that it should be embedded and kept.
`rl_chain.base.ToSelectFrom`(anything)	Wrap a value to indicate that it should be selected from.
`rl_chain.base.get_based_on_and_to_select_from`(inputs)	Get the BasedOn and ToSelectFrom from the inputs.
`rl_chain.base.parse_lines`(parser, input_str)	Parse the input string into a list of examples.
`rl_chain.base.prepare_inputs_for_autoembed`(inputs)	Prepare the inputs for auto embedding.
`rl_chain.helpers.embed`(to_embed, model[, ...])	Embed the actions or context using the SentenceTransformer model (or a model that has an encode function).
`rl_chain.helpers.embed_dict_type`(item, model)	Embed a dictionary item.
`rl_chain.helpers.embed_list_type`(item, model)	Embed a list item.
`rl_chain.helpers.embed_string_type`(item, model)	Embed a string or an _Embed object.
`rl_chain.helpers.is_stringtype_instance`(item)	Check if an item is a string.
`rl_chain.helpers.stringify_embedding`(embedding)	Convert an embedding to a string.
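The wrapper functions above mark which inputs are context and which are candidates, and the helpers turn embeddings into text. The following is a minimal sketch of that pattern under stated assumptions, not the module's implementation: `BasedOnMarker`, `ToSelectFromMarker`, `split_inputs`, and `stringify` are hypothetical names, and the `index:value` output is assumed here because it matches the feature notation of Vowpal Wabbit's text input format.

```python
from dataclasses import dataclass
from typing import Any, Dict, Sequence, Tuple


@dataclass
class BasedOnMarker:
    """Hypothetical wrapper: marks a value as context for the decision."""
    value: Any


@dataclass
class ToSelectFromMarker:
    """Hypothetical wrapper: marks a value as the set of candidates."""
    value: Any


def split_inputs(inputs: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate wrapped inputs into context and candidates; ignore the rest."""
    based_on = {k: v.value for k, v in inputs.items() if isinstance(v, BasedOnMarker)}
    to_select_from = {
        k: v.value for k, v in inputs.items() if isinstance(v, ToSelectFromMarker)
    }
    return based_on, to_select_from


def stringify(embedding: Sequence[float]) -> str:
    """Render a vector as space-separated `index:value` pairs (assumed format)."""
    return " ".join(f"{i}:{x}" for i, x in enumerate(embedding))


inputs = {
    "user": BasedOnMarker("Tom"),
    "meal": ToSelectFromMarker(["beef enchiladas", "veggie pizza"]),
    "text_to_personalize": "This is the weeks specialty dish.",  # plain, unwrapped input
}
based_on, to_select_from = split_inputs(inputs)
print(based_on, to_select_from)
print(stringify([0.5, 1.0]))
```

Unwrapped inputs pass through untouched in this sketch, so only explicitly marked values take part in the selection.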
