PairwiseStringEvaluator
- class langchain.evaluation.schema.PairwiseStringEvaluator
Compare the output of two models (or two outputs of the same model).
Attributes
requires_input
Whether this evaluator requires an input string.
requires_reference
Whether this evaluator requires a reference label.
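The two flags describe which optional arguments an evaluator needs. A minimal sketch of how a caller could use them to validate arguments before scoring follows. `ToyPairwiseEvaluator` and `check_args` are hypothetical stand-ins, written so the snippet runs without LangChain installed; they are not the library's own classes or checks.

```python
from typing import Optional


class ToyPairwiseEvaluator:
    """Hypothetical stand-in exposing the two documented flags."""

    @property
    def requires_input(self) -> bool:
        # This toy evaluator does not need the original input string.
        return False

    @property
    def requires_reference(self) -> bool:
        # This toy evaluator needs a reference label to compare against.
        return True


def check_args(
    evaluator: ToyPairwiseEvaluator,
    reference: Optional[str] = None,
    input: Optional[str] = None,
) -> None:
    """Raise ValueError if a required argument is missing (illustrative only)."""
    if evaluator.requires_input and input is None:
        raise ValueError("this evaluator requires an input string")
    if evaluator.requires_reference and reference is None:
        raise ValueError("this evaluator requires a reference label")


# Passes silently because the required reference is supplied.
check_args(ToyPairwiseEvaluator(), reference="Paris")
```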
Methods
__init__()
aevaluate_string_pairs(*, prediction, ...[, ...])
Asynchronously evaluate the output string pairs.
evaluate_string_pairs(*, prediction, ...[, ...])
Evaluate the output string pairs.
- __init__()
- async aevaluate_string_pairs(*, prediction: str, prediction_b: str, reference: str | None = None, input: str | None = None, **kwargs: Any) → dict
Asynchronously evaluate the output string pairs.
- Parameters:
prediction (str) – The output string from the first model.
prediction_b (str) – The output string from the second model.
reference (Optional[str], optional) – The expected output / reference string.
input (Optional[str], optional) – The input string.
kwargs (Any) – Additional keyword arguments, such as callbacks and optional reference strings.
- Returns:
A dictionary containing the preference, scores, and/or other information.
- Return type:
dict
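Because the method is a coroutine, several pairs can be evaluated concurrently. The sketch below mirrors the documented keyword-only signature with a hypothetical stand-in class, `ToyAsyncPairwiseEvaluator`, so it runs without LangChain installed. The exact-match rule and the result keys `value` and `score` are assumptions for illustration; this page only says the dictionary contains the preference, scores, and/or other information.

```python
import asyncio
from typing import Any, Optional


class ToyAsyncPairwiseEvaluator:
    """Hypothetical stand-in with the documented async signature."""

    async def aevaluate_string_pairs(
        self,
        *,
        prediction: str,
        prediction_b: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # Toy rule: prefer the prediction that exactly matches the reference.
        a_ok = prediction == reference
        b_ok = prediction_b == reference
        if a_ok == b_ok:
            # Both match or neither matches: report a tie.
            return {"value": None, "score": 0.5}
        return {"value": "A" if a_ok else "B", "score": 1.0 if a_ok else 0.0}


async def main() -> list:
    evaluator = ToyAsyncPairwiseEvaluator()
    pairs = [("4", "5", "4"), ("blue", "red", "red")]
    # Schedule every comparison at once and wait for all results.
    return await asyncio.gather(
        *(
            evaluator.aevaluate_string_pairs(
                prediction=a, prediction_b=b, reference=ref
            )
            for a, b, ref in pairs
        )
    )


results = asyncio.run(main())
```

`asyncio.gather` returns the results in the same order as the input pairs, so each dictionary can be matched back to its pair by position.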
- evaluate_string_pairs(*, prediction: str, prediction_b: str, reference: str | None = None, input: str | None = None, **kwargs: Any) → dict
Evaluate the output string pairs.
- Parameters:
prediction (str) – The output string from the first model.
prediction_b (str) – The output string from the second model.
reference (Optional[str], optional) – The expected output / reference string.
input (Optional[str], optional) – The input string.
kwargs (Any) – Additional keyword arguments, such as callbacks and optional reference strings.
- Returns:
A dictionary containing the preference, scores, and/or other information.
- Return type:
dict
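The leading `*` in the signature makes every argument keyword-only, so `prediction` and `prediction_b` must be passed by name. A minimal sketch of a synchronous call, using a hypothetical stand-in class, `LengthPreferenceEvaluator`, so that it runs without LangChain installed. The "shorter output wins" rule and the result keys `value` and `score` are assumptions, not the library's behavior.

```python
from typing import Any, Optional


class LengthPreferenceEvaluator:
    """Hypothetical stand-in: prefers the shorter of two outputs."""

    def evaluate_string_pairs(
        self,
        *,
        prediction: str,
        prediction_b: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # Toy rule: the shorter string is preferred; equal lengths tie.
        if len(prediction) == len(prediction_b):
            return {"value": None, "score": 0.5}
        a_wins = len(prediction) < len(prediction_b)
        return {"value": "A" if a_wins else "B", "score": 1.0 if a_wins else 0.0}


evaluator = LengthPreferenceEvaluator()
result = evaluator.evaluate_string_pairs(
    prediction="Paris.",
    prediction_b="The capital of France is Paris.",
    input="What is the capital of France?",
)

# Positional arguments are rejected because of the keyword-only marker.
try:
    evaluator.evaluate_string_pairs("Paris.", "Paris, France.")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```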