JsonEqualityEvaluator#

class langchain.evaluation.parsing.base.JsonEqualityEvaluator(operator: Callable | None = None, **kwargs: Any)[source]#
Evaluate whether the prediction is equal to the reference after parsing both as JSON.

This evaluator checks if the prediction, after parsing as JSON, is equal to the reference, which is also parsed as JSON. It does not require an input string.

Parameters:
  • operator (Callable | None) – Custom comparison function applied to the parsed prediction and reference; defaults to plain equality.

  • kwargs (Any) – Additional keyword arguments.

requires_input#

Whether this evaluator requires an input string. Always False.

Type: bool

requires_reference#

Whether this evaluator requires a reference string. Always True.

Type: bool

evaluation_name#

The name of the evaluation metric. Always “parsed_equality”.

Type: str

Examples

>>> evaluator = JsonEqualityEvaluator()
>>> evaluator.evaluate_strings(prediction='{"a": 1}', reference='{"a": 1}')
{'score': True}
>>> evaluator.evaluate_strings(prediction='{"a": 1}', reference='{"a": 2}')
{'score': False}
>>> evaluator = JsonEqualityEvaluator(operator=lambda x, y: x['a'] == y['a'])
>>> evaluator.evaluate_strings(prediction='{"a": 1}', reference='{"a": 1}')
{'score': True}
>>> evaluator.evaluate_strings(prediction='{"a": 1}', reference='{"a": 2}')
{'score': False}
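
Because both strings are parsed before comparison, the default equality check operates on the resulting Python objects rather than the raw text, so it is insensitive to JSON formatting details such as whitespace and key order (a small illustrative example, not from the library's docstring):

>>> evaluator = JsonEqualityEvaluator()
>>> evaluator.evaluate_strings(prediction='{"b": 2, "a": 1}', reference='{ "a": 1, "b": 2 }')
{'score': True}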

Attributes

evaluation_name – The name of the evaluation.

requires_input – Whether this evaluator requires an input string.

requires_reference – Whether this evaluator requires a reference label.

Methods

__init__([operator])

aevaluate_strings(*, prediction[, ...]) – Asynchronously evaluate Chain or LLM output, based on optional input and label.

evaluate_strings(*, prediction[, reference, ...]) – Evaluate Chain or LLM output, based on optional input and label.

__init__(operator: Callable | None = None, **kwargs: Any) → None[source]#
Parameters:
  • operator (Callable | None) – Custom comparison function applied to the parsed prediction and reference; defaults to plain equality.

  • kwargs (Any) – Additional keyword arguments.

Return type: None
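
The operator argument can encode looser notions of equality than exact matching. A minimal sketch using an approximate numeric comparison (the field name and tolerance here are illustrative assumptions, not part of the library):

>>> import math
>>> evaluator = JsonEqualityEvaluator(
...     operator=lambda x, y: math.isclose(x["value"], y["value"], rel_tol=1e-2)
... )
>>> evaluator.evaluate_strings(prediction='{"value": 3.1416}', reference='{"value": 3.14}')
{'score': True}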

async aevaluate_strings(*, prediction: str, reference: str | None = None, input: str | None = None, **kwargs: Any) → dict#

Asynchronously evaluate Chain or LLM output, based on optional input and label.

Parameters:
  • prediction (str) – The LLM or chain prediction to evaluate.

  • reference (Optional[str], optional) – The reference label to evaluate against.

  • input (Optional[str], optional) – The input to consider during evaluation.

  • kwargs (Any) – Additional keyword arguments, including callbacks, tags, etc.

Returns: The evaluation results containing the score or value.

Return type: dict
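
A minimal usage sketch for the async variant, assuming it is called from a plain script outside a running event loop:

>>> import asyncio
>>> evaluator = JsonEqualityEvaluator()
>>> asyncio.run(evaluator.aevaluate_strings(prediction='{"a": 1}', reference='{"a": 1}'))
{'score': True}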

evaluate_strings(*, prediction: str, reference: str | None = None, input: str | None = None, **kwargs: Any) → dict#

Evaluate Chain or LLM output, based on optional input and label.

Parameters:
  • prediction (str) – The LLM or chain prediction to evaluate.

  • reference (Optional[str], optional) – The reference label to evaluate against.

  • input (Optional[str], optional) – The input to consider during evaluation.

  • kwargs (Any) – Additional keyword arguments, including callbacks, tags, etc.

Returns: The evaluation results containing the score or value.

Return type: dict