JsonEqualityEvaluator

class langchain.evaluation.parsing.base.JsonEqualityEvaluator(operator: Callable | None = None, **kwargs: Any)
Evaluate whether the prediction is equal to the reference after parsing both as JSON.

This evaluator parses the prediction and the reference as JSON and checks whether the parsed values are equal. It does not require an input string.
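
The parse-then-compare behaviour described above can be approximated with the standard library alone. The sketch below is an illustration, not the library's implementation: `json_equal` is a hypothetical helper, parsing is done with `json.loads`, and the fallback to `operator.eq` when no operator is supplied is an assumption that mirrors the optional `operator` argument.

```python
import json
from operator import eq
from typing import Any, Callable, Optional


def json_equal(
    prediction: str,
    reference: str,
    operator: Optional[Callable[[Any, Any], bool]] = None,
) -> dict:
    """Parse both strings as JSON, then compare the parsed values."""
    compare = operator or eq  # assumption: plain equality when no operator is given
    return {"score": bool(compare(json.loads(prediction), json.loads(reference)))}


# Key order and whitespace no longer matter once both sides are parsed.
print(json_equal('{"a": 1, "b": [1, 2]}', '{ "b": [1, 2], "a": 1 }'))
print(json_equal('{"a": 1}', '{"a": 2}'))
```

Comparing parsed values rather than raw strings is what makes two differently formatted but equivalent JSON documents count as equal.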

Parameters:
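  • operator (Callable | None) – Optional two-argument callable that receives the parsed prediction and the parsed reference and returns the comparison result. If None, the parsed values are compared for equality.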

  • kwargs (Any)

requires_input

Whether this evaluator requires an input string. Always False.

Type:

bool

requires_reference

Whether this evaluator requires a reference string. Always True.

Type:

bool

evaluation_name

The name of the evaluation metric. Always “parsed_equality”.

Type:

str

Examples

>>> evaluator = JsonEqualityEvaluator()
>>> evaluator.evaluate_strings('{"a": 1}', reference='{"a": 1}')
{'score': True}
>>> evaluator.evaluate_strings('{"a": 1}', reference='{"a": 2}')
{'score': False}
>>> evaluator = JsonEqualityEvaluator(operator=lambda x, y: x['a'] == y['a'])
>>> evaluator.evaluate_strings('{"a": 1}', reference='{"a": 1}')
{'score': True}
>>> evaluator.evaluate_strings('{"a": 1}', reference='{"a": 2}')
{'score': False}
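
A custom operator does not have to be a lambda. As a rough sketch, the hypothetical `approx_equal` function below compares parsed JSON values recursively and tolerates small floating-point differences; the name, the `rel_tol` parameter, and the tolerance value are all assumptions made for illustration. It is exercised here on values parsed with `json.loads` so that it runs without LangChain installed.

```python
import json
import math
from typing import Any


def approx_equal(pred: Any, ref: Any, rel_tol: float = 1e-6) -> bool:
    """Recursively compare parsed JSON values, allowing small float differences."""
    if isinstance(pred, bool) or isinstance(ref, bool):
        return pred is ref  # keep True/False distinct from 1/0
    if isinstance(pred, (int, float)) and isinstance(ref, (int, float)):
        return math.isclose(pred, ref, rel_tol=rel_tol)
    if isinstance(pred, dict) and isinstance(ref, dict):
        return pred.keys() == ref.keys() and all(
            approx_equal(pred[k], ref[k], rel_tol) for k in pred
        )
    if isinstance(pred, list) and isinstance(ref, list):
        return len(pred) == len(ref) and all(
            approx_equal(p, r, rel_tol) for p, r in zip(pred, ref)
        )
    return pred == ref  # strings and null fall back to plain equality


# Exercised on values parsed with json.loads.
print(approx_equal(json.loads('{"x": 0.1}'), json.loads('{"x": 0.1000000001}')))
```

Because it takes the parsed prediction and the parsed reference as its two arguments, a function of this shape could be supplied as `JsonEqualityEvaluator(operator=approx_equal)`, in the same way as the lambda in the examples above.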

Attributes

evaluation_name

The name of the evaluation.

requires_input

Whether this evaluator requires an input string.

requires_reference

Whether this evaluator requires a reference label.

Methods

__init__([operator])

aevaluate_strings(*, prediction[, ...])

Asynchronously evaluate Chain or LLM output, based on optional input and label.

evaluate_strings(*, prediction[, reference, ...])

Evaluate Chain or LLM output, based on optional input and label.

__init__(operator: Callable | None = None, **kwargs: Any) -> None
Parameters:
  • operator (Callable | None)

  • kwargs (Any)

Return type:

None

async aevaluate_strings(*, prediction: str, reference: str | None = None, input: str | None = None, **kwargs: Any) -> dict

Asynchronously evaluate Chain or LLM output, based on optional input and label.

Parameters:
  • prediction (str) – The LLM or chain prediction to evaluate.

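  • reference (Optional[str], optional) – The reference label to evaluate against. This evaluator requires it (requires_reference is True).

  • input (Optional[str], optional) – The input to consider during evaluation. This evaluator does not require it (requires_input is False).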

  • kwargs (Any) – Additional keyword arguments, including callbacks, tags, etc.

Returns:

A dict holding the evaluation result. For this evaluator it carries a boolean under the "score" key, as in the Examples section.

Return type:

dict
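
Since `aevaluate_strings` is a coroutine with keyword-only arguments, several pairs can be scored concurrently. The following is a minimal sketch under stated assumptions: `score_batch` is a hypothetical helper, and `StubJsonEvaluator` is a hypothetical stand-in with the same call signature so that the example runs without LangChain installed. In practice a `JsonEqualityEvaluator` instance would take its place.

```python
import asyncio
import json
from typing import Any, List, Optional, Tuple


class StubJsonEvaluator:
    """Hypothetical stand-in exposing the same keyword-only async signature."""

    async def aevaluate_strings(
        self, *, prediction: str, reference: Optional[str] = None, **kwargs: Any
    ) -> dict:
        return {"score": json.loads(prediction) == json.loads(reference or "null")}


async def score_batch(evaluator: Any, pairs: List[Tuple[str, str]]) -> List[dict]:
    """Evaluate (prediction, reference) pairs concurrently, preserving order."""
    tasks = [
        evaluator.aevaluate_strings(prediction=pred, reference=ref)
        for pred, ref in pairs
    ]
    return await asyncio.gather(*tasks)


pairs = [('{"a": 1}', '{"a": 1}'), ('{"a": 1}', '{"a": 2}')]
results = asyncio.run(score_batch(StubJsonEvaluator(), pairs))
print(results)
```

`asyncio.gather` returns results in the order the awaitables were passed, so each result lines up with its input pair.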

evaluate_strings(*, prediction: str, reference: str | None = None, input: str | None = None, **kwargs: Any) -> dict

Evaluate Chain or LLM output, based on optional input and label.

Parameters:
  • prediction (str) – The LLM or chain prediction to evaluate.

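  • reference (Optional[str], optional) – The reference label to evaluate against. This evaluator requires it (requires_reference is True).

  • input (Optional[str], optional) – The input to consider during evaluation. This evaluator does not require it (requires_input is False).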

  • kwargs (Any) – Additional keyword arguments, including callbacks, tags, etc.

Returns:

A dict holding the evaluation result. For this evaluator it carries a boolean under the "score" key, as in the Examples section.

Return type:

dict
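
The returned dict is easy to aggregate over a dataset. The sketch below computes a pass rate from the "score" key; `pass_rate` and `StubJsonEvaluator` are hypothetical names, with the stub standing in for a real evaluator so the example runs without LangChain. How the real evaluator reports malformed JSON is not stated on this page, so treating a `ValueError` as a failed prediction is an assumption.

```python
import json
from typing import Any, Iterable, Optional, Tuple


class StubJsonEvaluator:
    """Hypothetical stand-in so the sketch runs without LangChain installed."""

    def evaluate_strings(
        self, *, prediction: str, reference: Optional[str] = None, **kwargs: Any
    ) -> dict:
        return {"score": json.loads(prediction) == json.loads(reference or "null")}


def pass_rate(evaluator: Any, pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of pairs whose result dict has a truthy "score"."""
    total = passed = 0
    for prediction, reference in pairs:
        total += 1
        try:
            result = evaluator.evaluate_strings(
                prediction=prediction, reference=reference
            )
        except ValueError:
            # Assumption: malformed JSON surfaces as ValueError
            # (json.JSONDecodeError is a subclass); count it as a failure.
            continue
        passed += bool(result.get("score"))
    return passed / total if total else 0.0


pairs = [('{"a": 1}', '{"a": 1}'), ('{"a": 1}', '{"a": 2}'), ("not json", '{"a": 1}')]
print(pass_rate(StubJsonEvaluator(), pairs))
```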