VertexStringEvaluator

class langchain_google_vertexai.evaluators.evaluation.VertexStringEvaluator(metric: str, **kwargs)

Evaluate the perplexity of a predicted string.

Attributes

`evaluation_name`	The name of the evaluation.
`requires_input`	Whether this evaluator requires an input string.
`requires_reference`	Whether this evaluator requires a reference label.

Methods

`__init__`(metric, **kwargs)
`aevaluate_strings`(*, prediction[, ...])	Asynchronously evaluate Chain or LLM output, based on optional input and label.
`evaluate`(examples, predictions, *[, ...])
`evaluate_strings`(*, prediction[, reference, ...])	Evaluate Chain or LLM output, based on optional input and label.

Parameters:

metric (str)

__init__(metric: str, **kwargs)

Parameters:

metric (str)

async aevaluate_strings(*, prediction: str, reference: str | None = None, input: str | None = None, **kwargs: Any) → dict

Asynchronously evaluate Chain or LLM output, based on optional input and label.

Parameters:

prediction (str) – The LLM or chain prediction to evaluate.
reference (Optional[str], optional) – The reference label to evaluate against.
input (Optional[str], optional) – The input to consider during evaluation.
**kwargs – Additional keyword arguments, including callbacks, tags, etc.

Returns:

The evaluation results containing the score or value.

Return type:

dict
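
A minimal sketch of awaiting `aevaluate_strings` for several predictions at once. `StubEvaluator` and `evaluate_all` are hypothetical names: the stub only mirrors the keyword-only signature documented above so the example runs without Vertex AI credentials, and both the `score` key and the exact-match scoring are assumptions rather than the library's behaviour.

```python
import asyncio
from typing import Any, Optional


class StubEvaluator:
    """Hypothetical stand-in mirroring the documented aevaluate_strings signature."""

    async def aevaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # Assumed result shape: a dict with a "score" key (exact match here).
        return {"score": float(prediction == reference)}


async def evaluate_all(evaluator: Any, pairs: list[tuple[str, str]]) -> list[dict]:
    # Arguments are keyword-only, so each call names prediction and reference.
    tasks = [
        evaluator.aevaluate_strings(prediction=pred, reference=ref)
        for pred, ref in pairs
    ]
    # gather returns results in the same order as the input pairs.
    return await asyncio.gather(*tasks)


results = asyncio.run(
    evaluate_all(StubEvaluator(), [("Paris", "Paris"), ("Rome", "Paris")])
)
```

Any object exposing the same coroutine, presumably including a configured `VertexStringEvaluator`, could be passed in place of the stub.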

evaluate(examples: Sequence[Dict[str, str]], predictions: Sequence[Dict[str, str]], *, question_key: str = 'context', answer_key: str = 'reference', prediction_key: str = 'prediction', instruction_key: str = 'instruction', **kwargs: Any) → List[dict]

Parameters:

examples (Sequence[Dict[str, str]])
predictions (Sequence[Dict[str, str]])
question_key (str)
answer_key (str)
prediction_key (str)
instruction_key (str)
kwargs (Any)

Return type:

List[dict]
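
The signature suggests that `evaluate` reads each record through configurable key names. The sketch below only shapes inputs to match the defaults (`context`, `reference`, `prediction`); `build_batch` is a hypothetical helper, and which keys a given metric actually requires is an assumption this page does not settle.

```python
from typing import Dict, List, Sequence, Tuple


def build_batch(
    rows: Sequence[Tuple[str, str, str]],
    *,
    question_key: str = "context",
    answer_key: str = "reference",
    prediction_key: str = "prediction",
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Split (context, reference, prediction) rows into the two sequences."""
    examples = [{question_key: ctx, answer_key: ref} for ctx, ref, _ in rows]
    predictions = [{prediction_key: pred} for _, _, pred in rows]
    return examples, predictions


examples, predictions = build_batch(
    [("Capital of France?", "Paris", "Paris")]
)
# With a configured evaluator, the call would then be along the lines of:
# evaluator.evaluate(examples, predictions)
```

Keeping the two sequences the same length and in the same order matters if, as the parallel parameters imply, records are paired by position.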

evaluate_strings(*, prediction: str, reference: str | None = None, input: str | None = None, **kwargs: Any) → dict

Evaluate Chain or LLM output, based on optional input and label.

Parameters:

prediction (str) – The LLM or chain prediction to evaluate.
reference (Optional[str], optional) – The reference label to evaluate against.
input (Optional[str], optional) – The input to consider during evaluation.
**kwargs – Additional keyword arguments, including callbacks, tags, etc.

Returns:

The evaluation results containing the score or value.

Return type:

dict
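
A minimal sketch of checking the `requires_reference` and `requires_input` attributes before calling `evaluate_strings`. `StubEvaluator` and `safe_evaluate` are hypothetical; the stub's attribute values, its exact-match scoring, and the `score` key are assumptions chosen so the example runs without Vertex AI access.

```python
from typing import Any, Optional


class StubEvaluator:
    """Hypothetical stand-in exposing the documented attributes and method."""

    evaluation_name = "stub_exact_match"
    requires_input = False
    requires_reference = True

    def evaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # Assumed result shape: a dict with a "score" key.
        return {"score": float(prediction == reference)}


def safe_evaluate(
    evaluator: Any,
    prediction: str,
    reference: Optional[str] = None,
    input: Optional[str] = None,
) -> dict:
    # Check the documented flags before spending a call on missing arguments.
    if evaluator.requires_reference and reference is None:
        raise ValueError(f"{evaluator.evaluation_name} requires a reference")
    if evaluator.requires_input and input is None:
        raise ValueError(f"{evaluator.evaluation_name} requires an input")
    return evaluator.evaluate_strings(
        prediction=prediction, reference=reference, input=input
    )


result = safe_evaluate(StubEvaluator(), "Paris", reference="Paris")
```

Failing early on a missing reference avoids a remote call that could not produce a meaningful score.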