Eval Run

class mixedvoices.evaluation.eval_run.EvalRun(run_id: str, project_id: str, version_id: str, eval_id: str, agent_prompt: str, metric_names: List[str], test_cases: List[str], verbose: bool = True, created_at: int | None = None, eval_agents: List[EvalAgent] | None = None, started: bool = False, ended: bool = False, error: str | None = None, last_updated: int | None = None)

Bases: object

Tracks a single run of an Evaluator

property eval_id: str

Get the id of the Evaluator

property id: str

Get the id of the EvalRun

property info

Get the info of the run as a dictionary

property project_id: str

Get the name of the Project

property results: List[dict]

Returns the results of the run as a list of dictionaries, one per test case

run(agent_class: Type[BaseAgent], agent_starts: bool | None, **kwargs)

Runs the evaluator and saves the results.

Parameters:
  • agent_class (Type[BaseAgent]) – The agent class to evaluate

  • agent_starts (Optional[bool]) – Whether the agent starts the conversation. If True, the agent starts the conversation. If False, the evaluator starts the conversation. If None, the starter is chosen at random.

  • **kwargs – Keyword arguments to pass to the agent class
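
The parameters above can be illustrated with a minimal, self-contained sketch. It does not import mixedvoices and is not the library's implementation: EchoAgent, its respond method, resolve_agent_starts and build_agent are all hypothetical names introduced here. The sketch only shows the documented behaviour, namely that agent_starts is a three-way flag (True, False, or None for a random choice) and that extra keyword arguments are passed on to the agent class.

```python
import random
from typing import Optional, Tuple


class EchoAgent:
    """Hypothetical stand-in for a BaseAgent subclass (not part of mixedvoices)."""

    def __init__(self, greeting: str = "Hello"):
        self.greeting = greeting

    def respond(self, input_text: str) -> Tuple[str, bool]:
        # Assumed interface: return the reply and whether the conversation ended.
        if not input_text:
            return self.greeting, False
        return f"You said: {input_text}", True


def resolve_agent_starts(agent_starts: Optional[bool]) -> bool:
    """Map the three-way flag to a concrete choice; None means pick at random."""
    if agent_starts is None:
        return random.choice([True, False])
    return agent_starts


def build_agent(agent_class, **kwargs):
    # Keyword arguments are forwarded unchanged to the agent class constructor.
    return agent_class(**kwargs)


agent = build_agent(EchoAgent, greeting="Hi, how can I help?")
agent_first = resolve_agent_starts(True)
# When the agent goes first, it is prompted with an empty input in this sketch.
opening, ended = agent.respond("") if agent_first else ("", False)
```

With the real library, the equivalent call would presumably look like run(MyAgent, agent_starts=None, greeting="..."), where MyAgent and greeting are placeholders for your own agent class and its constructor arguments.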

property status

Returns the status of the run as a string

property version_id: str

Get the name of the Version