Automatically evaluate your LLM logs in production, monitor performance and detect issues.
Agent Run or LLM Generation: this is the target to evaluate.
Go to the Online Evals page and click on the + button in the upper right corner of the table.
Create Online Eval
Online Eval Scores Distribution
The Log column will show the error message.
Score creation APIs with all fields exposed.
If your metrics are code-based or combine LLM calls with arithmetic operations, like Ragas, you can
directly use the SDKs to create scores from your application code.
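As a rough illustration of a code-based metric, the sketch below computes a token-level F1 between a model output and a reference, then wraps the result in a score payload. It does not use any real SDK call: `build_score`, `record_score`, the `send` callable, and the payload field names (`name`, `value`, `type`, `step_id`) are hypothetical stand-ins for whatever score creation method your SDK exposes.

```python
from collections import Counter
from typing import Any, Callable, Dict


def token_f1(prediction: str, reference: str) -> float:
    """Code-based metric: token-level F1 between a prediction and a reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def build_score(name: str, value: float, step_id: str) -> Dict[str, Any]:
    """Build a score payload. Field names here are assumptions, not a real schema."""
    return {"name": name, "value": value, "type": "CODE", "step_id": step_id}


def record_score(
    send: Callable[[Dict[str, Any]], Any],
    prediction: str,
    reference: str,
    step_id: str,
) -> Dict[str, Any]:
    """Compute the metric and hand the payload to `send` (hypothetical SDK hook)."""
    payload = build_score("token_f1", token_f1(prediction, reference), step_id)
    send(payload)
    return payload
```

In application code, `send` would be replaced by the SDK's score creation call; passing a plain function such as `list.append` makes the metric easy to test without network access.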
Step or a Generation object.
A Score on a Thread is not well-defined at this stage.