LLM as a Judge: Scaling AI Evaluation Strategies

Zahra Ashktorab
IBM Technology YouTube, 2025
How does AI evaluate its own outputs? Zahra Ashktorab explains how using an LLM as a judge can scale and refine evaluations through strategies such as direct assessment and pairwise comparison. Learn how to address biases such as verbosity bias and positional bias to build accurate, scalable evaluation frameworks.