Leaderboards
Top and bottom performers across all dimensions
Understanding the Scores
Critical Errors
Factually incorrect statements about the film. These are the most serious mistakes.
Critical Omissions
Important information that was completely missing from the response.
Imprecisions
Partially correct or vaguely stated information that could mislead.
Notable Gaps
Minor missing details that would have improved the response.
Weighted Score = ((Critical Errors + Critical Omissions) × 5) + Imprecisions + Notable Gaps
Lower scores indicate more accurate responses. A score of 0 means perfect accuracy.
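As a rough illustration, the formula above can be sketched in Python. The names ErrorCounts, CRITICAL_WEIGHT, and weighted_score are hypothetical and chosen only for this example; only the arithmetic comes from the formula itself.

```python
from dataclasses import dataclass

# Multiplier applied to critical errors and critical omissions (from the formula).
CRITICAL_WEIGHT = 5


@dataclass
class ErrorCounts:
    """Hypothetical container for the four error counts of one response."""
    critical_errors: int = 0
    critical_omissions: int = 0
    imprecisions: int = 0
    notable_gaps: int = 0


def weighted_score(counts: ErrorCounts) -> int:
    """Apply the weighted formula; lower is better, 0 means no errors found."""
    critical = counts.critical_errors + counts.critical_omissions
    return critical * CRITICAL_WEIGHT + counts.imprecisions + counts.notable_gaps


# One critical error, two imprecisions, three notable gaps: (1 + 0) * 5 + 2 + 3
example = ErrorCounts(critical_errors=1, imprecisions=2, notable_gaps=3)
print(weighted_score(example))  # 10
```

A response with no errors of any type scores 0, which matches the "perfect accuracy" case described above.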
AI Model Rankings
Compare models by different scoring methods
Ranked by weighted score: Critical errors and critical omissions count five times as much as imprecisions and notable gaps.
Ranked by average critical errors only. This shows which models make the fewest factual mistakes.
Ranked by total error count (all types combined, unweighted). This treats all errors equally.
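The three ranking methods above can order the same models differently. The sketch below shows this with made-up data; the model names, the counts, and the choice to average each metric per response are assumptions for illustration, not figures or methods taken from the leaderboard.

```python
from statistics import mean

# Hypothetical per-response counts, in the order:
# (critical errors, critical omissions, imprecisions, notable gaps)
results = {
    "model_a": [(0, 1, 2, 0), (0, 0, 1, 1)],
    "model_b": [(1, 0, 0, 0), (0, 0, 0, 1)],
    "model_c": [(1, 0, 1, 0), (1, 0, 0, 1)],
}


def weighted(r):
    """Critical errors and omissions count 5x; the rest count once."""
    ce, co, im, ng = r
    return (ce + co) * 5 + im + ng


def critical_errors_only(r):
    """Only factual mistakes count."""
    return r[0]


def total_errors(r):
    """All error types combined, unweighted."""
    return sum(r)


def rank(metric):
    """Order models from best (lowest average) to worst.

    Averaging per response is an assumption made for this sketch.
    """
    return sorted(results, key=lambda m: mean(metric(r) for r in results[m]))


print(rank(weighted))              # ['model_b', 'model_a', 'model_c']
print(rank(critical_errors_only))  # ['model_a', 'model_b', 'model_c']
print(rank(total_errors))          # ['model_b', 'model_c', 'model_a']
```

In this made-up data, model_a makes no factual mistakes but drops to last place on total error count, which is why the choice of ranking method matters when comparing models.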
Film Accuracy Rankings
Films AI handles most accurately (by weighted score)
Most Challenging Questions
Questions that cause the most AI errors