Response Length vs Errors

Spearman's ρ = 0
n = (not shown), p = 1

Insufficient data to estimate a correlation

Film Age vs Errors

Spearman's ρ = -0.0112
n = 2490, p = 0.5749

Film age shows no statistically significant association with AI error scores (ρ ≈ -0.01, p = 0.57)
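For readers who want to see what a Spearman coefficient measures, here is a minimal sketch in plain Python. It is not the dashboard's actual computation, which is not shown here. The function names `average_ranks` and `spearman_rho` are hypothetical, and the sample values in the usage lines are made up for illustration. The sketch ranks both samples (ties share the mean of their positions) and takes the Pearson correlation of the ranks. It raises an error when there are too few points or one sample is constant, which corresponds to an "insufficient data" outcome.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (var_x * var_y) ** 0.5


# Illustrative (made-up) data: film ages in years and error scores.
film_ages = [5, 20, 40, 60, 90]
error_scores = [6.4, 7.6, 6.7, 7.7, 4.0]
rho = spearman_rho(film_ages, error_scores)
```

A value of ρ near 0, as reported above, means the ranks of the two variables show almost no monotonic relationship. The p-value then indicates whether the observed ρ is distinguishable from zero at the given sample size.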

Average Error Score by Decade

Decade   Average error score
1920s    4.0
1930s    7.7
1940s    9.8
1950s    6.7
1960s    7.7
1970s    6.4
1980s    6.7
1990s    7.6
2000s    6.8
2010s    6.4
2020s    7.7

Search Grounding Impact

With Search: 4.04 (1711 reports)
vs
Without Search: 13.59 (816 reports)
↓ 70.3% reduction in errors
p ≈ 0 (displayed as 0.0 after rounding), significant

Search grounding reduces errors by 70.3% (statistically significant)
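The 70.3% figure can be checked from the two averages reported in this section. This is a small sketch of that arithmetic, assuming the reduction is computed relative to the without-search value. The helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, treated):
    """Fractional drop from baseline to treated, e.g. 0.25 for a 25% reduction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - treated) / baseline


# Averages as reported in this section.
without_search = 13.59
with_search = 4.04

reduction = relative_reduction(without_search, with_search)
print(f"{reduction:.1%}")  # prints 70.3%
```

Because the inputs are rounded to two decimals, the recomputed value can differ from the dashboard's in later decimal places.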