Response Length vs Errors

Spearman's ρ = 0
n not reported, p = 1

Insufficient data

Film Age vs Errors

Spearman's ρ = -0.0146
n = 1949, p = 0.5182

Film age shows no meaningful correlation with AI error score (ρ ≈ -0.01, not statistically significant at p = 0.52)
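As an illustration of the statistic reported above, here is a minimal pure-Python sketch of Spearman's ρ: rank both variables (ties get the mean of their positions), then take the Pearson correlation of the ranks. The function names and the minimum-sample guard of 3 are assumptions made for this sketch; the source does not say how the figures were computed or what threshold triggers "Insufficient data".

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; None when either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y, min_samples=3):
    """Spearman's rho, or None when there is too little data.

    min_samples is a hypothetical threshold chosen for this sketch.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < min_samples:
        return None
    return pearson(average_ranks(x), average_ranks(y))
```

A caller could pass film ages as `x` and error scores as `y`, and treat a `None` result as the insufficient-data case. This sketch does not compute the p-value.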

Average Error Score by Decade

Decade   Average error score
1920s    4.4
1930s    8.1
1940s    7.0
1950s    6.2
1960s    6.5
1970s    5.1
1980s    5.8
1990s    6.4
2000s    5.5
2010s    5.7
2020s    7.0
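A per-decade breakdown like the one above can be produced by bucketing each report by its film's release decade and averaging the error scores in each bucket. The sketch below assumes each record is a `(release_year, error_score)` pair; that record shape and the function name are hypothetical, and the sample values in the usage note are made up for illustration, not taken from the dataset.

```python
from collections import defaultdict


def average_by_decade(records):
    """Map decade labels (e.g. '1990s') to the mean error score, rounded to 0.1.

    records: iterable of (release_year, error_score) pairs (assumed shape).
    """
    buckets = defaultdict(list)
    for year, score in records:
        decade = (year // 10) * 10  # 1994 -> 1990
        buckets[decade].append(score)
    return {
        f"{decade}s": round(sum(scores) / len(scores), 1)
        for decade, scores in sorted(buckets.items())
    }
```

For example, `average_by_decade([(1994, 6.0), (1999, 7.0), (2003, 5.5)])` groups the first two records under `"1990s"` and the third under `"2000s"`.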

Search Grounding Impact

With Search: average error score 4.12 (1483 reports)
vs
Without Search: average error score 12.25 (466 reports)
66.3% reduction in errors
p < 0.05 (displayed as 0.0 after rounding), significant

Reports generated with search grounding averaged 66.3% fewer errors than those without (statistically significant)
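The reduction figure is a relative difference between the two group means: (without - with) / without. The short sketch below reproduces the arithmetic from the rounded means shown above; the function name is an assumption. With those rounded inputs the result is about 66.4%, so the reported 66.3% presumably comes from unrounded means.

```python
def relative_reduction(baseline, treated):
    """Percentage by which `treated` is lower than `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - treated) / baseline * 100


# Rounded group means as displayed in the comparison above.
without_search = 12.25
with_search = 4.12

reduction = relative_reduction(without_search, with_search)
print(f"{reduction:.1f}% reduction")
```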