Comments

Comment by Theresa Barton (theresa-barton) on GPT-4 · 2023-03-22T18:23:57.158Z · LW · GW

I think the performance on AP English might be a quirk of how they dealt with dataset contamination. The English and Literature exams showed an anomalous amount of contamination (lots of the famous texts are online and referenced elsewhere), so they threw out most of the questions, leading to a null conclusion about performance.