I was hoping that he meant some concrete examples but did not elaborate on this due this being letter in magazine and not a blog post. The only thing that comes to my mind in somehow measure unexpected behavior and if bridge some times lead people in circles then it will be definitely cause for concern and reevaluation of used technics.
Well, Evals and that stuff OpenAI did with predicting loss could be a starting point to work in the tables.
But we dont really know, I guess that’s the point EY is trying to make.
I was hoping that he meant some concrete examples but did not elaborate on this due this being letter in magazine and not a blog post. The only thing that comes to my mind in somehow measure unexpected behavior and if bridge some times lead people in circles then it will be definitely cause for concern and reevaluation of used technics.