Your position seems to be that this isn't something worth worrying about or looking into. Can you explain why?
For instance, if the goal is to train predictive systems to provide accurate information, how is 10%, or even 1-2%, label noise "fine" under those conditions (if, say, we could somehow get that number down to 0%)?
It seems like he’s mainly responding to the implication that this means MMLU is “broken”. Label noise can be both suboptimal and also much less important than this post’s title suggests.
I imagine researchers at big labs know this and are correcting these errors as models get good enough for this to matter.
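To put rough numbers on why a few points of label noise can be "suboptimal but not broken", here's a toy calculation (my own illustration, not anything from the post or MMLU's actual grading). It assumes mislabeled questions are scored against the wrong key, and that a model's wrong answers only match that wrong key by chance among the remaining options:

```python
def measured_accuracy(true_acc, noise, n_choices=4):
    """Expected measured score for a model with the given true accuracy
    on a benchmark where a `noise` fraction of gold labels are wrong."""
    # On correctly labeled items, the model is graded normally.
    on_clean = true_acc * (1 - noise)
    # On mislabeled items, only an incorrect answer can match the wrong
    # key, with probability ~1/(n_choices - 1) among the other options.
    on_noisy = noise * (1 - true_acc) / (n_choices - 1)
    return on_clean + on_noisy

# A perfect model is capped below 100% by the noise alone:
ceiling = measured_accuracy(1.0, 0.10)  # 0.90 with 10% noise

# A 2-point true gap between strong models shrinks slightly:
gap = measured_accuracy(0.92, 0.10) - measured_accuracy(0.90, 0.10)
```

Under these assumptions, 10% noise caps a perfect model at 90% and compresses a 2-point gap to about 1.7 points; the ranking of models is mostly preserved, which is consistent with "suboptimal, but not broken".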