In some cases I agree. For example, it doesn’t matter whether GPT4 is a stochastic parrot or capable of deeper reasoning, as long as it is useful for whatever need we have.
Two of the five metrics are about predicting the future, so prediction is an important part of knowing who is right, but I don’t think it is all we need. If we have other factors that also correlate with being correct, why not add those in?
Also, I don’t see where we risk Goodharting. Which of the metrics do you see being gamed without the chance of being correct also increasing significantly?
Why pay mind to what’s correlated with being right, when you have the option of just seeing who’s right?
I’m arguing that being right is the same as “holding greater predictive power”, so any conversation that’s not geared toward “what’s the difference in our predictions?” is not about being right, but rather about something else, like “Do I fit the profile of someone who would be right” / “Am I generally intelligent” / “Am I arguing in good faith” etc.