I’d love to claim credit for helping to boost talk about meta-preferences in the zeitgeist (regular plug for Reducing Goodhart).
But sadly, I think if I had actually been influential, people would be more freaking leery of reifying a “True Utility Function” for humans.