But the NATURAL utility function would reward it for being right on average, I think. We could also have the AI adjust the reward based on how hard the question is for a fixed weaker AI, so it wouldn’t prefer easy questions.
You mean, so that it would generate a parametrized question that maximizes the ratio of reward to computational resources spent?
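To make the exchange concrete, here is a minimal sketch of the difficulty-adjusted reward and the reward-per-compute concern raised in the reply. The function names, the multiplicative form of the adjustment, and the example numbers are all illustrative assumptions, not anything specified in the thread:

```python
def difficulty_adjusted_reward(strong_correct: float, weak_correct: float) -> float:
    """Reward for being right on average, scaled by question difficulty.

    strong_correct: average accuracy of the strong AI on this question
    weak_correct:   average accuracy of a fixed weaker AI on the same question
    """
    difficulty = 1.0 - weak_correct      # easy for the weak AI -> low difficulty
    return strong_correct * difficulty   # easy questions earn almost nothing


def reward_per_compute(reward: float, compute_cost: float) -> float:
    """The failure mode raised in the reply: the question-generator could
    instead pick the parametrized question maximizing reward per unit of
    computational resources spent."""
    return reward / compute_cost


# An easy question (the weak AI gets it right 95% of the time) pays little
# even if the strong AI is always right:
print(difficulty_adjusted_reward(1.0, 0.95))   # 0.05
# A hard question pays much more:
print(difficulty_adjusted_reward(0.8, 0.1))    # 0.72
```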
Sorry, I deleted the post without realizing you replied to it. I realized it had problems and decided to give it more thought for now.