Funny thought—I wonder if people were created (simulated/evolved/whatever) with a reward/utility function that prefers not to know their reward/utility function.
Is the common ugh field around quantifying our motivations (whether anti-economics sentiment, or just punishing those who explain the reasoning behind unpleasant tradeoffs) a mechanism to keep us from Goodharting ourselves?