If Bob cares about cute puppies, then Bob will use his monstrous intelligence to bend the energy of the universe towards cute puppies. And love and flowers and sunrises and babies and cake.
I follow you. It does resolve my question of whether rationality + power necessarily leads to terrible outcomes. I had asked whether a perfect rationalist, given enough time and resources, would become perfectly selfish. I believe I understand the answer as no.
Matt_Simpson gave a similar answer:
Suppose a rational agent has the ability to modify their own utility function (i.e. preferences) - maybe an AI that can rewrite its own source code. Would it do it? Well, only if doing so maximizes that agent’s utility function. In other words, a rational agent will change its utility function if and only if the change maximizes expected utility according to that same utility function.
If Bob’s utility function is puppies, babies, and cake, then he would not change his utility function in favor of a universe without those things. Do I have the right idea now?
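To make that decision rule concrete, here is a minimal sketch in Python (all names are hypothetical, just for illustration): the agent scores both futures with its *current* utility function, so a candidate utility function that would steer the universe away from puppies never wins the comparison.

```python
def expected_outcome(utility_fn, world):
    """Score a predicted world state with a given utility function."""
    return utility_fn(world)

def should_self_modify(current_utility, world_if_switched, world_if_not):
    # Both futures are judged by the CURRENT utility function -- the candidate
    # function gets no say until it has actually been adopted.
    return (expected_outcome(current_utility, world_if_switched)
            > expected_outcome(current_utility, world_if_not))

# Bob's current values: puppies, babies, and cake.
bob_utility = lambda world: (world.get("puppies", 0)
                             + world.get("babies", 0)
                             + world.get("cake", 0))

# Predicted futures: adopting a selfish utility function would lead Bob to
# optimize the universe away from puppies, babies, and cake.
world_if_switched = {"puppies": 0, "babies": 0, "cake": 0}
world_if_not = {"puppies": 10, "babies": 10, "cake": 10}

print(should_self_modify(bob_utility, world_if_switched, world_if_not))  # False
```

Under this toy model, the switch scores zero by Bob's current lights, so he declines to rewrite himself, which is the "if and only if" condition in action.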
Indeed. The equation for terrible outcomes is “rationality + power + asshole” (where ‘asshole’ is defined as the vast majority of utility functions, which value terrible things). The ‘rationality’ part is optional to the extent that you can substitute it with more power. :)