+10 points for spotting the giant cheesecake fallacy in this criticism. This AI has no desire to rule the world.
You’re claiming that we know enough about the AI’s motivations to know that ruling the world will never be an instrumental value for it? (Terminal values are easier to guard against.) If we have an AI that will never make any attempt to rule the world (or strongly influence it, or paternalistically guide it, or other synonyms), then congratulations! You’ve successfully built a harmless AI. But you want it to help design a Friendly AI, knowing that the Friendly AI will intervene in the world? If it agrees to do that, while refusing to intervene in the world in any other way, then it is already Friendly.
I’d need much more carefully thought-out evidence before I could be persuaded that some variant on this plan is a good idea.
If anyone wants to continue the discussion, my email is dragondreaming@gmail.com (Eliezer has asked that we not use this comment section for talking about general AI stuff).