Reading this reminds me of Scott Alexander in his review of “What We Owe the Future”:
But I’m not sure I want to play the philosophy game. Maybe MacAskill can come up with some clever proof that the commitments I list above imply I have to have my eyes pecked out by angry seagulls or something. If that’s true, I will just not do that, and switch to some other set of axioms. If I can’t find any system of axioms that doesn’t do something terrible when extended to infinity, I will just refuse to extend things to infinity. I can always just keep World A with its 5 billion extremely happy people! I like that one! When the friendly AI asks me if I want to switch from World A to something superficially better, I can ask it “tell me the truth, is this eventually going to result in my eyes being pecked out by seagulls?” and if it answers “yes, I have a series of twenty-eight switches, and each one is obviously better than the one before, and the twenty-eighth is this world except your eyes are getting pecked out by seagulls”, then I will just avoid the first switch. I realize that will intuitively feel like leaving some utility on the table—the first step in the chain just looks so much obviously better than the starting point—but I’m willing to make that sacrifice.
You come up with a brilliant simulation argument as to why the AI shouldn’t just do what’s clearly in his best interests. And maybe the AI is neurotic enough to care. But in all probability, for whatever reason, it doesn’t. And it just goes ahead and turns us into paperclips anyway, ignoring a person running behind it saying “bbbbbbut the simulation argument”.
I’m actually very sympathetic to this comment; I even bring it up in the post as one of the most serious potential objections. Everyone else in these comments seems to assume very strongly that the AI will behave optimally, and then reasons about whether the inter-universal trade goes through under that assumption. I think it’s quite plausible that the AI is just not terribly thoughtful about this kind of thing and simply says “Lol, simulations and acausal trade are not real, I don’t see them”, and kills you.
No, it is in the AI’s best interest to keep humans alive, because this gets it more stuff.
Sure it is, if you accept a whole bunch of assumptions. Or it could just not do that.
You said “shouldn’t just do what’s clearly in his best interests”; I was responding to that.