But it’s also relevant that we’re not asking the superintelligence to grant a random wish; we’re asking it for the right to keep something we already have. This seems more easily granted than the random wish, since it doesn’t imply he has to give random amounts of money to everyone.
My preferred analogy would be:
You founded a company that was making $77/year. Bernard launched a hostile takeover, took over the company, then expanded it to make $170 billion/year. You ask him to keep paying you the $77/year as a pension, so that you don’t starve to death.
This seems like a very sympathetic request, such that I expect the real, human Bernard would grant it. I agree this doesn’t necessarily generalize to superintelligences, but that’s Zack’s point—Eliezer should choose a different example.
I interpreted Eliezer as writing from the assumption that the superintelligence(s) in question are in fact not already aligned to maximize whatever it is that humanity needs to survive, but are instead aligned to some other goal(s), which diverge from humanity’s interests once implemented.
He explicitly states that the essay’s point is to shoot down a clumsy counterargument (along the lines of “it wouldn’t cost the ASI a lot to let us live, so we should assume they’d let us live”). So the context (as I interpret it) is that such requests, however sympathetic, have not been ingrained into the ASI’s goals. Using a different example would mean he was discussing something different.
That is, “just because it would make a trivial difference from the ASI’s perspective to let humanity thrive, whereas it would make an existential difference from humanity’s perspective, doesn’t mean ASIs will let humanity thrive”, assuming such conditions aren’t already baked into their decision-making.
I think Eliezer spends so much time working from these premises because he believes 1) that an unaligned ASI is the default outcome of current developments, and 2) that all current attempts at alignment will necessarily fail.