That’s fair (though given the current distribution of people likely to launch the AI, I’m somewhat optimistic that we won’t get such a dystopia). But the people getting confused about that question aren’t asking it because they have such concerns; in my experience, they’re usually asking it because they’re confused way upstream of that.
I disagree. I think they’re concerned about the right thing for the right reasons, and the attempt to swap in a different (if legitimate, and arguably more important) problem instead of addressing their concerns is where a lot of the communication breaks down.
I mean, yes, there is the issue that it doesn’t matter which monkey finds the radioactive banana and drags it home, because that’s going to irradiate the whole tribe anyway. Many people don’t get it, and this confusion is important to point out and resolve.
But once it is resolved, the “but which monkey” question returns. Yes, currently AGI is unalignable. But since we want to align it anyway, and we’re proposing ways to make that happen, what’s our plan for that step? Who’s doing the aligning, what are they putting in the utility function, and why would that not be an eternal-dystopia hellscape that you’d rather burn down the world trying to prevent than let happen?
They see a powerful technology on the horizon, and see people hyping it up as world-changing. They’re immediately concerned about how it’ll be used. That there’s an intermediary step missing – that we’re not actually on track to build the powerful technology, we’re only on track to create a world-ending explosion – doesn’t invalidate the question of how that technology would be used if we got back on track to building it.
And if that concern gets repeatedly and furiously dismissed in favour of “but we can’t even build it, we need to do [whatever] to build it”, that makes the other side feel unheard. And regardless of how effectively you argue that “the current banana-search strategies would only locate radioactive bananas” and that “we need to prioritize avoiding radiation”, they’re going to stop listening in turn.
Okay, yeah, this is a pretty fair response actually. I think I still disagree with the core point (that AI aligned to current people-likely-to-get-AI-aligned-to-them would be extremely bad), but I definitely see where you’re coming from.
Do you actually believe extinction is preferable to rolling the dice on the expected utility (according to your own values) of what happens if one of the current AI org people launches AI aligned to themself?
Even if, in worlds where we get an AI aligned to a set of values that you would like, that AI then acausally pays AI-aligned-to-the-“wrong”-values in different timelines to not run suffering? E.g. Bob’s AI runs a bunch of things Alice would like in Bob’s AI’s timelines, in exchange for Alice’s AI not running things Bob would very strongly dislike.