Okay yeah this is a pretty fair response actually. I think I still disagree with the core point (that AI aligned to current people-likely-to-get-AI-aligned-to-them would be extremely bad) but I definitely see where you’re coming from.
Do you actually believe extinction is preferable to rolling the dice on the expected utility (according to your own values) of what happens if one of the current AI org people launches AI aligned to themself?
Even if, in worlds where we get an AI aligned to a set of values you would like, that AI then acausally pays AIs aligned to the “wrong” values in different timelines not to run suffering? E.g., Bob’s AI runs a bunch of things Alice would like in Bob’s AI’s timelines, in exchange for Alice’s AI not running things Bob would very strongly dislike.