So when we align AI, who do we align it TO?
any person should want it aligned to themself. i want it aligned to me, you want it aligned to you. we can probably expect it to be aligned to whatever engineer or engineers happen to be there when the aligned AI is launched.
which is fine, because they’re probably aligned enough with me or you (cosmopolitan values, CEV which values everyone’s values also getting CEV’d, etc). hopefully.
But that is exactly the point the author of this post is making (and I agree with it). AGI that can be aligned to literally anyone is more dangerous in the presence of bad actors than non-alignable AGI.
Also, “any person should want it aligned to themself” doesn’t really matter unless “any person” can get access to AGI, which would absolutely not be the case, at the very least in the beginning and probably never.