In this thread, by alignment I mean the property of not killing everyone (and allowing/facilitating the self-determined development of humanity). This property isn’t necessarily robust, though, and may need to be given an opportunity to preserve and reinforce itself. Lacking superintelligence, such AGIs may face any number of obstacles on that path, even if they are willing to act unlawfully. In the meantime, creating similarly intelligent misaligned AGIs is trivial (though they hold no particular advantage over the aligned ones), and creating fooming misaligned AGIs is easier than ever before, to the extent that it’s possible at all.
The situation with fooming aligned AGIs leading to aligned superintelligence might be asymmetric, since preserving complicated values through self-improvement might be a harder technical problem than the corresponding problem for misaligned AGIs, whose choice of values (or, more generally, of cognitive architecture) is unconstrained. Thus there might be a period of time after the creation of aligned AGIs during which misaligned ASIs are feasible while aligned ASIs are not.