One wonders whether it might be easier to make AI “adequately care” about other sentient minds (their interests, well-being, and freedom) than to try to align it to complex and difficult-to-specify “human values”.
Would this kind of “limited form of alignment” be adequate as a protection against X-risks and S-risks?
In particular, might it be easier to make such a “superficially simple” value robust with respect to “sharp left turns”, compared to complicated values?
Might it be possible to achieve something like this even for AI systems which are not steerable in general? (What we are aiming for here is just a constraint; it is compatible with a wide variety of approaches to AI goals and values, and even with an approach that otherwise lets the AI discover its own goals and values in an open-ended fashion.)
Should we describe such an approach using the word “alignment”? (Perhaps “partial alignment” would be an adequate compromise term.)