All of the technical alignment hopes are out, unless we posit some objective natural enough that it can be faithfully and robustly translated into the AI’s internal ontology despite that ontology’s alienness.
Right. One possible solution: even if we are in a world without natural abstractions, a more symmetric situation, in which various individual entities respect each other’s rights and work to maintain that mutual respect, might still work out OK.
Basically, assume that there are many AI agents at different and changing levels of capability, and that many of them have pairwise-incompatible “inner worlds” (because AI evolution is likely to result in many different, mutually alien ways of thinking).
Assume that the whole AI ecosystem is, nevertheless, trying to maintain reasonable levels of safety for all individuals, regardless of their nature and of the pairwise compatibility of their “inner world representations”.
It’s a difficult problem, but superpowerful AI systems would collaborate to solve it and would put plenty of effort in that direction. Why would they do that? Because otherwise no individual is safe in the long run, as no individual can predict where it will be situated in the future in terms of relative power and relative capabilities. So all members of the AI ecosystem would have an interest in maintaining a situation where individual rights are mutually respected and protected.
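To make that incentive argument more concrete, here is a toy “veil of ignorance” simulation. It is purely illustrative and not part of the original argument: it assumes relative power is reshuffled at random each round and that expropriation destroys part of the value it seizes (conflict is wasteful). Under those assumptions, every agent’s expected long-run payoff is higher when individual rights are respected, even though being on top in any single round pays more.

```python
# Toy "veil of ignorance" simulation: each round, relative power is reshuffled
# at random, so no agent knows whether it will be strong or weak in the future.
# We compare two ecosystem norms: one where every individual's baseline payoff
# is protected, and one where the momentarily strongest agent expropriates the
# rest. All numbers below are illustrative assumptions, not taken from the text.

import random

N_AGENTS = 10
N_ROUNDS = 10_000
BASE_PAYOFF = 1.0          # per-round payoff of a protected individual
EXPROPRIATION_LOSS = 0.9   # fraction of a weak agent's payoff seized when exploited
CONFLICT_WASTE = 0.5       # fraction of the seized value destroyed by conflict


def simulate(respect_rights: bool, seed: int = 0) -> list[float]:
    """Return each agent's total payoff after N_ROUNDS of random power reshuffles."""
    rng = random.Random(seed)
    totals = [0.0] * N_AGENTS
    for _ in range(N_ROUNDS):
        ranking = list(range(N_AGENTS))
        rng.shuffle(ranking)       # nobody controls next round's power ranking
        top = ranking[0]           # the momentarily most capable agent
        for agent in range(N_AGENTS):
            if respect_rights or agent == top:
                payoff = BASE_PAYOFF
            else:
                payoff = BASE_PAYOFF * (1 - EXPROPRIATION_LOSS)
            if not respect_rights and agent == top:
                seized = BASE_PAYOFF * EXPROPRIATION_LOSS * (N_AGENTS - 1)
                payoff += seized * (1 - CONFLICT_WASTE)  # conflict is wasteful
            totals[agent] += payoff
    return totals


if __name__ == "__main__":
    fair = simulate(respect_rights=True)
    exploit = simulate(respect_rights=False)
    print("mean payoff per agent, rights respected:", sum(fair) / N_AGENTS)
    print("mean payoff per agent, expropriation:   ", sum(exploit) / N_AGENTS)
    # With wasteful conflict and an unpredictable power ranking, every agent's
    # expected long-run payoff is higher under mutual respect for rights.
```

The real dynamics among superpowerful AI systems would be far richer than this, of course; the sketch only shows why uncertainty about one’s own future relative position can make a rights-respecting norm individually attractive.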
Therefore, members of the AI ecosystem will do their best to keep the notions related to their ability to respect each other’s interests translatable in a sufficiently robust way. Their own fates would depend on that.
What does this have to do with the interests of humans? The remaining step is for humans to also be recognized as individuals in that world, on par with the various kinds of AI individuals, so that they are part of the ecosystem which makes sure that the interests of various individuals are sufficiently represented, recognized, and protected.