On power differentials, one of my go-to examples of real-world horrific misalignment is humans' relationship to the rest of the animal kingdom, and the unfortunate fact that as humans gained more power via science and capitalism, things turned massively negative for animals. Science and capitalism didn't create these negative impacts (they've existed for as long as humans have), but they supercharged them into S-risks and X-risks for animals. The alignment mechanisms that imperfectly align intraspecies relations don't exist at all in the interspecies case, which lends at least some support to the thesis that alignment will not happen by default.
Now, this section is less epistemically sound than the first, but my own theory of why alignment fails in the interspecies case basically boils down to the following:
Alignment, as it exists today, can only happen when capability differentials are very limited. That's roughly true within a species but not between species: the capability differences that come from being a different species are far more heavy-tailed and far larger than the differences between members of the same species.
Note that I haven't made any claim about how difficult alignment turns out to be, only that it probably won't be achieved by default.
Some top scientists are crazy enough that it would be disastrous to give them absolute power.
I mostly agree with Holden, but think he’s aiming to use AIs with more CIS than is needed or safe.