I don’t think misaligned AI drives the majority of s-risk (I’m not even sure that s-risk is higher conditioned on misaligned AI), so I’m not convinced that it’s a super relevant communication consideration here.
I’m curious what does drive it, in that case, and what proportion affects humans (both currently-existing people and future minds)? Things like spite threat commitments from a misaligned AI warring with humanity seem like a substantial source of s-risk to me.