It seems like inner misalignment is a subset of “we don’t know how to make aligned AI”. Maybe he could’ve fit that in neatly, but adding more would be at odds with the piece’s function as an intro to AI risk.