Attempt to summarize
The AU landscape naturally leads to competition because many goals imply seeking power, and [A acquiring a lot of power] tends to be in conflict with [B acquiring a lot of power] because, well, the resources only exist once.
The CCC (catastrophic convergence conjecture) argues that, therefore, goals unaligned with ours tend to cause catastrophic consequences if given to a powerful agent. It’s informal for now.
The power framing splits catastrophes into value-specific and objective ones. The former depend on the goals of a particular agent; the latter rest on the instrumental convergence idea, i.e., they lower the AU for goals which are instrumentally convergent (like “stay alive”) and thus lower the AU for lots of different agents with different goals.
AU is probably less fragile than values.
The environment contains information about what we value, and can be seen as an inspiration for AI alignment approaches. These approaches arguably work better in the AU framing as opposed to the classical values framing.
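To make the value-specific vs. objective split a bit more concrete, here is a minimal toy sketch. It is my own illustration, not the formalism from the sequence: I assume a tiny deterministic world given as a graph, treat a goal's AU as the best reward still reachable from the current state, and use made-up names throughout (`au_landscape`, `goals_harmed`, and the `home`/`lab`/`farm`/`library` world are all hypothetical).

```python
from collections import deque


def reachable(graph, start):
    """Return every state reachable from `start` via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in graph.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def au_landscape(graph, start, goals):
    """Toy AU (an assumption): the best reward each goal can still reach."""
    states = reachable(graph, start)
    return {
        name: max(reward.get(s, 0.0) for s in states)
        for name, reward in goals.items()
    }


def goals_harmed(before, after):
    """Names of goals whose AU dropped between two landscapes."""
    return sorted(name for name in before if after[name] < before[name])


# Hypothetical world: three agents, each valuing a different place.
world = {
    "home": ["lab", "farm", "library"],
    "lab": ["home"],
    "farm": ["home"],
    "library": ["home"],
}
goals = {
    "scientist": {"lab": 1.0},
    "farmer": {"farm": 1.0},
    "reader": {"library": 1.0},
}

baseline = au_landscape(world, "home", goals)

# Value-specific event: the library disappears. Only the reader cares.
no_library = {
    state: [n for n in nbrs if n != "library"]
    for state, nbrs in world.items()
    if state != "library"
}
after_fire = au_landscape(no_library, "home", goals)

# Objective event: we end up in a state with no outgoing moves at all,
# which is a loss of power regardless of what we wanted.
trapped_world = dict(world, trapped=[])
after_trap = au_landscape(trapped_world, "trapped", goals)

print(goals_harmed(baseline, after_fire))
print(goals_harmed(baseline, after_trap))
```

In this toy setup, losing the library only lowers the AU of the goal that valued it, whereas getting trapped lowers the AU of every goal at once. That is the sense in which the second event is "objective": it destroys something instrumentally convergent (the ability to move around) rather than something only one set of values cares about.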