There are two other ways for things to go wrong, though:
AI capabilities research switches attention from LLMs (back) to RL. (There was a lot of debate in the early days of IDA about whether it would be competitive with RL, and part of that was about whether all the important tasks we want a highly capable AI to do could be broken down easily enough and well enough.)
The task decomposition part starts working well enough, but Eliezer’s (and others’) concern about “preserving alignment while amplifying capabilities” proves valid.