If an RL agent can’t solve a task, then I’m fine with amplification being unable to solve it.
I guess by “RL agent” you mean RL agents of certain specific designs, such as the one you just blogged about, and not RL agents in general, since as far as we know there aren’t any tasks that RL agents in general can’t solve?
BTW, I find it hard to understand your overall optimism (only 10-20% expected value loss from AI risk). There are so many disjunctive risks standing in the way of even designing an aligned AI that’s competitive with certain kinds of RL agents (such as failing to solve any one of the obstacles you list in the OP). And even if we succeed in doing that, we’d still have to come up with more capable aligned designs that would be competitive with more advanced RL (or other kinds of) agents. Have you explained this optimism somewhere?