Perhaps we should focus on alignment problems that only appear for more powerful systems, as a form of differential technological development. Those problems are harder (will require more thought to solve), and are less economically useful to solve in the near-term.
How do you practically do that? We don’t know what those problems are, and the proposal seems to assume that our present progress, e.g. in Mechanistic Interpretability, doesn’t help at all. Surely such work requires more powerful systems than exist today?
OK, but what is your plan for a positive Singularity? Just putting AGI/ASI off by, say, 1 year doesn’t necessarily give a better outcome at all.