Perhaps we should focus on alignment problems that only appear for more powerful systems, as a form of differential technological development. Those problems are harder (they will require more thought to solve) and are less economically useful to solve in the near term.
How would you do that in practice? We don’t know what those problems are, and the proposal seems to assume that our present progress, e.g. in mechanistic interpretability, doesn’t help at all. Surely such work requires more powerful systems than exist today?