I can’t tell which of two arguments you’re making: that there are unknown unknowns, or that myopia isn’t a complete solution.
This is a good argument that all metrics are Goodhartable, and that if takeover occurs and the AI is incorrigible, that will lock in suboptimal values (i.e., unknown unknowns).
I agree myopia isn’t a complete solution, but it seems better suited to preventing takeover risk than to preventing social media dysfunction? The worst-case objective seems easier to define (“don’t do something nearly all humans really dislike”) than the social media objective (“make the public square function well”).