Again, it seems clear, indeed self-evident and unavoidable to discuss, that even in full alignment with human values, AGI could easily become supremely destructive of both us and itself.
Are humans not self-destructive, despite and often much because of even our highest values & best efforts?
"...ML-based agents developing the capability to seek influence… This would not be a problem if they do so only in the ways that are aligned with human values."
Yikes. How is it possible to make such an assumption when the evidence runs so strongly against it?
Please reconsider that fundamental premise, as it informs all further approaches.