We do not have a scientific understanding of how to tell a superintelligent machine to “solve problem X, without doing something horrible as a side effect”, because we cannot describe mathematically what “something horrible” actually means to us...
This is similar to how utility theory (from von Neumann onward) is excellent science/mathematics despite our inability to state what utility is. AI alignment hopes to tell us how to align AI, not which target to aim for. Choosing the target is also a necessary task, but it is not the focus of the field.
It is not a quote, but a paraphrase of what the OP might agree with about AI security.