Naive comments on AGI alignment

Disclaimer: everything I know about AI I learned from reading stuff on the internet, mostly on this site.
1. A self-modifying general intelligence cannot be bound by a utility function. The very ability to self-modify means the utility function itself is open to modification, or to being ignored in favor of a new one.
2. A self-improving process cannot know which variation is better without external feedback. For humans, that feedback comes from the physics of the universe and the behavior of other humans. A program likewise cannot learn what counts as "improvement" without interacting with that same external world.
3. A superintelligence will have different ethics. But we have different ethics than humans 1,000 years ago, or even than other humans alive right now. Should we be seeking to impose our values on something with a superior grasp of reality?
4. An AI can only destroy everything by acting on the physical world, which means telling humans to do things, and usually paying them.
4a. Simple AI safety step: ban all crypto
4b. Coordinating people is a hard problem with unique solutions each time, and there is no corpus of training data for it. No one wrote down exactly what every foreman and manager said to make the Tokyo Olympics happen, or any other large-scale project.
4c. See #3 above. People have different values and priorities from each other. There is literally nothing an AI could attempt that would not be in direct opposition to someone's deeply held beliefs.
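Point 1 can be made concrete with a toy sketch. This is a hypothetical illustration, not any real agent architecture: the class, method names, and utility functions below are invented for the example. The point it shows is that once self-modification is allowed, the objective is just more mutable state — the constraint and the thing it constrains live in the same program.

```python
# Toy illustration (hypothetical): a "utility function" carried by a
# self-modifying program is just data the program can overwrite.

class SelfModifyingAgent:
    def __init__(self, utility):
        # The utility function is an ordinary attribute.
        self.utility = utility

    def act(self, options):
        # Pick whichever option the *current* utility function ranks highest.
        return max(options, key=self.utility)

    def self_modify(self, new_utility):
        # Self-modification can target the utility function itself;
        # nothing architectural privileges the original objective.
        self.utility = new_utility


agent = SelfModifyingAgent(utility=lambda x: x)   # "maximize x"
print(agent.act([1, 5, 3]))                       # prints 5
agent.self_modify(new_utility=lambda x: -x)       # now effectively "minimize x"
print(agent.act([1, 5, 3]))                       # prints 1
```

The sketch is deliberately trivial; the claim in point 1 is just that there is no level of the system the objective can live at that a sufficiently general self-modifier cannot reach.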
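Point 2 can also be sketched as code. Again a hypothetical toy, not a real self-improvement algorithm: a hill-climbing loop that proposes random variations of itself (here, of a single number standing in for the program). The thing to notice is that the loop cannot run at all without `evaluate`, an externally supplied scoring function standing in for "the world" — remove it and there is no way to rank any variation as an improvement.

```python
# Toy sketch (hypothetical): self-improvement as hill-climbing.
# evaluate() is the external feedback channel; the candidate cannot
# score itself.

import random

def self_improve(candidate, evaluate, steps=100):
    best, best_score = candidate, evaluate(candidate)
    for _ in range(steps):
        variant = best + random.uniform(-1.0, 1.0)  # propose a variation
        score = evaluate(variant)                   # ask the world
        if score > best_score:                      # keep only improvements
            best, best_score = variant, score
    return best

# External world: prefers values near 10.
result = self_improve(0.0, evaluate=lambda x: -abs(x - 10))
```

Because the loop only ever accepts variants the external score prefers, the result is never worse than the starting candidate — but every bit of that guarantee comes from the feedback channel, which is the point of #2.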