So if I’m understanding it correctly, it’s that maintaining human control is the best option we can formally work toward? The One True Encoding of Human Values would most likely be a more reliable system if we could create it, but that it’s a much harder problem, and not strictly necessary for a good end outcome?
The One True Encoding of Human Values would most likely be a more reliable system if we could create it
My best guess is that this is a confused claim and I have trouble saying “yes” or “no” to it, but I do agree with the spirit of it.
but that it’s a much harder problem, and not strictly necessary for a good end outcome?
Yes, in the same way that if you’re worried about mosquitoes giving you malaria, you start by getting a mosquito net or moving to a place without mosquitoes, you don’t immediately try to kill all mosquitoes in the world.
So if I’m understanding it correctly, it’s that maintaining human control is the best option we can formally work toward? The One True Encoding of Human Values would most likely be a more reliable system if we could create it, but that it’s a much harder problem, and not strictly necessary for a good end outcome?
My best guess is that this is a confused claim and I have trouble saying “yes” or “no” to it, but I do agree with the spirit of it.
Yes, in the same way that if you’re worried about mosquitoes giving you malaria, you start by getting a mosquito net or moving to a place without mosquitoes, you don’t immediately try to kill all mosquitoes in the world.