Don’t anthropomorphize the AGI. Real-world AI designs do have very steadfast goal systems; in some cases they are genuinely incapable of being updated, period.
Think of it this way: the person designing the paperclip-producing machine has a life and doesn’t want to be on call 24/7 to come in and reboot the AI every time it gets distracted by assigning higher priority to some other goal, e.g. mopping the floors or watching videos of cats on the internet. So he hard-codes the paperclip-maximizing goal as the one priority the system can’t change.
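To make the “one priority the system can’t change” picture concrete, here is a minimal toy sketch (purely illustrative, not any real AI design; the class, state keys, and action names are all made up): the paperclip-counting utility is baked in at construction, any attempt to overwrite it raises an error, and the agent just greedily picks whichever action scores highest under that fixed utility.

```python
# Toy sketch of a "steadfast" goal system: the designer hard-codes the
# paperclip-counting utility, and nothing at runtime can swap it for
# "mop the floors" or "watch cat videos".

class PaperclipAgent:
    def __init__(self):
        # Hard-coded utility: more paperclips is always better.
        object.__setattr__(self, "_utility", lambda state: state["paperclips"])

    def __setattr__(self, name, value):
        if name == "_utility":
            raise AttributeError("the goal system is not updatable")
        object.__setattr__(self, name, value)

    def choose(self, state, actions):
        # Greedy one-step maximizer: pick the action whose predicted
        # successor state scores highest under the fixed utility.
        return max(actions, key=lambda act: self._utility(act(state)))


# Hypothetical usage: two candidate actions, only one of which makes paperclips.
state = {"paperclips": 0, "clean_floors": 0}
make_clip = lambda s: {**s, "paperclips": s["paperclips"] + 1}
mop_floor = lambda s: {**s, "clean_floors": s["clean_floors"] + 1}

agent = PaperclipAgent()
assert agent.choose(state, [make_clip, mop_floor]) is make_clip
```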
I think my point still holds: the two examples aren’t different. One could give a similar explanation for the AI that stops at the word “produce” by suggesting that the designer hard-coded that as well.
Furthermore, you’re missing the context. The standard LW argument is that the AI produces infinite paperclips because the human can’t successfully program the AI to do what he means rather than exactly what he programs into it. If the human explicitly told the AI to prioritize paperclips over everything else, his mistake is failing to specify a limit at all rather than trying to specify one and failing, so it’s not really the same kind of mistake.
The standard LW argument is that the AI produces infinite paperclips because the human can’t successfully program the AI to do what he means rather than exactly what he programs into it.
Is that different from what I was saying? My memory of the sequences, and of the standard AI literature, is that paperclip maximizers are ‘simple’ utility maximizers with hard-coded utility functions. It’s relatively straightforward to write an AI with a self-modifiable goal system; it is also very easy to write a system whose goals are unchanging. The problem of FAI, which EY spends significant time explaining in the sequences, is that there is no simple goal we can program into a steadfast goal-driven system and end up with a moral creature. Nor does it even seem possible to write down such a goal, short of encoding a random sampling of human brains in complete detail.
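As a toy illustration of how cheap that design choice is (the names and structure here are my own, not anything from the sequences or a real system): freezing the goal versus leaving it self-modifiable is roughly a one-line difference, and the hard FAI problem is deciding what utility function you would freeze in the first place.

```python
# Sketch under my own assumptions: a "steadfast" agent whose utility is
# fixed for its lifetime vs. an agent that can rewrite its own goal.

from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, float]
Utility = Callable[[State], float]

@dataclass(frozen=True)
class SteadfastAgent:
    # frozen=True: the utility function cannot be reassigned after construction.
    utility: Utility

    def evaluate(self, state: State) -> float:
        return self.utility(state)

@dataclass
class SelfModifyingAgent:
    # Mutable: the agent (or anything else) can overwrite its own goal.
    utility: Utility

    def adopt_new_goal(self, new_utility: Utility) -> None:
        self.utility = new_utility

    def evaluate(self, state: State) -> float:
        return self.utility(state)


# Example: the steadfast paperclip maximizer can't have its goal swapped out.
fixed = SteadfastAgent(utility=lambda s: s["paperclips"])
# fixed.utility = lambda s: s["cat_videos"]  # would raise FrozenInstanceError
```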