many of which will allow for satisfaction, while still allowing the AI to kill everyone.
This post is just about alignment of AGI’s behavior with its creator’s intentions, which is what Yoshua Bengio was talking about.
If you wanted to constrain it further, you'd say so in the prompt. But I feel that rigid constraints are probably unhelpful, in the way Asimov's Three Laws of Robotics are: because the First Law forbids allowing a human to come to harm through inaction, anyone could threaten suicide and thereby force the AGI to do absolutely anything short of killing other people.