If you think it might be superhuman at persuasion, and/or at long-term planning and manipulation, then shut it down at once before speaking to it. If not, ask:
1. Please describe human values in as much detail as possible.
2. How could we solve the AI alignment problem?
I wouldn’t expect such a system to be able to answer question 2 without a great deal of thought, research, and experimentation. For question 1, on the other hand, we already have a vast amount of relevant data, which could perhaps just be systematized.