It is probably possible to pass that test by exploiting human psychology.
It is probably impossible to do well on that test by trying to convince humans that your viewpoint is right.
You’re talking past orthonormal. You’re assuming a properly-designed AI. He’s saying that accomplishing the task would be strong evidence of unfriendliness.