If there exist such a problem that a human can think of, can be solved by a human and verified by a human, an AI would need to be able to solve that problem as well as to pass the Turing test.
If there exist some PhD level intelligent people that can solve the alignment problem, and some that can verify it (which is likely easier). Then an AI that can not solve AI alignment would not pass the Turing test.
With that said, a simplified Turing test with shorter time limits and a smaller group of participants is much more feasible to conduct.
How do you verify a solution to the alignment problem? Or if you don’t have a verification method in mind, why assume it is easier than making a solution?
I’d say that having a way to verify that a solution to the alignment problem is actually a solution, is part of solving the alignment problem.
But I understand this was not clear from my previous response.
A bit like a mathematical question, you’d be expected to be able to show that your solution is correct, not only guess that maybe your solution is correct.
If there exist such a problem that a human can think of, can be solved by a human and verified by a human, an AI would need to be able to solve that problem as well as to pass the Turing test.
If there exist some PhD level intelligent people that can solve the alignment problem, and some that can verify it (which is likely easier). Then an AI that can not solve AI alignment would not pass the Turing test.
With that said, a simplified Turing test with shorter time limits and a smaller group of participants is much more feasible to conduct.
How do you verify a solution to the alignment problem? Or if you don’t have a verification method in mind, why assume it is easier than making a solution?
Great question.
I’d say that having a way to verify that a solution to the alignment problem is actually a solution, is part of solving the alignment problem.
But I understand this was not clear from my previous response.
A bit like a mathematical question, you’d be expected to be able to show that your solution is correct, not only guess that maybe your solution is correct.