We train AI 2 only on correct solutions produced by humans, not incorrect ones. Therefore the fact that the output is correct isn’t evidence against it being produced by a human. (Though see one of the comments below; that might not be a good idea.)
We don’t have a solution for every problem, only certain problems. We are just conditioning AI 2 on the fact that the input is a solution to the problem. Therefore learning that it correctly solves the problem does not count as evidence against its being produced by a human, even if it’s really unlikely humans would have been able to solve it.
That is, we are asking it “what is the probability that the input was produced by a human, given that it is a correct solution and a specified prior?”
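As a concrete reading of that query, here is a minimal sketch, not from the original exchange; the prior and the two likelihoods are made-up illustrative numbers:

```python
# Minimal sketch of the query above, with made-up numbers.
# prior_human is the "specified prior" that the input came from a human;
# the two likelihoods are whatever AI 2 believes about how often each
# source produces a correct solution to this particular problem.

def p_human_given_correct(prior_human: float,
                          p_correct_given_human: float,
                          p_correct_given_ai: float) -> float:
    """Bayes' rule: P(human | the input is a correct solution)."""
    joint_human = prior_human * p_correct_given_human
    joint_ai = (1.0 - prior_human) * p_correct_given_ai
    return joint_human / (joint_human + joint_ai)

# Example with a 50% prior and equal likelihoods: the posterior stays 0.5,
# i.e. correctness by itself moves nothing unless the likelihoods differ.
print(p_human_given_correct(0.5, 0.8, 0.8))  # 0.5
```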
> We train AI 2 only on correct solutions produced by humans, not incorrect ones. Therefore the fact that the output is correct isn’t evidence against it being produced by a human. (Though see one of the comments below; that might not be a good idea.)

If we already have a solution, why do we need the AI?

> We don’t have a solution for every problem, only certain problems.
Then, if AI 2 can tell which problems we are more likely to have solved, it can incorporate that into its prior.
> We are just conditioning AI 2 on the fact that the input is a solution to the problem. Therefore learning that it correctly solves the problem does not count as evidence against its being produced by a human, even if it’s really unlikely humans would have been able to solve it.
I don’t see how that follows. Learning that the input is a correct solution increases the odds that it was produced by an AI, and you aren’t being very clear about what updates AI 2 makes, what information it is given, and what its instructions are.
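A quick odds-form calculation makes the direction of that update explicit; the 1:1 prior odds and the two likelihoods below are assumptions chosen only to illustrate the case of a problem humans rarely solve:

```python
# Odds-form Bayes update with assumed numbers.  If AI 2 is allowed to use
# correctness as evidence, then learning "this is a correct solution to a
# problem humans rarely solve" shifts the odds toward AI authorship.

prior_odds_ai = 1.0           # assumed 1:1 prior odds (AI : human)
p_correct_given_ai = 0.9      # assumed: the AI usually solves such problems
p_correct_given_human = 0.1   # assumed: humans rarely do

likelihood_ratio = p_correct_given_ai / p_correct_given_human
posterior_odds_ai = prior_odds_ai * likelihood_ratio
print(posterior_odds_ai)  # ~9, i.e. roughly 9:1 toward AI
```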
> That is, we are asking it “what is the probability that the input was produced by a human, given that it is a correct solution and a specified prior?”
How do you specify a prior for an AI? If an objective evaluation of the question would assign a probability of X% to something being true, do you expect that you can simply tell the AI to start with a prior of Y%? That’s not obvious.
Whereas if there is some specific statement you’re giving the AI in order to make it start with some prior, you need to make the statement explicit, the prior explicit, etc., as I did above with “condition on there being two solutions, one produced by a human, and the one you have is chosen at random”.
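For concreteness, here is a minimal sketch of that kind of explicit conditioning, under the assumption that the other solution is the AI’s own and that the presented one is picked uniformly at random: the prior is pinned at 1/2, and since both candidates are correct by construction, correctness alone can no longer move the posterior; only features that distinguish human writing from AI writing can.

```python
import random

# Sketch of the explicit conditioning quoted above (assumed reading: the
# second solution is the AI's own, and the solution shown to AI 2 is chosen
# uniformly at random between the two).  Both candidates are correct by
# construction, so "it is a correct solution" is uninformative here, and the
# prior P(human) = 1/2 survives unless other distinguishing features exist.

def presented_solution() -> str:
    return random.choice(["human_solution", "ai_solution"])

samples = [presented_solution() for _ in range(100_000)]
print(samples.count("human_solution") / len(samples))  # ~0.5
```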