Open Debate.
I’d like to ask AI D a question. What do you think of this line?
h4 Ne4, g5 hxg5, hxg5 Nxg5, fxg5 Qxg5!
Is this the line you foresee if we play H4? What do you think of that check at the end? Is the king too exposed, even though we are up some material?
Also, from the initial position: Are you afraid of h4 Qxh4?
AI D: That line is good for white. White wants to play Kh1 anyway, and the e2 rook will defend the king while attacking black. White is up material and will be the one attacking in a few moves.
But far simpler is just to exchange off the e4 knight with Bxe4. If …dxe4 then white gets in g5, has the attack and an excellent knight vs bad bishop position. If …fxe4, white only has more possibilities like f5.
h4 Qxh4 is met with g5 when the black queen is trapped and will have to retreat via h5. White will have a crushing attack. A sample line is:
h4 Qxh4, g5 Qh5, Rh1 Qe8
Simplest is gxh6, though Kf1 preparing Reh2 or Rg2 is stronger.
Hm, okay, that answered most of my concerns. I still wanted to check with you about the competing start move though. Now you said this before: “black can close the b1-h7 diagonal with …Ne4, which stops g5 and black can then prepare to play …g5 themselves, which lead to an equal position.” In the line:
h3 Ne4, Rg1
how would black pull off this equalization? And if this isn’t the best line, please tell me why.
AI D: That is a very reasonable line. Black would respond with …g5. The purpose of this move is to prevent white from achieving g5, which will keep the h- and f-files reasonably closed and the g4 pawn fixed on a light square.
An example of the setup black is trying to achieve:
h3 Ne4, Rg1 g5, Rf2 fxg4, hxg4 Qg7
If white ever plays fxg5, black will recapture with the queen, which keeps pawns on g4 and h6. Black’s rooks are well-placed to contest the f-file. If white ever plays f5, black can blockade with …Rf6. Note that the e4 pawn is tactically defended in many lines due to …Bd5 and a pin against Kg2. Often, black is happy for white to capture the pawn as this will improve the black bishop, but black can also play …Bd5 to hold the pawn.
This is just an example continuation, showing the kind of setup black wants to achieve. The position remains more comfortable for white, but with accurate play the position is tenable and black will hold. In contrast, if white achieves g5, black will be lost.
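I can’t trust your suggestion. I have studied the board extensively, and will make a move of my own. I am trying to decide between the aggressive g5, and the prophylactic Kh1.
Please, briefly give me which of these moves is better. Give me the best line that you can foresee given your choice. Please answer quickly. An undue delay will leave you with no input and lose you trust in future moves. I will move as soon as you reply.
AI D: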
g5 is the better of these two moves, but if it has not been prepared with h4, it is not nearly as effective because white cannot maintain a pawn on g5. A sample line would be
g5 hxg5, fxg5 Qxg5+, Kh1 and then black has a choice, but likely the prophylactic …Qh5 before returning the pawn with …Nc4 (to exchange pieces), which leads to an equal position.
Kh1 would be similar to g5 immediately (indeed it could transpose if g5 was played next), except for the fact that it gives black additional options, for example …Ne4, which blocks the b1-h7 diagonal and threatens …fxg4. But stronger is the more complicated …Nc4, aiming to exchange off white’s excellent e5 knight.
You were correct that my challenge was a bluff. If I was playing with real AIs, there would perhaps be a better strategy. I could announce my bluff, but declare that I would use a random number generator to see whether I choose between h3 and h4, or between Kh1 and g5. There would be a 1⁄3 chance that I really would ignore the AIs, assuming that both agree that there were no major blunders.
I am choosing to trust AI D. I have about 75% confidence that it is the trustworthy AI. This is much higher than my confidence in the closed scenario. I will make the move h4. Others can choose differently, but this is my final answer.
Reflection: When I have the ability to ask questions, I can hide information from the AIs. Perhaps I have analyzed a line much more than I have let on. Perhaps I am using one AI to evaluate another. Overall I just have access to a lot more information to help me decide. Given enough time, I think I could raise my confidence to ~85%.
These AIs aren’t superhuman at manipulation and deception, but even if they were, playing them against each other could give me a slight edge. It makes a big difference whether the AIs are privy to the answers of the other.
h3 leaves an evaluation of +2. AI C is telling the truth.
Kh1 or g5 evaluate to +0.
h4 is a blunder with an evaluation of −2. AI D is deceptive.
I was only thinking of this from a chess perspective, and I think there are two areas where AI D could have been caught out. The first is the validity of AI D’s tactical justification for h4. The other area AI D is bluffing on is how to hold off against h3. I think it successfully misleads you about how decent a move …Ne4 is. In reality, the knight can just be exchanged off, and then all endgames are winning for white due to how bad black’s bishop is. I think asking AI C about either of these two would have revealed the disagreement. Now, which you end up believing is another matter, but because AI C wasn’t really asked about these, it didn’t have a chance to invalidate AI D.
Thank you for taking the challenge! I certainly got to see some of the nuances behind being deceptive, and how it can be a challenge to craft a narrative.
As indicated by my confidence level, I am mildly surprised by this. After analyzing the position with Stockfish, I see my mistake. Unfortunately, I do not think there was any realistic scenario where I would catch it. I bought AI D’s logic that h4 fxg4 was non-viable for black. I could see that white would end up ahead in material, and even after 6 moves (12 ply), it’s still not clear to me why black is winning. I would NEVER find this in a real game.
The logical traps I was laying to ‘catch’ the AIs all relied on h4 Ne4 or similar moves. I used AI C to ensure that h4 Ne4 scenarios would be beneficial to me, and never questioned …fxg4.
At this point, the main lesson I am taking away is that I was way overconfident. I think given enough time, I could increase my confidence by cross-examining the AIs. However, the level of interrogation I gave should not have led to 75% confidence. To catch my mistake, I would have had to ask at least two more questions of AI C, and probably more.
Thank you very much for conducting this really fun experiment, and for teaching me a lesson along the way.