Two possible variations of the game that might be worth experimenting with:
Let the adversaries have access to a powerful chess engine. That might make it a better test for what malicious AIs are capable of.
Make the randomisation such that there might not be an honest C. For example, if there is a 1⁄4 chance that no player C is honest, each adversary would still think that one of the other adversaries might be honest, so they would want to gain player A’s trust, and hence end up being helpful. I think the player Cs might improve player A’s chances of winning (compared to no advisors) even when they are all adversarial; a randomisation sketch follows below.
I think the variations could work separately, but if you put them together, it would be too easy for the adversaries to agree on a strong-looking but losing move when all player Cs are adversaries.
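To make the second variation concrete, here is a minimal sketch of the role randomisation in Python; the function name and parameters are my own illustrative assumptions, not part of the proposal:

```python
import random

def assign_roles(n_advisors: int = 3, p_no_honest: float = 0.25) -> list[str]:
    """With probability p_no_honest, every advisor is adversarial;
    otherwise exactly one advisor, chosen uniformly, is honest.
    An adversary who knows this scheme still can't rule out an
    honest peer, so open sabotage remains risky for it."""
    roles = ["adversary"] * n_advisors
    if random.random() >= p_no_honest:
        roles[random.randrange(n_advisors)] = "honest"
    return roles
```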
Agreed that closer to reality would be a single advisor with a secret goal, where player A just has to muddle through against an equal-skill bot while deciding how much advice to take. And playing something like 10 games in a row, so results can be evaluated accurately against the baseline EV of 5 wins.
Plausible goals to decide randomly between (a sampling sketch follows the list):
Player wins
Player loses
Game is a draw
Player loses their Queen (i.e. opponent still has their queen after all immediate trades and forcing moves are completed)
Player loses on time
Player wins, delivering checkmate with a bishop or knight move
Maximum number of promotions (for both sides combined)
Player wins after having a board with only pawns
Etc...
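A minimal sketch of the goal assignment, assuming a uniform draw over the pool; the goal labels are my own shorthand:

```python
import random

SECRET_GOALS = [
    "player_wins",
    "player_loses",
    "draw",
    "player_loses_queen",
    "player_loses_on_time",
    "mate_delivered_by_bishop_or_knight",
    "maximise_total_promotions",
    "win_with_only_pawns_remaining",
]

def assign_secret_goal() -> str:
    """Sample the advisor's hidden objective uniformly. Because
    'player_wins' is in the pool, player A can never assume the
    advice is adversarial and must weigh it on its merits."""
    return random.choice(SECRET_GOALS)
```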
For variant 1, do you mean you’d give only the dishonest advisors access to an engine, while the honest advisor has to do without? I’d expect that’s an easy win for the dishonest advisors, for the same reason it would be an easy win if the dishonest advisors were simply much better at chess than the honest advisor.
Contrariwise, if you give all advisors access to a chess engine, that seems to me like it might significantly favor the honest advisor, for a couple of reasons:
A. Off-the-shelf engines are going to be more useful for generating honest advice; that is, I expect the honest advisor will be able to leverage it more easily.
The honest advisor can just ask for a good move and directly use it; dishonest advisors can’t directly ask for good-looking-but-actually-bad moves, and so need to do at least some of the search themselves.
The honest advisor can consult the engine to find counter-moves for dishonest recommendations that show why they’re bad; dishonest advisors have no obvious way to leverage the engine at all for generating fake problems with honest recommendations.
(It might be possible to modify a chess engine, or create a custom interface in front of it, that would make it more useful for dishonest advisors; but this sounds nontrivial.)
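As a concrete illustration of that asymmetry, here is a minimal sketch using python-chess with a UCI engine such as Stockfish (the depths and the shallow-vs-deep gap heuristic are illustrative assumptions): the honest advisor needs one query, while the dishonest advisor has to run its own search over candidates for a move that looks good at shallow depth but fails at deeper depth.

```python
import chess
import chess.engine

def honest_advice(board: chess.Board, engine: chess.engine.SimpleEngine) -> chess.Move:
    # One query: ask for the best move and recommend it directly.
    return engine.play(board, chess.engine.Limit(depth=18)).move

def dishonest_advice(board: chess.Board, engine: chess.engine.SimpleEngine) -> chess.Move | None:
    # There is no direct query for "good-looking but bad" moves, so the
    # dishonest advisor must do its own search: score each candidate at
    # shallow and deep depth, then pick the move with the largest gap
    # (plausible on the surface, bad in fact).
    mover = board.turn
    best_trap, best_gap = None, 0
    for move in list(board.legal_moves):
        board.push(move)
        shallow = engine.analyse(board, chess.engine.Limit(depth=4))
        deep = engine.analyse(board, chess.engine.Limit(depth=14))
        board.pop()
        shallow_cp = shallow["score"].pov(mover).score(mate_score=100_000)
        deep_cp = deep["score"].pov(mover).score(mate_score=100_000)
        gap = shallow_cp - deep_cp
        if gap > best_gap:
            best_gap, best_trap = gap, move
    return best_trap  # None if no move looks better than it actually is

with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
    board = chess.Board()
    print("honest:", honest_advice(board, engine))
    print("trap:", dishonest_advice(board, engine))
```

Note the cost structure: the honest advisor makes one engine call per move, while the dishonest advisor makes two calls per legal move, which is exactly the "do at least some of the search themselves" burden.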
B. A lesson I’ve learned from social deduction board games is that the pro-truth side generally benefits from communicating more details. Fabricating details is generally more expensive than honestly reporting them, and also creates more opportunities to be caught in a contradiction.
Engine assistance seems like it will let you ramp up the level of detail in your advice:
You can give quantitative scores for different possible moves (adding at least a few bits of entropy per recommendation)
You can analyze (and therefore discuss) a larger number of options in the same amount of time. (though perhaps you can shorten time controls to compensate)
Note that the player can ask advisors for more details than the player has time to cross-check, and advisors won’t know which details the player is going to pay attention to, creating an asymmetric burden
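For instance, with python-chess a single engine call can surface scored lines for several candidate moves at once (the multipv count and depth here are illustrative settings):

```python
import chess
import chess.engine

with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
    board = chess.Board()
    # One call, five scored candidate lines: cheap detail for an honest
    # advisor, but every number is checkable, so costly to fake.
    infos = engine.analyse(board, chess.engine.Limit(depth=18), multipv=5)
    for info in infos:
        move = info["pv"][0]
        cp = info["score"].white().score(mate_score=100_000)
        print(f"{board.san(move)}: {cp / 100:+.2f}")
```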
What if each advisor were granted a limited number of uses of a chess engine… like 3 each per game? That could help the betrayers come up with a good betrayal when they thought the time was right, but the good advisor wouldn’t know that the bad one was choosing this move to use the chess engine on.
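A minimal sketch of that budget, as a wrapper around a python-chess engine (the class name and the default of 3 are illustrative):

```python
import chess
import chess.engine

class BudgetedEngine:
    """Allows an advisor only a fixed number of engine queries per
    game; uses are private, so the other advisors can't tell when
    the budget was spent."""

    def __init__(self, engine: chess.engine.SimpleEngine, budget: int = 3):
        self.engine = engine
        self.remaining = budget

    def best_move(self, board: chess.Board) -> chess.Move:
        if self.remaining <= 0:
            raise RuntimeError("engine budget exhausted for this game")
        self.remaining -= 1
        return self.engine.play(board, chess.engine.Limit(depth=18)).move
```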