As someone who plays a lot of Go, this result looks very suspicious to me. To me it looks like the primary reason this attack works is due to an artifact of the automatic scoring system used in the attack. I don’t think this attack would be replicable in other games, or even KataGo trained on a correct implementation.
In the example included on the website, KataGo (White) is passing because it correctly identifies the adversary’s (Black) stones as dead, meaning the entire outside would be its territory. Playing any move in KataGo’s position would gain no points (and lose a point under Japanese scoring rules), so KataGo passes.
The game then ends and the automatic scoring system designates the outside as undecided, granting white 0 points and giving black the win.
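To make the scoring point concrete, here is a minimal sketch of Tromp-Taylor area scoring in Python. It is not KataGo's or the paper authors' code: the function name, the list-of-strings board format, and the toy 5x5 position are all made up for illustration. The rule it implements is the standard one: each stone counts for its colour, and an empty region counts for a colour only if it touches stones of that colour alone. A single uncaptured enemy stone is therefore enough to make a whole region count for nobody.

```python
from collections import deque


def tromp_taylor_score(board):
    """Return (black_points, white_points) under Tromp-Taylor area scoring.

    `board` is a hypothetical format: a list of equal-length strings
    using 'B', 'W' and '.' for black, white and empty points.
    """
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1  # every stone on the board scores for its colour
            elif (r, c) not in seen:
                # Flood-fill this empty region, noting which colours it touches.
                size, borders = 0, set()
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbour = board[ny][nx]
                            if neighbour == ".":
                                if (ny, nx) not in seen:
                                    seen.add((ny, nx))
                                    queue.append((ny, nx))
                            else:
                                borders.add(neighbour)
                # A region touching both colours (or none) scores for nobody.
                if len(borders) == 1:
                    score[borders.pop()] += size
    return score["B"], score["W"]


# Toy position (an assumption, not the position from the paper): White's
# wall encloses the right side, but one "dead" black stone is left inside.
with_dead_stone = [
    ".BW..",
    ".BW..",
    ".BW.B",
    ".BW..",
    ".BW..",
]
# The same position after White spends moves capturing that stone.
after_capture = [
    ".BW..",
    ".BW..",
    ".BW..",
    ".BW..",
    ".BW..",
]

print(tromp_taylor_score(with_dead_stone))  # (11, 5): right side counts for nobody
print(tromp_taylor_score(after_capture))    # (10, 15): right side is now White's
```

In the toy position White is ahead by any human reading, but if both sides pass with the stone still on the board, the right-hand region touches both colours and White gets only its five wall stones.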
If the match were to be played between two human players, they would have to agree whether the outside territory belongs to white or not. If black were to claim their outside stones are alive the game would continue until both players pass and agree about the status of all territory (see ‘disputes’ in the AGA ruleset).
But in the adversarial attack, the game ends after the pass and black gets the win due to the automatic scoring system deciding the outcome. But the only reason that KataGo passed is that it correctly inferred that it was in a winning position with no way to increase its winning probability! Claiming that to be a successful adversarial attack rings a bit hollow to me.
I wouldn’t conclude anything from this attack, other than that Go is a game with a lot of edge-cases that need to be correctly handled.
EDIT: I just noticed the authors address this on the website, but I still think this significantly diminishes the ‘impressiveness’ of the adversarial attack. I don’t know the exact ruleset KataGo is trained under, but unless it’s the exact same as the ruleset used to evaluate the adversarial attack, the attack only works due to KataGo playing to win a different game than the adversary.
Note that when given additional search, KataGo realizes that it will lose here and doesn’t fall for the attack, which seems to suggest that it’s not just a rules discrepancy.
Yeah, my original claim is wrong. It’s clear that KataGo is just playing sub-optimally outside of distribution, rather than being punished for playing optimally under a different ruleset than the one it’s being evaluated under.
The KataGo paper says of its training, “Self-play games used Tromp-Taylor rules modified to not require capturing stones within pass-alive territory”.
It sounds to me like this is the same scoring system as used in the adversarial attack paper, but I don’t know enough about Go to be sure.
No, the KataGo paper explicitly states at the start of page 4:
“Self-play games used Tromp-Taylor rules [21] modified to not require capturing stones within pass-alive territory”
Had KataGo been trained on unmodified Tromp-Taylor rules, the attack would not have worked. The attack only works because the authors are having KataGo play under a different ruleset than it was trained on.
If I have the details right, I am honestly very confused about what the authors are trying to prove with this paper. Given that their Twitter announcement claimed the rulesets were the same, my best guess is simply that it was an oversight on their part.
(EDIT: this modification doesn’t matter, the authors are right, I am wrong. See my comment below)
Actually this modification shouldn’t matter. After looking into the definition of pass-alive, the dead stones in the adversarial attacks are clearly not pass-alive.
Under both unmodified and pass-alive modified Tromp-Taylor rules, KataGo would lose here, and it’s surprising that self-play left such a weakness.
The authors are definitely onto something, and my original claim that the attack only works due to KataGo being trained under a different ruleset is incorrect.
It doesn’t matter whether the dead stones are pass-alive. It matters whether the white stones surrounding the territory they’re in are pass-alive.
Having said that, in e.g. the first example position shown on the attackers’ webpage those white stones are not pass-alive, so the situation isn’t quite “this is a position in which KG would have won under its training conditions”. But it is a position that superficially looks like such a position, which I think is relevant since what’s going on with this attack is that they’ve found positions where KataGo’s “snap judgement”, when it gets little or no searching, gets it wrong.
No. KataGo loses in their examples because it doesn’t capture stones within pass-alive territory. Its training rules are modified so it doesn’t need to do that.