gjm comments on Adversarial Policies Beat Professional-Level Go AIs

gjm 5 Nov 2022 0:30 UTC
2 points
0
What KataGo tries to maximize is basically winning probability plus epsilon times score difference. (It’s not exactly that; I don’t remember exactly what it is; but that’s the right kind of idea.) So it mostly wants to win rather than lose, but prefers to win by more if the cost in winning probability is small, which as you say helps to avoid the sort of “slack” moves that AlphaGo and Leela Zero tend to make once the winner is more or less decided.
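To make the "winning probability plus epsilon times score difference" idea concrete, here is a minimal sketch. It is a toy model and not KataGo's actual utility function (which, as noted above, is not exactly this). The `Candidate` class, the move names, the probability and score numbers, and the `epsilon` values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate move with made-up engine estimates."""
    name: str
    win_prob: float    # estimated probability of winning, in [0, 1]
    score_diff: float  # estimated final score margin, in points


def utility(move: Candidate, epsilon: float = 0.001) -> float:
    """Toy objective: winning probability plus a small score bonus."""
    return move.win_prob + epsilon * move.score_diff


def best_move(moves: list[Candidate], epsilon: float = 0.001) -> Candidate:
    """Pick the candidate with the highest toy utility."""
    return max(moves, key=lambda m: utility(m, epsilon))


candidates = [
    Candidate("slack", win_prob=0.990, score_diff=2.5),
    Candidate("solid", win_prob=0.990, score_diff=12.5),
    Candidate("greedy", win_prob=0.950, score_diff=30.5),
]

# Small epsilon: the score term only separates moves with similar win_prob.
print(best_move(candidates).name)
# epsilon = 0: "slack" and "solid" are indistinguishable to the objective.
print(best_move(candidates, epsilon=0.0).name)
# Large epsilon: the score term starts to override winning probability.
print(best_move(candidates, epsilon=0.01).name)
```

With a small epsilon the score term acts as a tie-breaker among moves that are about equally likely to win. With epsilon set to zero, nothing distinguishes the slack move from the solid one. With a large epsilon the model would give up winning probability for points.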
- ChristianKl 7 Nov 2022 23:37 UTC
  2 points
  0
  The problem here seems to be that it’s not preferring to win by more under area rules. If it preferred to win by more points under area rules, it would capture all the stones before passing. It doesn’t do that once it thinks it has enough points to win anyway under area rules.
  This attack is basically about giving KataGo the impression that it has enough points anyway and doesn’t need to capture stones to win.
  Likely the heuristic of epsilon times score difference does not reward getting more points over passing, but it does reward playing a move that’s worth more points over a move that’s worth less.
  - gjm 8 Nov 2022 1:25 UTC
    2 points
    0
    I’m not sure I understand. With any rules that allow the removal of dead stones, there is no advantage to capturing them. (With territory-scoring rules, capturing them makes you worse off. With area-scoring rules, capturing them makes no difference to the score.) And with rules that don’t allow the removal of dead stones, white is losing outright (and therefore needs to capture those stones even if it’s only winning versus losing that matters). How would caring more about score make KG more inclined to bother capturing the stones?
    - ChristianKl 8 Nov 2022 11:21 UTC
      2 points
      0
      With area-scoring rules that don’t allow the removal of dead stones in normal training games, KataGo has to decide whether it can already pass or whether it should go through the work of capturing any remaining stones. I let KataGo play one training game, and it looked to me like its default strategy is not to capture all the stones but only enough to win by a sufficient margin.
      It doesn’t have a habit of “always capturing all the stones to get maximum score under area rules”. If it had that habit, I don’t think it would show this failure case.
      - gjm 8 Nov 2022 12:46 UTC
        2 points
        0
        In training games I think the rules it’s using do allow the removal of dead stones. If it chooses not to remove them, that isn’t because it doesn’t care about points it would have gained by removing them; it’s because it doesn’t think it would gain any points by removing them.
        There is no possible habit of “always capturing all the stones to get maximum score under area rules”. Even under area rules you don’t get more points for capturing the stones (unless the stones are not actually dead according to the rules you’re using, or in human games according to negotiation with the opponent).
        What am I missing?
        ChristianKl 8 Nov 2022 19:03 UTC
        2 points
        0
        I think that currently, under area-scoring rules, KataGo doesn’t capture all stones that would be dead by human convention but are not dead by KataGo’s rules, provided capturing them isn’t necessary to win the game.
        gjm 8 Nov 2022 19:53 UTC
        2 points
        0
        That’s correct, at least roughly. The important difference is that the condition is not “isn’t necessary to win the game” but “doesn’t make any difference to the outcome, including score difference”. But I don’t see what it has to do with the more specific thing you said above:
        “The problem seems to be that it’s not preferring to win by more under area rules.”
        KataGo does prefer to win by more, whatever rules it’s playing under. A stronger preference for winning by more would not (so far as I can see) make any difference to its play in positions like the ones reached by the adversarial agent. KataGo does not generally think “that it has enough points anyway and doesn’t need to capture stones to win”, and even if it did, that wouldn’t make the difference between playing on and passing in this situation.
        Unless, again, I’m missing something, but we seem to be having some sort of communication difficulty because nothing you write seems to me responsive to what I’m saying (and quite possibly it feels the same way to you, with roles reversed).
        What makes you believe that KataGo is “not preferring to win by more under area rules”?