Haoxing Du comments on There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs

Haoxing Du 22 Feb 2023 4:53 UTC
7 points
Yes, I did some interpretability on the policy network of Leela Zero. Planning to post the results very soon! But I did not particularly look into the attack described here, and while there was one REMIX group that looked into a problem related to liberty counting, they didn’t get very far. I do agree this is an obvious problem to tackle with interpretability; I think it’s likely not that hard to get a rough idea of why the cyclic attack works.