One note: as it turns out, OthelloGPT learned a bag of heuristics rather than a clean algorithm:
https://www.lesswrong.com/posts/gcpNuEZnxAPayaKBY/othellogpt-learned-a-bag-of-heuristics-1