Looks like it’ll be solved fairly soon if DM keeps at it: https://arxiv.org/search/?query=hanabi&searchtype=all&source=header (bibliography). The self-play agents are already at or past human-level, and the play-with-humans agents aren’t terribly far behind: at >16 of 20 points. And since multi-player coordination seems to be a blessing-of-scale, where simply throwing lots of agents/checkpoints in forces generalization/meta-learning, I expect it could probably be solved solely with existing agents and a not-too-exorbitant amount of TPU time.