Modeling Strong and Human-Like Gameplay with KL-Regularized Search: we read this one on the transhumanists in vr discord server to figure out what they were testing and what results they got. key takeaways as I understood them (note that I could be quite wrong about the paper's implications; there's a rough sketch of what "KL-regularized" means mechanically after this list):
Multi-agent game dynamics change significantly as you add more coherent search, and it becomes harder for linear learning to approximate what the search is doing. (no surprise, really.)
it still takes a lot of search.
guiding the search is not hopeless in the presence of noise!
in imperfect-information games whose equilibrium is shallow and needs no planning, search doesn't improve emulation of the reference policy as much? this seems to make sense as an additional confirmation of the basic hypothesis that "search helps model searching beings".
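for reference, here's a minimal sketch of the core idea the title's "KL-regularized" points at, as I understand it: the search's action values get blended with an anchor policy (e.g. one imitation-learned from human play) by maximizing expected value minus λ times the KL divergence to that anchor, which has the closed form π(a) ∝ τ(a)·exp(Q(a)/λ). the function name, toy numbers, and λ sweep below are my own illustration, not the paper's code or its full iterative multi-agent procedure; this only shows the single-step tradeoff the regularization controls.

```python
import numpy as np

def kl_regularized_policy(q_values, anchor_policy, lam):
    """One-step KL-regularized action selection.

    Maximizes  E_{a~pi}[Q(a)] - lam * KL(pi || anchor)  over policies pi,
    whose closed form is  pi(a) proportional to anchor(a) * exp(Q(a) / lam).

    lam -> infinity recovers the anchor (imitation) policy;
    lam -> 0 approaches plain argmax-on-Q search.
    """
    q = np.asarray(q_values, dtype=float)
    tau = np.asarray(anchor_policy, dtype=float)
    # combine log-anchor with scaled values; subtract the max for numerical stability
    logits = np.log(tau + 1e-12) + q / lam
    logits -= logits.max()
    pi = np.exp(logits)
    return pi / pi.sum()

# toy example (made up numbers): search thinks action 2 is best,
# while the human-imitation anchor strongly prefers action 0
q = [0.1, 0.2, 0.9]
anchor = [0.7, 0.2, 0.1]
for lam in (10.0, 1.0, 0.1):
    print(lam, np.round(kl_regularized_policy(q, anchor, lam), 3))
```

running the sweep shows the interpolation: at large λ the output stays close to the human-like anchor, and as λ shrinks it collapses onto the search's preferred action, which is the strong-vs-human-like dial the paper's title is about.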