But is AlphaGo using just the policy network (which still beats pros) ‘neurosymbolic’? What about MuZero, is that neurosymbolic?
The current approach is heavy on the symbolic program part, but I think it would be entirely possible to distill the programs down into the LLM, and simply have the LLM generate/model sequences of image-tokens++program-tokens++image-tokens, and learn from any ‘neurosymbolic’ approach: https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt/comment/59334256 (And then you would have a pure LLM solver which at no point actually runs a Python program, it merely uses them as a scaffold for inner-monologue; and given results on inner-monologue distillation, likely even the Python tokens could be trained away.)
But is AlphaGo using just the policy network (which still beats pros) ‘neurosymbolic’? What about MuZero, is that neurosymbolic?
The current approach is heavy on the symbolic program part, but I think it would be entirely possible to distill the programs down into the LLM, and simply have the LLM generate/model sequences of image-tokens++program-tokens++image-tokens, and learn from any ‘neurosymbolic’ approach: https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt/comment/59334256 (And then you would have a pure LLM solver which at no point actually runs a Python program, it merely uses them as a scaffold for inner-monologue; and given results on inner-monologue distillation, likely even the Python tokens could be trained away.)