Sam Marks comments on Mysteries of mode collapse

Sam Marks 8 Nov 2022 20:53 UTC
3 points
Non-central nitpick:
“As it turns out, transformers can do reinforcement learning in-context”
This seems to just be vanilla in-context learning, rather than any sort of in-context RL. (Also I’m skeptical that the linked paper actually provides evidence of in-context RL in any nontrivial sense.)