Non-central nitpick:
As it turns out, transformers can do reinforcement learning in-context
This seems to just be vanilla in-context learning, rather than any sort of in-context RL. (Also I’m skeptical that the linked paper actually provides evidence of in-context RL in any nontrivial sense.)