This post argues there is no inner mesa-optimizer here:
https://www.lesswrong.com/posts/avvXAvGhhGgkJDDso/caution-when-interpreting-deepmind-s-in-context-rl-paper