Just want to point to a more recent (2021) paper by some DeepMind researchers that implements adaptive computation; I found it interesting when I was looking into this:
https://arxiv.org/pdf/2107.05407.pdf