The summary says they use text, and a search for “text” in the paper turns up this on page 32:
“In these past works, the goal usually consists of the position of the agent or a target observation to reach, however some previous work uses text goals (Colas et al., 2020) for the agent similarly to this work.”
So I thought they provided goals as text. I’ll be disappointed if they don’t. Hopefully, future work will do so (and potentially use pretrained LMs to process the goal texts).
What’s the practical difference between “text” and one-hots of said “text”? One-hots are the standard way to feed text into models; only recently have we expected models to learn their preferred encoding of raw text (cf. transformers). By taking a small shortcut, the authors of this paper get to show off their agent work without loss of generality: one could still give one-hot instructions to an agent that is learning to act in the real world.
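To make that equivalence concrete, here’s a minimal sketch of what the shortcut amounts to. The vocabulary, the `one_hot_goal` helper, and the goal string are hypothetical illustrations, not anything from the paper: given a fixed goal vocabulary, a text instruction and its one-hot encoding carry exactly the same information, so the choice between them is an input-formatting detail rather than a modeling one.

```python
import numpy as np

# Hypothetical fixed goal vocabulary; a real agent would build this
# from whatever instruction set the environment defines.
VOCAB = ["go", "to", "the", "blue", "key", "red", "door"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

def one_hot_goal(goal_text: str) -> np.ndarray:
    """Encode a text goal as a sequence of one-hot vectors, one per token."""
    tokens = goal_text.lower().split()
    encoding = np.zeros((len(tokens), len(VOCAB)), dtype=np.float32)
    for row, tok in enumerate(tokens):
        encoding[row, TOKEN_TO_ID[tok]] = 1.0
    return encoding

goal = "go to the blue key"
print(one_hot_goal(goal))
# shape (5, 7): each row has a single 1 at that token's vocabulary index,
# and the original text is fully recoverable from the argmax of each row.
```

The only thing genuinely lost relative to raw text is the open vocabulary a pretrained LM would give you, which is exactly the future-work direction mentioned above.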