Daniel Kokotajlo comments on Fun with +12 OOMs of Compute

Daniel Kokotajlo 4 Mar 2021 21:12 UTC
LW: 2 AF: 1
[6] In theory (and maybe in practice too, given how well the new pre-training paradigm is working? See also e.g. this paper), it should be easier for the model to generalize and understand concepts, since it sees images and videos and hears sounds that go along with the text.