One more question: for the BigGAN which model do your calculations refer to?
Could it be the 256x256 deep version?
Ohh OK, I think it's the 512x512 model, since I wrote "512 TPU cores" and Appendix C here https://arxiv.org/pdf/1809.11096.pdf says that core count corresponds to 512x512.
Deep or shallow version?
"Training takes between 24 and 48 hours for most models"; I assumed both the deep and shallow versions are trained within 48 hours (even though this is imprecise and may be incorrect).
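For reference, here is a minimal back-of-the-envelope sketch of what that assumption implies for total training compute, using the figures discussed above (512 TPU cores, 48 hours). The per-core throughput and utilization values are my own assumptions (the paper mentions a TPU v3 Pod but gives no FLOP figures), so treat the result as a rough order-of-magnitude estimate only:

```python
# Hypothetical estimate of BigGAN-512 training compute under the assumptions above.

TPU_CORES = 512                 # Appendix C: 512 cores for the 512x512 model
TRAINING_HOURS = 48             # upper end of "between 24 and 48 hours"
PEAK_FLOPS_PER_CORE = 61.5e12   # assumed TPU v3 core peak (bfloat16), ~61.5 TFLOP/s
UTILIZATION = 0.3               # assumed fraction of peak actually achieved

seconds = TRAINING_HOURS * 3600
total_flops = TPU_CORES * PEAK_FLOPS_PER_CORE * UTILIZATION * seconds

print(f"Estimated training compute: {total_flops:.2e} FLOP")
# roughly 1.6e21 FLOP under these assumptions
```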