It’s a subjective matter whether the above is successful ASCII art balloon or not. If we hold GPT to the same standards we do for text generation, I think we can safely say the above depiction is a miserable failure. The lack of symmetry and overall childishness of it suggests it has understood nothing about the spatiality and only by random luck manages to approximate something it has explicitly seen in the training data. I’ve done a fair bit of repeated generations and they all come out poorly). I think the Transformer paper was interesting as well, although they do mention that it only works when there is a large amount of training data. Otherwise, the inductive biases of CNNs do have their advantages, and combining both is probably superior since the added computational burden of a CNN in conjunction with a Transformer is hardly worth talking about.
It’s a subjective matter whether the above is successful ASCII art balloon or not. If we hold GPT to the same standards we do for text generation, I think we can safely say the above depiction is a miserable failure. The lack of symmetry and overall childishness of it suggests it has understood nothing about the spatiality and only by random luck manages to approximate something it has explicitly seen in the training data. I’ve done a fair bit of repeated generations and they all come out poorly). I think the Transformer paper was interesting as well, although they do mention that it only works when there is a large amount of training data. Otherwise, the inductive biases of CNNs do have their advantages, and combining both is probably superior since the added computational burden of a CNN in conjunction with a Transformer is hardly worth talking about.