To be clear, I broadly agree that AGI will be quite underparameterized, but still maintain that double descent demonstrates something—that larger models can do better by being simpler not just by fitting more data—that I think is still quite important.
To be clear, I broadly agree that AGI will be quite underparameterized, but still maintain that double descent demonstrates something—that larger models can do better by being simpler not just by fitting more data—that I think is still quite important.