Ethan, my impression is that you’re mildly overfitting. I appreciate your intellectual arrogance quite a bit; it’s a great attitude to have as a researcher, and more folks here should have attitudes like yours, IMO. But I’d expect the causal-isolation quality of the training data to throw a huge honkin’ wrench into any expectations we form about how we can use strong models; note that even humans whose training data has low causal quality form weird and false superstitions! I agree with the “test loss != capability” claim, because the test distribution is weird and made up and doesn’t exist outside the original dataset. IID is catastrophically false, and figuring that out is the key limiter preventing robotics from matching pace with the rest of ML/AI right now, IMO. So your scaling model might even describe a solid representation space, but it’s misleading because of the correlation problem.
https://discord.com/channels/729741769192767510/785968841301426216/958570285760647230
Ethan posts an annotated image from OpenAI’s scaling-laws paper https://arxiv.org/pdf/2001.08361.pdf, stating that it’s “apparently wrong now” after DeepMind’s compute-optimal scaling-laws paper (Chinchilla): https://cdn.discordapp.com/attachments/785968841301426216/958570284665946122/Screen_Shot_2021-10-20_at_12.30.58_PM_1.png. The screenshot claims that the crossover point between the data and compute curves in the original OpenAI paper predicts AGI.
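For reference, here is a minimal sketch of what “finding the crossover” of two fitted power laws amounts to. It assumes the power-law forms and fitted constants from the Kaplan et al. paper as I recall them (alpha_D ≈ 0.095, D_c ≈ 5.4e13 tokens; alpha_C ≈ 0.050, C_c ≈ 3.1e8 PF-days); it is an illustration of the arithmetic only, not a claim about what Ethan’s annotation actually computes.

```python
# Illustrative sketch: where do two power-law loss fits give the same loss?
# Constants are the Kaplan et al. (2020) fits as recalled, used here as assumptions.
import math

# L(D) ~ (D_c / D)^alpha_D  -- loss vs. dataset size (tokens)
alpha_D, D_c = 0.095, 5.4e13
# L(C) ~ (C_c / C)^alpha_C  -- loss vs. compute (PF-days), compute-efficient frontier
alpha_C, C_c = 0.050, 3.1e8

def crossover(p, a, q, b):
    """Solve (a/x)**p == (b/x)**q for x, i.e. where two power laws intersect
    if plotted against a single shared axis."""
    log_x = (p * math.log(a) - q * math.log(b)) / (p - q)
    return math.exp(log_x)

x = crossover(alpha_D, D_c, alpha_C, C_c)
loss_at_x = (D_c / x) ** alpha_D
# Note: D is in tokens and C is in PF-days, so overlaying them on one axis
# mixes units -- part of why reading meaning into such a crossover is contested.
print(f"curves cross at x ≈ {x:.3g} (mixed units!), predicted loss ≈ {loss_at_x:.3f}")
```

The calculation itself is trivial; the disputed part is whether extrapolating these fits far past the measured range, and treating their intersection as meaningful, tells you anything at all.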