Is it true? We need to pour lifetimes of information into SOTA models to get moderate expertise-level performance. I have no significant doubt that we can overcome this via scaling, but adjusting for available compute, brains seem to be decent learners.
In addition, I would say that there is a difference between learning a capability and eliciting it: current models seem to be very sensitive to prompts, wrappings, and other conditions. It’s possible that intelligence gains can come from easier eliciting of capabilities that are already learned but blocked by, say, social RLHF.
Current AI is less sample-efficient, but that is mostly irrelevant, as its effective speed is 1000x to 10000x greater.
By the time current human infants finish their ~30-year biological training, we’ll be long past AGI and approaching the singularity (in hyperexponential models).
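To make the speed comparison concrete, here is a rough back-of-the-envelope sketch; the ~30-year training period and the 1000x–10000x effective-speed range are the figures from the comments above, and the conversion itself is just arithmetic:

```python
# Rough conversion: how long would ~30 years of human learning take
# at the 1000x-10000x effective speed attributed to current AI?
HUMAN_TRAINING_YEARS = 30          # figure from the comment above
DAYS_PER_YEAR = 365.25

for speedup in (1_000, 10_000):    # effective-speed range from the comment
    wall_clock_days = HUMAN_TRAINING_YEARS * DAYS_PER_YEAR / speedup
    print(f"{speedup}x speedup -> ~{wall_clock_days:.1f} days of wall-clock time")
```

So even granting much worse sample efficiency, the equivalent of a human training run compresses to days of wall-clock time under these assumed speedups.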