A somewhat related point: it’s only very recently (2023) that chess engines have begun competently mimicking the error patterns of human play. The “nerfed” weaker settings of previous decades all felt artificial.
I’m an FM and play casual games vs. the various nerfed engines at chess.com. The games are very fast (they move instantly) but there’s no possibility of losing on time. Not the best way to practice openings, but good enough.
The implication for AI / AGI is that humans will never create human-similar AI. Everything we make will be way ahead in many areas and way behind in others, and figuring out how to balance everything to construct something human-similar is far in the future. Unless we get AIs to help...
> The implication for AI / AGI is that humans will never create human-similar AI. Everything we make will be way ahead in many areas and way behind in others
Is this not a mere supervised learning problem? You’re saying that, for some problem domain D, you want to predict the probability distribution over actions a real human would emit when given a particular input sample.
This is essentially what a GPT does: given the same input text a human saw, it predicts what they are going to type next.
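As a toy sketch of that framing (pure Python, counts instead of a neural net, all names mine): estimate P(next token | current token) directly from a corpus of human text. A GPT does the same thing with a far richer context window and a learned model, but the supervised objective is the same.

```python
from collections import Counter, defaultdict

def fit_next_token(corpus):
    """Count-based estimate of P(next token | current token) from human text.
    A toy stand-in for what a GPT learns at scale with a neural network."""
    counts = defaultdict(Counter)
    for line in corpus:
        toks = line.split()
        for cur, nxt in zip(toks, toks[1:]):
            counts[cur][nxt] += 1
    # Normalize counts into conditional probability distributions.
    return {cur: {t: c / sum(nxts.values()) for t, c in nxts.items()}
            for cur, nxts in counts.items()}

corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = fit_next_token(corpus)
# model["the"] assigns probability 2/3 to "cat" and 1/3 to "dog"
```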
We can extend this to video: first translate video of humans into joint coordinates, and the sounds they emit into phonemes, then do the same prediction as above.
We would expect to get an AI system from this method that approximates the average human from the sample set we trained on. This system will be multimodal and able to speak, run robotics, and emit text.
Now, after that, we train using reinforcement learning, and that feedback can clear out mistakes, so that the GPT system becomes less and less likely to emit “next tokens” that the consensus of human knowledge holds to be wrong. And the system never tires and the hardware never miscalculates.
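A minimal sketch of that feedback step, under my own assumptions (a single-step token distribution and a scalar reward from some grader, not any real RLHF pipeline): scale the probability of an emitted token up or down by the reward and renormalize, so penalized outputs become less likely over time.

```python
def rl_feedback(dist, token, reward, lr=0.5):
    """Toy policy update: rescale the probability of an emitted token by a
    scalar reward from a grader, then renormalize the distribution."""
    new = dict(dist)
    # reward < 0 shrinks the token's probability; reward > 0 grows it.
    new[token] = max(new[token] * (1 + lr * reward), 1e-9)
    z = sum(new.values())
    return {t: p / z for t, p in new.items()}

dist = {"right": 0.4, "wrong": 0.6}
dist = rl_feedback(dist, "wrong", reward=-1.0)  # grader penalizes "wrong"
# "wrong" is now less probable than "right"
```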
And we can then use machine-based RL: have robots attempt tasks in sim and IRL, and autonomously grade them on how well the task was done. Have the machine attempt to use software plugins, with RL feedback on errors and successful tool usage. Because the machinery can learn at a larger scale, with far more training time than a human lifetime, it will soon exceed human performance.
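The attempt-and-grade loop can be sketched as follows (a toy one-dimensional “task” and grader of my own invention, standing in for a real simulator): the machine proposes actions, an automatic grader scores each attempt, and better-scoring attempts are kept, with no human in the loop.

```python
import random

def grade(action, target=0.7):
    """Autonomous grader: score how close the attempted action came to the
    goal. In a real system this would be a task-success metric in sim."""
    return -abs(action - target)

def train(trials=2000, seed=0):
    """Hill-climb on grader feedback: attempt, grade, keep improvements."""
    rng = random.Random(seed)
    best_action, best_score = 0.0, grade(0.0)
    for _ in range(trials):
        # Propose a small perturbation of the current best attempt.
        cand = min(1.0, max(0.0, best_action + rng.uniform(-0.1, 0.1)))
        score = grade(cand)
        if score > best_score:  # keep attempts the grader scores higher
            best_action, best_score = cand, score
    return best_action
```

The point of the sketch is only that the grading signal is generated by the machine itself, so the loop can run for far more “lifetimes” than any human learner gets.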
And a system like this also has more breadth than any single living human.
But I think you can see how, if you wanted to, you could probably find a solution based on the above that emulates the observable outputs of a single typical human.