If you have a list of such correspondences somewhere, I’d like to see it!
Here are some suggested correspondences:
Pretraining is what humans do by default: start training a model in a simple domain, then train on top of that model in broader and broader domains.
Dropout seems to be related to deliberate play.
Noise: you can sometimes give wrong answers, or say “no” when they offer an answer even though it is right, to check how confident they are (giving the true answer immediately afterward, of course).
The zone of proximal development seems related to hyperparameters such as the learning rate and the size of the current model relative to the size of the domain.
The explore/exploit trade-off has obvious analogs.
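To make the pretraining correspondence concrete, here is a minimal sketch of the mechanics in Python: a toy one-parameter model is first trained on a narrow, simple domain, then the same model keeps training on a broader one. All numbers are illustrative; this shows the staging, not a demonstration that the curriculum helps.

```python
def sgd_fit(data, w=0.0, lr=0.1, epochs=50):
    """Fit a one-parameter model y ≈ w * x by plain SGD on squared error."""
    for _ in range(epochs):
        for x, y in data:
            w -= lr * (w * x - y) * x  # gradient step on (w*x - y)^2 / 2
    return w

# Stage 1 (pretraining): a simple, narrow domain of small inputs.
simple = [(x / 10, 3 * x / 10) for x in range(1, 5)]
# Stage 2: keep training the *same* model on a broader domain.
broad = simple + [(float(x), 3.0 * x) for x in range(1, 5)]

w = sgd_fit(simple)                 # pretrain from scratch
w = sgd_fit(broad, w=w, lr=0.01)   # continue on the broader domain
```

Note that the learning rate is lowered as the domain widens, which is exactly the zone-of-proximal-development point above: the step size has to match the size of the model relative to the size of the domain, or training diverges.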
Not exactly parenting, but performance evaluation has analogs too, and Goodharting is common. Many of my counter-measures against Goodharting are informed by ML:
High dimensionality: many features allow you to model the target closely, but can lead to overfitting; thus:
Dropout: you let people know that a few randomly chosen features will be dropped from the evaluation, so it doesn’t make sense to concentrate all your optimization on a few targets.
Altering the performance-evaluation regime each year, while still aiming at roughly the same thing.
Noise: leave the exact weights used unclear.
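The dropout and noise counter-measures can be sketched as a toy evaluation function. The feature names, weights, and numbers below are all made up for illustration:

```python
import random

def evaluate(scores, weights, drop_k=1, noise=0.1):
    """Score feature ratings against target weights, with two anti-Goodhart
    tweaks: a few randomly chosen features are dropped from this round's
    evaluation, and the exact weights are jittered so they stay unclear."""
    kept = list(scores)
    for f in random.sample(kept, k=drop_k):
        kept.remove(f)  # dropout: this feature doesn't count this round
    total = 0.0
    for f in kept:
        w = weights[f] * (1 + random.uniform(-noise, noise))  # noisy weight
        total += w * scores[f]
    return total / len(kept)

random.seed(0)
target = {"quality": 1.0, "speed": 1.0, "teamwork": 1.0}
balanced = {"quality": 1.0, "speed": 1.0, "teamwork": 1.0}
gamer = {"quality": 3.0, "speed": 0.0, "teamwork": 0.0}  # optimizes one target

b = [evaluate(balanced, target) for _ in range(200)]
g = [evaluate(gamer, target) for _ in range(200)]
```

In this toy setup the two strategies tie in expectation, but the gamer’s score collapses to zero whenever their pet feature happens to be dropped, while the balanced score stays stable; concentrating all optimization on a few targets becomes a gamble rather than a safe exploit.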
Thanks for writing it up! I don’t know if I buy the human-caregiver model, as the OP said above, but I do like this way of thinking about it. The zone-of-proximal-development point is especially interesting, and for some reason I hadn’t thought about performance-evaluation analogies before, even though the correspondence is quite clear. Much food for thought.
I keep saying that parenting is a useful source of inspiration and insight into ML training and alignment methods, but so far few people have seemed to believe me. Happy to hear that you are interested; I will write up some correspondences.
I agree that it may be a useful source of insight, since it may suggest learning techniques like this one, but I find it unlikely that this will end up involving giving the AI a “human caregiver”.
Maybe it is not the most likely scenario, but a lot of mediocre AIs trained “on the job” in a closed loop with humans who are not just overseers but also provide a lot of real-world context doesn’t seem so unlikely in a Robin Hanson-style slow takeoff.