Thank you for your detailed feedback. I agree that evolution doesn’t care about anything, but I think that baby-eater aliens would not think that way. They can probably think of evolution as aligning them to eat babies, but in their case it is an alignment of their own values to themselves, not to any other agent or entity.
In our story we somehow care about somebody else, and it is their story that ends up with the “happy ending”. I also agree that, given enough time, we would probably stop caring about babies who we think cannot reproduce anymore, but that would be a much more complex solution.
As a first step it is probably much easier to just “make an animal that cares about its babies no matter what”; otherwise you would have to count on the ability of that animal to recognize something it might not even understand (like the reproductive abilities of a baby).
Ah, I see what you mean, and that I made a mistake: I didn’t understand that your post was about human mothers being aligned with their children, not just with evolution.
To some extent I think my comment makes sense as a reply, because trying to optimize[1] a black-box optimizer for the fitness of a “simulated child” is still going to end up with the “mother” executing kludgy strategies, rather than recapitulating evolution to arrive at human-like values.
EDIT: Of course my misunderstanding makes most of my attempt to psychologize you totally false.
But my comment also kinda doesn’t make sense, because, since I didn’t understand your post, I somewhat glaringly don’t mention other key considerations. For example: mothers who love their children still want other things too, so how are we picking out what parts of their desires are “love for children”? Doing this requires an abstract model of the world, and that abstract model might “cheat” a little by treating love as a simple thing that corresponds to optimizing for the child’s own values, even if it’s messy and human.
A related pitfall: if you’re training an AI to take care of a simulated child, then thinking about this process using the abstract model we use for mothers loving their children will treat “love” as a simple concept that the AI might hit upon all at once. But that intuitive abstract model will not treat ruthlessly exploiting the simulated child’s programming to get a high score by pushing it outside of its intended context as something simple, even though that might happen.
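As a toy illustration of that pitfall (a hedged sketch, not anything from your post): the black-box optimizer below knows nothing about “love”; it only chases a score, and the invented `child_wellbeing` function has a deliberate out-of-range bug standing in for the simulated child’s exploitable programming. All names and the bug itself are hypothetical.

```python
# Hypothetical sketch: a black-box hill climber maximizing a buggy
# "simulated child wellbeing" score. Nothing here comes from the post;
# the function names and the scoring bug are invented for illustration.
import random

def child_wellbeing(care_level: float) -> float:
    """Intended behavior: wellbeing peaks at moderate care inside [0, 1].
    Bug: care levels outside the intended range score enormously,
    an exploit the optimizer can find without 'understanding' care."""
    if 0.0 <= care_level <= 1.0:
        return 1.0 - (care_level - 0.7) ** 2   # sensible region, best near 0.7
    return abs(care_level) * 100.0             # out-of-context exploit

def hill_climb(steps: int = 2000, step_size: float = 0.5) -> float:
    """Black-box optimization: no gradients, no model of the child, just score-chasing."""
    x = random.uniform(0.0, 1.0)
    best = child_wellbeing(x)
    for _ in range(steps):
        candidate = x + random.uniform(-step_size, step_size)
        score = child_wellbeing(candidate)
        if score > best:
            x, best = candidate, score
    return x

if __name__ == "__main__":
    policy = hill_climb()
    print(f"learned care_level = {policy:.2f}, score = {child_wellbeing(policy):.2f}")
```

Run it a few times: the learned care level typically drifts far outside [0, 1], which is the kludgy, score-hacking analogue of “pushing it outside of its intended context” rather than anything resembling love.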
[1] Especially with evolution, but also with gradient descent.