I wonder, since it’s important to stay pragmatic, if it would be good to design a “toy example” for this sort of ethics.
It seems like the hard problem here is to infer reasons for action from an individual’s actions. People do all sorts of things; but how can you tell from those choices what they really value? Can you infer a utility function from people’s choices, or are there sets of choices that aren’t consistent with any utility function?
The sorts of “toy” examples I’m thinking of here are situations where the agent has a finite number of choices. Let’s say you have Pac-Man in a maze. His choices are his movements in four cardinal directions. You watch Pac-Man play many games; you see what he does when he’s attacked by a ghost; you see what he does when he can find something tasty to eat; you see when he’s willing to risk the danger to get the food.
From this, I imagine you could do some hidden-Markov-style inference to learn a model of Pac-Man’s behavior—perhaps an if-then tree.
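To make that concrete, here’s the kind of minimal sketch I have in mind, with an off-the-shelf decision-tree learner standing in for anything fancier; the state features and logged games are invented purely for illustration:

```python
# Sketch only: fit an if-then tree from hypothetical logs of Pac-Man's play.
# Each row is a few hand-chosen state features; each label is the move he made.
from sklearn.tree import DecisionTreeClassifier, export_text

feature_names = ["ghost_dist", "ghost_is_left", "fruit_is_left"]
X = [
    [2, 1, 1],   # ghost close on his left, fruit also to the left
    [2, 0, 1],   # ghost close on his right, fruit to the left
    [1, 0, 1],   # ghost very close on his right, fruit to the left
    [9, 1, 1],   # no ghost nearby, fruit to the left
    [9, 0, 0],   # no ghost nearby, fruit to the right
    [8, 1, 0],   # no ghost nearby, fruit to the right
]
y = ["right", "left", "left", "left", "right", "right"]   # the moves he actually made

# Fit a shallow tree and print its if-then rules.
policy_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(policy_tree, feature_names=feature_names))
```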
Could you guess from this tree that Pac-Man likes fruit and dislikes dying, and goes away from fruit only when he needs to avoid dying? Yeah, you could (though I don’t know how to systematize that more broadly).
From this, could you build an “extrapolated” model of what Pac-Man would do if he knew when and where the ghosts were coming? Sure—and that would be, if I’ve understood correctly, CEV (coherent extrapolated volition) for Pac-Man.
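As a toy illustration (everything here is invented for the example): suppose the tree we read off his play boils down to “flee any ghost within three tiles, otherwise head toward the fruit”. The extrapolated Pac-Man is the same rule, just evaluated with better information about the ghosts than he actually had:

```python
# Sketch only: the if-then rule inferred from watching Pac-Man, applied with
# better information than he actually had. All numbers are made up.

def opposite(direction):
    return {"up": "down", "down": "up", "left": "right", "right": "left"}[direction]

def inferred_policy(ghost_dist, ghost_dir, fruit_dir):
    """The rule we read off his play: flee close ghosts, otherwise chase fruit."""
    if ghost_dist <= 3:
        return opposite(ghost_dir)
    return fruit_dir

# What he actually did: he had no way to know a ghost was about to round the
# corner on his left, so as far as he could tell the ghost was far away.
print(inferred_policy(ghost_dist=9, ghost_dir="left", fruit_dir="left"))   # -> left

# The "extrapolated" Pac-Man: same values, same rule, fed the true ghost position.
print(inferred_policy(ghost_dist=2, ghost_dir="left", fruit_dir="left"))   # -> right
```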
It seems to me that, more subtle philosophy aside, this is what we’re trying to do. I haven’t read the literature lukeprog has, but it seems to me that Pac-Man’s “reasons for action” are completely described by that if-then tree of his behavior. Why didn’t he go left that time? Because there was a ghost there. Why does that matter? Because Pac-Man always goes away from ghosts. (You could say: Pac-Man desires to avoid ghosts.)
It also seems to me, not that I really know this line of work, that one incremental thing that can be done towards CEV (or some other sort of practical metaethics) is this kind of toy model. Yes, ultimately understanding human motivation is a huge psychology and neuroscience problem, but before we can assimilate those quantities of data we may want to make sure we know what to do in the simple cases.
Could you guess from this tree that Pac-Man likes fruit and dislikes dying, and goes away from fruit only when he needs to avoid dying? Yeah, you could (though I don’t know how to systematize that more broadly).
Something like:
Run simulations of agents that choose randomly from the same actions the agent has available. Look for regularities in the world state that occur more or less frequently for the sensible agent than for the random agent. Those could be said to be what it likes and dislikes, respectively (see the sketch below).
To distinguish terminal from instrumental values, look at the decision tree and see which of the states gets chosen when a choice between them is forced.
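A minimal sketch of the first suggestion, with invented predicate names and made-up logs, just to show the shape of the comparison:

```python
# Sketch only: compare how often simple world-state predicates hold for the
# observed agent vs. a random agent run in the same maze. Predicates that are
# much more (or less) common for the observed agent are candidate likes (dislikes).
from collections import Counter

def predicate_frequencies(trajectories):
    """Fraction of logged time steps on which each boolean predicate holds."""
    counts, total = Counter(), 0
    for trajectory in trajectories:
        for state in trajectory:          # state: dict of predicate -> bool
            total += 1
            for name, holds in state.items():
                counts[name] += holds
    return {name: counts[name] / total for name in counts}

def compare(observed_trajs, random_trajs, threshold=0.1):
    observed = predicate_frequencies(observed_trajs)
    baseline = predicate_frequencies(random_trajs)
    for name, freq in observed.items():
        diff = freq - baseline.get(name, 0.0)
        if diff > threshold:
            print(f"likes:    {name} (+{diff:.2f})")
        elif diff < -threshold:
            print(f"dislikes: {name} ({diff:.2f})")

# Made-up logs: one trajectory each, with per-step predicates.
observed_trajs = [[{"near_fruit": True,  "near_ghost": False}] * 80 +
                  [{"near_fruit": False, "near_ghost": True}]  * 20]
random_trajs   = [[{"near_fruit": True,  "near_ghost": False}] * 50 +
                  [{"near_fruit": False, "near_ghost": True}]  * 50]
compare(observed_trajs, random_trajs)   # -> likes near_fruit, dislikes near_ghost
```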
Thanks. Come to think of it, that’s exactly the right answer.
Perhaps the next step would be to add to the model a notion of second-order desire, or to analyze a Pac-Man whose apparent terminal values can change when he’s exposed to certain experiences or moral arguments.