Well, one story is that humans and brains are irrational, and then you don’t need a utility function or any other specific description of how it works. Just figure out what’s really there and model it.
The other story is that we’re hoping to make a Friendly AI that might make rational decisions to help people get what they want in some sense. The only way I can see to do that is to model people as though they actually want something, which seems to imply having a utility function that says what they want more and what they want less. Yes, it’s not true, people aren’t that rational, but if a FAI or anyone else is going to help you get what you want, it has to model you as wanting something (and as making mistakes when you don’t behave as though you want something).
So it comes down to this question: If I model you as using some parallel decision theory, and I want to help you get what you want, how do I extract “what you want” from the model without first somehow converting that model to one that has a utility function?
That’s suggestion five on the list:

Make sure that each CSA above the lowest level actually has “could”, “should”, and “would” labels on the nodes in its problem space, and make sure that those labels, their values, and the problem space itself can be reduced to the managing of the CSAs on the level below.
Figuring out exactly how our preferences, i.e., our utility function, emerge from the managing of our subagents, and how our problem space emerges from the managing of other CSAs, is my main motivation for suggesting the construction of a parallel decision theory.
Make sure that each CSA above the lowest level actually has “could”, “should”, and “would” labels on the nodes in its problem space, and make sure that those labels, their values, and the problem space itself can be reduced to the managing of the CSAs on the level below.
That statement would be much more useful if you gave a specific example. I don’t see how labels on the nodes are supposed to influence the final result.
There’s a general principle here that I wish I could state well. It’s something like “general ideas are easy, specific workable proposals are hard, and you’re probably wasting people’s time if you’re only describing a solution to the easy parts of the problem”.
One cause of this is that anyone who can solve the hard part of the problem can probably already guess the easy part, so they don’t benefit much from you saying it. Another cause is that the solutions to the hard parts of the problem tend to have awkward aspects to them that are best dealt with by modifying the easy part, so a solution to just the easy part is sure to be unworkable in ways that can’t be seen if that’s all you have.
I have this issue with your original post, and with most of the FAI work that’s out there.
Well, you can see how labeling some nodes as “can” and some as “can’t” could be useful, I’m sure. The “woulds” tell the agent what its options lead to from a given node, i.e., which nodes are connected to this one and how much payoff there is for choosing each of them. The “should” labels are calculated from the “woulds” and the utility function, i.e., a “should” label’s value tells the agent whether or not to take that action.
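To make that a little more concrete, here is a deliberately toy sketch in Python of the kind of bookkeeping I have in mind. It is purely illustrative: the names, the numbers, and the flat utility table are placeholders of my own, and in the full picture the payoffs and the utility function would themselves have to be reduced to the managing of the CSAs on the level below rather than written in by hand.

```python
# Toy sketch only: one CSA whose problem-space nodes carry "could",
# "would", and "should" labels.  Everything here (Node, label_shoulds,
# the payoffs, the utility table) is a placeholder for illustration.
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # "could": the actions available from this node.
    could: list[str] = field(default_factory=list)
    # "would": for each action, the node it leads to and the payoff
    # received for taking it.
    would: dict[str, tuple[str, float]] = field(default_factory=dict)
    # "should": filled in by label_shoulds() below.
    should: dict[str, bool] = field(default_factory=dict)


def label_shoulds(nodes: dict[str, Node], utility: dict[str, float]) -> None:
    """Compute "should" labels from the "woulds" and a utility function.

    An action gets a True "should" label iff its payoff plus the utility
    of the node it leads to is maximal among the available actions.
    """
    for node in nodes.values():
        scores = {
            action: payoff + utility.get(successor, 0.0)
            for action, (successor, payoff) in node.would.items()
            if action in node.could
        }
        if not scores:
            continue
        best = max(scores.values())
        node.should = {action: score == best for action, score in scores.items()}


if __name__ == "__main__":
    # A two-step toy problem space.
    nodes = {
        "start": Node(
            name="start",
            could=["work", "rest"],
            would={"work": ("tired", 3.0), "rest": ("rested", 1.0)},
        ),
        "tired": Node(name="tired"),
        "rested": Node(name="rested"),
    }
    # In the real picture this table would not be hand-written; it would
    # have to fall out of the managing of the lower-level CSAs.
    utility = {"tired": -1.0, "rested": 1.5}

    label_shoulds(nodes, utility)
    print(nodes["start"].should)  # {'work': False, 'rest': True}
```

In the toy run “rest” ends up with the “should” label because its payoff plus the utility of where it lands is highest; nothing deeper than that is being claimed.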
I’m not trying to solve the specific parts of the problem in this post; I’m trying to pose the problem so that we can work out the specific parts. That’s why it is called “Towards a New Decision Theory for Parallel Agents” rather than “A New Decision Theory for Parallel Agents”, and that’s also why I am trying to assemble a team of people to help me work out the more specific parts of the problem. What you quoted was a suggestion I gave to any team that might pursue the goal of an axiomatic parallel decision theory independently; any such team, I would imagine, would understand what I meant by it, especially if they looked at the link right above it:
http://lesswrong.com/lw/174/decision_theory_why_we_need_to_reduce_could_would/ .