I’m still wondering how you’d calculate a CEV. I’m still wondering how you’d calculate one human’s volition. Hands up all those who know their own utility function. … OK, how do you know you’ve got it right?
I don’t think anyone here would have raised their hand at the first prompt, with the exceptions of Tim Tyler and Clippy.
Omohundro’s rationality in a nutshell reads:
1. Have clearly specified goals.
2. In any situation, identify the possible actions.
3. For each action, consider the possible consequences.
4. Take the action most likely to meet the goals.
5. Update the world model based on what actually happens.
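Read literally, that list is the standard expected-utility agent loop. A minimal sketch in Python, purely illustrative—the goal, world model, and actions below are toy stand-ins, not anything from the discussion:

```python
import random

# Toy expected-utility loop following the five steps above.
# All names and numbers are illustrative stand-ins.

def utility(outcome):
    # Step 1: a clearly specified goal -- here, "bigger outcomes are better".
    return outcome

class WorldModel:
    """Step 2's action set plus step 3's beliefs about consequences."""
    def __init__(self):
        # Each action maps to a guessed list of (probability, outcome) pairs.
        self.beliefs = {"safe": [(1.0, 1.0)], "risky": [(0.5, 0.0), (0.5, 3.0)]}

    def actions(self):
        return list(self.beliefs)

    def expected_utility(self, action):
        return sum(p * utility(o) for p, o in self.beliefs[action])

    def update(self, action, observed):
        # Step 5: a crude update -- shift beliefs toward the observed outcome.
        self.beliefs[action] = [(p, 0.5 * o + 0.5 * observed)
                                for p, o in self.beliefs[action]]

def environment(action):
    # What actually happens when the action is taken.
    return {"safe": 1.0, "risky": random.choice([0.0, 3.0])}[action]

model = WorldModel()
for _ in range(5):
    best = max(model.actions(), key=model.expected_utility)   # Step 4
    observed = environment(best)
    model.update(best, observed)
    print(best, observed)
```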
Step 1 seems to be a fairly basic and fundamental one to me.
If you don’t know what you are trying to do, it is not easy to know whether you are succeeding at it.
I think rational agents should try to figure out what they want. “It’s complicated” is a kind of answer—but not a very practical or useful one.
I suspect failures at step 1 are mostly to do with signalling. For example, Tom Lehrer once spoke of “a man whose allegiance is ruled by expedience”. If you publish your goals, that limits your ability to signal motives that are in harmony with those of your audience—reducing your options for deceptive signalling.
I doubt this very strongly, and can say with an extremely high level of confidence that I fail at step 1 even when I don’t have to tell anyone my answer. What humans mean by “clearly specified” is completely different from what is required to write down some program whose output you would like an omnipotent agent to maximize. There are other reasons to argue that CEV is not a useful concept, but this is really a bad one.
FWIW, I wasn’t talking about CEV or superintelligent agents. I was just talking about the task of figuring out what your own goals were.
We can’t really coherently discuss in detail the difficulties of programming goals into superintelligent agents until we know how to build them. Programming one agent’s goals into a different agent looks challenging. Some devotees attempt to fulfill their guru’s desires—but that is a trickier problem than fulfilling their own desires—since they don’t get direct feedback from the guru’s senses. Anyway, these are all complications that I did not even pretend to be going into.
What do you actually mean when you say you “fail at step 1”? You have no idea what your own goals are?!? Or just that your knowledge of your own goals is somewhat incomplete?
I wasn’t talking about CEV or superintelligent agents either. I mean that I have no idea how to write down my own goals. I am nowhere close to having clearly specified goals for myself, in the sense that I as a mathematician usually mean “clearly specified”. The fact that I can’t describe my goals well enough that I could tell them to someone else and trust them to do what I want done is just one indication that my own conception of my goals is significantly incomplete.
OK. You do sound as though you don’t have very clearly-defined goals—though maybe there is some evasive word-play around the issue of what counts as a “clear” specification. Having goals is not rocket science! In any case, IMO, you would be well advised to start at number 1 on the above list.
How can you possibly get what you want if you don’t know what it is? It doesn’t matter whether you are looking to acquire wealth, enjoy excellent relationships, become more spiritual, etc. To get anywhere in life, you need to know “exactly” what you want.
http://www.best-self-help-sites.com/goal-setting.html
The simplest known complete representation of my utility function is my brain, combined with some supporting infrastructure and a question-asking procedure. Any utility function that is substantially simpler than my brain has almost certainly left something out.
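If that is right, then the most honest “implementation” of the utility function is not a formula but an oracle call into a brain model. A toy sketch of the interface being claimed, with the obviously missing piece left unimplemented (all names here are hypothetical):

```python
class BrainModel:
    """Stand-in for a complete, runnable model of one particular person's
    brain. Nothing remotely like this exists today; that gap is the point."""
    def endorsement(self, described_outcome: str) -> float:
        raise NotImplementedError("requires a full brain simulation")

def utility(outcome: str, brain: BrainModel) -> float:
    # The "question-asking procedure": put the outcome to the simulated
    # person and read off how strongly they endorse it. Any closed-form
    # utility function much simpler than the brain model is presumed lossy.
    return brain.endorsement("How much do you want this outcome? " + outcome)
```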
The first step in extracting a human’s volition is to develop a complete understanding of the brain, including the ability to simulate it. We are currently stuck there. We have some high-level approximations that work around needing this understanding, but their accuracy is questionable.
Assume you have a working simulation. What next?
The implication is that you let the FAI do it. (The task of building the FAI so that it will do this poses somewhat different challenges.)
Yes, but saying “we get the FAI to do it” is just moving the hard bit.
The intelligence to calculate the CEV needs to be pre-FOOM. We have general intelligences of pre-FOOM level already.
So: what would a pre-FOOM general intelligence actually do?
No, a complete question to which CEV is an answer needs to be pre-FOOM. All an AI needs to know about morality before it is superintelligent is (1) how to arrive at a CEV-answer by looking at things and doing calculations and (2) how to look at things without breaking them and do calculations without breaking everything else.
Ah, OK. So do we have any leads on how to ask the question?
I believe the idea is to have the pre-FOOM AI commit to doing the calculation first thing post-FOOM.
Eliezer beat me to the punch. My answer was approximately the same.
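Putting those last few comments together: what must exist pre-FOOM is a fully specified procedure rather than the answer itself, plus a commitment to run that procedure before optimizing anything else. An illustrative skeleton of that structure, not from the thread, where the unimplemented functions are exactly the open problems:

```python
def cev_procedure(observations):
    """(1) The part that must be pinned down pre-FOOM: a complete recipe for
    turning observations of humans into a CEV answer. Specifying this is the
    open problem; the post-FOOM AI only has to execute it."""
    raise NotImplementedError

def observe_without_breaking_things():
    """(2) Gather the data the recipe needs without damaging what is being
    observed, and without the computation itself causing harm."""
    raise NotImplementedError

def post_foom_first_action():
    # The pre-FOOM commitment: compute the answer first, optimize for it after.
    return cev_procedure(observe_without_breaking_things())
```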