You seem to think that I’m claiming that UDT’s notion of utility function is the only way real-world goals might be implemented in an AGI. I’m instead suggesting that it is one way to do so. It currently seems to be the most promising approach for FAI, but I certainly wouldn’t say that only AIs using UDT can be said to have real-world goals.
Then your having formalized your utility function has nothing to do with the allegations of vagueness when it comes to defining the utility in the argument for how utility maximizers are dangerous. With regards to it being ‘the most promising approach’, I think it is a very, very silly idea to have an approach so general that we all may well end up sacrificed in the name of a huge number of imaginary beings that might exist, with an AI Pascal-wagering itself on its own. It looks like a dead end, especially for friendliness.
At this point I’m wondering if Nick’s complaint of vagueness was about this more general usage of “goals”. It’s unclear from reading his comment, but in case it is, I can try to offer a definition: an AI can be said to have real-world goals if it tries (and generally succeeds at) modeling its environment and chooses actions based on their predicted effects on that environment.
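As a rough sketch of what I mean (the `world_model.predict` and `evaluate_outcome` pieces below are hypothetical placeholders, not any particular AGI design), such an AI would do something like this:

```python
def choose_action(world_model, candidate_actions, evaluate_outcome):
    """Pick the action whose predicted effect on the environment scores best."""
    best_action, best_score = None, float("-inf")
    for action in candidate_actions:
        predicted_state = world_model.predict(action)  # model the environment's response
        score = evaluate_outcome(predicted_state)      # judge the predicted effect
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

The point is only that the choice of action is driven by predicted effects on the modeled environment, not by any particular utility formalism.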
This doesn’t necessarily work like ‘I want most paperclips to exist, therefore I will talk my way into controlling the world, then kill everyone and make paperclips’, though.
Goals in this sense seem to be something that AGI researchers actively pursue, presumably because they think it will make their AGIs more useful or powerful or intelligent. If you read Goertzel’s papers, he certainly talks about “goals”, “perceptions”, “actions”, “movement commands”, etc.
They also don’t try to make goals that couldn’t be outsmarted into nihilism. We humans sort-of have a goal of reproduction, except we’re too clever, and we use birth control.
In your UDT, the actual intelligent component is the mathematical intuition that you’d use to process the theory in reasonable time. The rest is optional and highly difficult (if not altogether impossible) icing, even for the most trivial goal such as paperclips, and it may well never work even in principle.
And the technologies employed in that intelligent component are, without any of those goals, and with a much smaller intelligence requirement (in terms of computing power and optimality), sufficient for e.g. designing machinery for mind uploading.
Furthermore, and this is the most ridiculous thing, there is this ‘oracle AI’ being talked about, where an answering system is modelled as being based on real-world goals and real-world utilities, as if those were somehow primal and universally applicable.
It seems to me that the goals and utilities are just a useful rhetorical device used to trigger the anthropomorphization fallacy at will (in a selective way), so as to solicit donations.
They also don’t try to make goals that couldn’t be outsmarted into nihilism.
They’re not explicitly trying to solve this problem because they don’t think it’s going to be a problem with their current approach of implementing goals. But suppose you’re right and they’re wrong, and somebody who wants to build an AGI ends up implementing a motivational system that outsmarts itself into nihilism. Well such an AGI isn’t very useful, so wouldn’t they just keep trying until they stumble onto a motivational system that isn’t so prone to nihilism?
We humans sort-of have a goal of reproduction, except we’re too clever, and we use birth control.
Similarly, if we let evolution of humans continue, wouldn’t humans pretty soon have a motivational system for reproduction that we won’t want to cleverly work around?
They’re not explicitly trying to solve this problem because they don’t think it’s going to be a problem with their current approach of implementing goals.
They do not expect foom either.
Well such an AGI isn’t very useful
You can still have formally defined goals—satisfy conditions on equations, et cetera. Defined internally, without the problematic real world component. Use this for e.g. designing reliable cellular machinery (‘cure cancer and senescence’). Seems very useful to me.
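To illustrate what I mean by an internally defined goal, here is a toy sketch (the bisection solver and the example equation are just illustrative choices on my part): the success condition is checked entirely inside the program, with no reference to the outside world.

```python
# Toy illustration of an internally defined goal: "find x with |f(x)| <= tol".
# The condition is a purely formal statement about an equation, checked inside
# the program; nothing here refers to the real world.

def solve(f, lo, hi, tol=1e-9):
    """Bisection: assumes f(lo) and f(hi) have opposite signs."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# The goal is met purely by the internal check, e.g. solving x**3 = 2:
x = solve(lambda x: x**3 - 2, 0.0, 2.0)
```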
so wouldn’t they just keep trying until they stumble onto a motivational system that isn’t so prone to nihilism?
How long would it take you to ‘stumble’ upon some goal for the UDT that translates to something actually real?
Similarly, if we let evolution of humans continue, wouldn’t humans pretty soon have a motivational system for reproduction that we won’t want to cleverly work around?
Evolution destructively tests designs against reality. Humans do have various motivational systems there, such as religion, btw.
I am not sure how you think a motivational system for reproduction could work such that we would not embrace a solution that does not actually result in reproduction (given sufficient intelligence).
You can still have formally defined goals—satisfy conditions on equations, et cetera.
As I mentioned, there are AGI researchers trying to implement real-world goals right now. If they build an AGI that turns nihilistic, do you think they will just give up and start working on equation solvers instead, or try to “fix” their AGI?
How long would it take you to ‘stumble’ upon some goal for the UDT that translates to something actually real?
I guess probably not very long, if I had a working solution to “math intuition”, a sufficiently powerful computer to experiment with, and no concerns for safety...
They do not expect foom either.
Goertzel does, or at least thinks it’s possible. See http://lesswrong.com/lw/aw7/muehlhausergoertzel_dialogue_part_1/ where he says “GOLEM is a design for a strongly self-modifying superintelligent AI system”. Also http://novamente.net/AAAI04.pdf where he talks about Novamente potentially being “thoroughly self-modifying and self-improving general intelligence”.