Okay; but the examples you gave seem to me to be more similar to compulsions than to utility functions. A person can care a lot about cars, and cars can become a major part of human society, but they’re not the point of human society; if they stop serving their purposes, they’ll go the way of the horse and buggy. I’m not sure I can express the meaning I’m trying to convey cleanly using that terminology, so maybe I ought to restart.
My model of davidad’s view is that part of general intelligence, as opposed to narrow intelligence, is having varied and complex goals. We could make a narrow AI which only cared about the number of paperclips in the universe, but in order to make an intelligence that’s general, we need to make it also care about the future, planning, existential risk, and so on.
And so you might get a vibrant interstellar civilization of synthetic intelligences, one that happens to worship paperclips and uses them for currency and religious purposes, rather than a dead world with nothing but peculiarly bent metal.
but the examples you gave seem to me to be more similar to compulsions than to utility functions
I would have liked to use examples of plugging clearly terminal values into a general goal-achieving system. But the only current or historical general goal-achieving systems are humans, and it is notoriously difficult to figure out what humans’ terminal values are.
My model of davidad’s view is that part of general intelligence, as opposed to narrow intelligence, is having varied and complex goals. We could make a narrow AI which only cared about the number of paperclips in the universe, but in order to make an intelligence that’s general, we need to make it also care about the future, planning, existential risk, and so on.
I am not claiming that you could give an AGI an arbitrary goal system that suppresses the “Basic AI Drives”, but rather that those drives will be effective instrumental values, not lost purposes. While a paperclip-maximizing AGI will have subgoals such as controlling resources and improving its ability to predict the future, achieving those goals will help it to actually produce paperclips.
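To make that concrete, here is a minimal toy sketch in Python (purely illustrative; the actions and their numeric effects are made up, not anyone’s actual design): an agent whose only terminal value is the paperclip count, and which picks up “drives” like acquiring resources or improving its world-model only when its own forecast says they lead to more paperclips downstream.

```python
# Toy illustration: paperclips are the only terminal value; resource
# acquisition and model improvement are pursued only instrumentally.

from dataclasses import dataclass

@dataclass
class State:
    paperclips: int
    resources: int        # raw material the agent controls
    model_quality: float  # crude proxy for predictive ability, 0.0 to 1.0

def terminal_value(state: State) -> float:
    """The only thing valued for its own sake: paperclips."""
    return float(state.paperclips)

def apply(action: str, s: State) -> State:
    """Hypothetical action effects; the numbers are illustrative only."""
    if action == "make_paperclips":
        made = int(s.resources * (0.5 + 0.5 * s.model_quality))
        return State(s.paperclips + made, s.resources - made, s.model_quality)
    if action == "acquire_resources":
        return State(s.paperclips, s.resources + 10, s.model_quality)
    if action == "improve_model":
        return State(s.paperclips, s.resources, min(1.0, s.model_quality + 0.2))
    return s

def best_plan(s: State, horizon: int) -> tuple[float, list[str]]:
    """Exhaustive search over short plans, ranked purely by final paperclip count."""
    if horizon == 0:
        return terminal_value(s), []
    options = []
    for a in ("make_paperclips", "acquire_resources", "improve_model"):
        value, rest = best_plan(apply(a, s), horizon - 1)
        options.append((value, [a] + rest))
    return max(options, key=lambda t: t[0])

# The instrumental actions appear in the chosen plan only when they raise
# the final paperclip count: instrumental, never terminal.
print(best_plan(State(paperclips=0, resources=5, model_quality=0.2), horizon=4))
```

In this sketch the “drives” are downstream of the terminal goal: they only ever show up in the chosen plan because, under the agent’s own model, they increase the final paperclip count.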
I am not claiming that you could give an AGI an arbitrary goal system that suppresses the “Basic AI Drives”, but rather that those drives will be effective instrumental values, not lost purposes. While a paperclip-maximizing AGI will have subgoals such as controlling resources and improving its ability to predict the future, achieving those goals will help it to actually produce paperclips.
It sounds like we agree: paperclips could be a genuine terminal value for AGIs, but a dead future doesn’t seem all that likely from AGIs (though it might be likely from AIs in general).
a dead future doesn’t seem all that likely from AGIs
What? A paperclip AGI with a first-mover advantage would self-improve beyond the point where cooperating with humans has any instrumental value, become a singleton, and tile the universe with paperclips.
What? A paperclip AGI with a first-mover advantage would self-improve beyond the point where cooperating with humans has any instrumental value, become a singleton, and tile the universe with paperclips.
Oh, I agree that humans die in such a scenario, but I don’t think the ‘tile the universe’ part counts as “dead” if the AGI has AI drives.