It gets interesting when the pebblesorters turn on a correctly functioning FAI, which starts telling them that they should build a pile of 108301, and legislative bodies spend the next decade debating whether or not it is in fact a correct pile. “How does this AI know better anyway? That looks new and strange.” “That doesn’t sound correct to me at all. You’d have to be crazy to build 108301. It’s so different from 2029! It’s a slippery slope to 256!” And so on.
This really is a fantastic parable—it shows off perhaps a dozen different aspects of the forest we were missing for the trees.
When I read this parable, I was already looking for a reason why Friendly AI necessarily meant “friendly to human interests or with respect to human moral systems”. Hence, my conclusion from this parable was that Eliezer was trying to show how, from the perspective of an AGI, human goals and ambitions are little more than attempts to find a good way to pile up our pebbles. It probably doesn’t matter that the pattern we’re currently on to is “bigger and bigger piles of primes”, since pebble-sorting isn’t at all certain to be the right mountain to climb. An FAI might be able to convince us that 108301 is a good pile from within our own paradigm, but how can it ever convince us that we have the wrong paradigm altogether, especially if that appears counter to our own interests?
What if Eliezer were to suddenly find himself alone among Neanderthals? Knowing, with his advanced knowledge and intelligence, that Neanderthals were doomed to extinction, would he be immoral or unfriendly to continue to devote his efforts to developing greater and greater intelligences, instead of trying to find a way to sustain the Neanderthal paradigm for its own sake? Similarly, why should we try to restrain future AGI so that it maintains the human paradigm?
The obvious answer is that we want to stay alive, and we don’t want our atoms used for other things. But why does it matter what we want, if we aren’t ever able to know if what we want is correct for the universe at large? What if our only purpose is to simply enable the next stage of intelligence, then to disappear into the past? It seems more rational to me to abandon the specific focus on FAI and just build AGI as quickly as possible, before humanity destroys itself.
Isn’t the true mark of rationality the ability to reach a correct conclusion even if you don’t like the answer?
But why does it matter what we want, if we aren’t ever able to know if what we want is correct for the universe at large?
There is no sense in which what we want may be correct or incorrect for the universe at large, because the universe does not care. Caring is a thing that minds do, and the universe is not a mind.
What if our only purpose is to simply enable the next stage of intelligence, then to disappear into the past?
Our purpose is whatever we choose it to be; purposes are goals seen from another angle. There is no source of purposefulness outside the universe. My goals require that humans stick around, so our purpose with respect to my goal system does not involve disappearing into the past. I think most people’s goal systems are similar.
There is no sense in which what we want may be correct or incorrect for the universe at large, because the universe does not care. Caring is a thing that minds do, and the universe is not a mind.
Yes, I agree, and I realize that that isn’t what I was actually trying to say. What I meant was, there is a set of possible, superlatively rational intelligences that may make better use of the universe than humanity (or humanity + a constrained FAI). If Omega reveals to you that such an intelligence would come about if you implement AGI with no Friendly constraint, at the cost of the extinction of humanity, would you build it? This to me drives directly to the heart of whether you value rationality over existence. You don’t personally ‘win’, humanity doesn’t ‘win’, but rationality is maximized.
My goals require that humans stick around, so our purpose with respect to my goal system does not involve disappearing into the past. I think most people’s goal systems are similar.
I think we need to unpack that a little, because I don’t think you mean “humans stick around more or less unchanged from their current state”. This is what I was trying to drive at about the Neanderthals. In some sense we ARE Neanderthals, slightly farther along an evolutionary timescale, but you wouldn’t likely feel any moral qualms about their extinction.
So if you do expect that humanity will continue to evolve, probably into something unrecognizable to 21st century humans, in what sense does humanity actually “stick around”? Do you mean that you, personally, want to maintain your own conscious self indefinitely, so that no matter what the future holds, “you” will in some sense be part of it? Or do you mean “whatever intelligent life exists in the future, its ancestry is strictly human”?
‘Better’ by what standard?

“Better” is defined by us. This is the point of the metaethics sequence! A universe tiled with paperclips is not better than what we have now. Rationality is not something one values; it’s something one uses to get what one values.
You seem to be imagining FAI as some kind of anthropomorphic intelligence with some sort of “constraint” that says “make sure biological humans continue to exist”. This is exactly the wrong way to implement FAI. The point of FAI is simply for the AI to do what is right (as opposed to what is prime, or paperclip-maximising). In EY’s plan, this involves the AI looking at human minds to discover what we mean by right first.
Now, the right thing may not involve keeping 21st century humanity around forever. Some people will want to be uploaded. Some people will just want better bodies. And yes, most of us will want to “live forever”. But the right thing is definitely not to immediately exterminate the entire population of earth.
This to me drives directly to the heart of whether you value rationality over existence. You don’t personally ‘win’, humanity doesn’t ‘win’, but rationality is maximized.
Why should I value rationality if it results in me losing everything I care about? What is the virtue, to us, of someone else’s rationality?
Winning is a truer mark of rationality.

I think it’s more apt to characterize winning as a goal of rationality, not as its mark.
In Bayesian terms, while those applying the methods of rationality should win more than the general population on average (that is, p(winning|rationalist) > p(winning|non-rationalist)), the number of rationalists in the population is low enough at present that p(non-rationalist|winning) is almost certainly greater than p(rationalist|winning), so observing whether or not someone is winning is not very good evidence as to their rationality.
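To make the base-rate point concrete with purely illustrative numbers of my own: suppose 1% of the population are rationalists, p(winning|rationalist) = 0.6, and p(winning|non-rationalist) = 0.3. Bayes’ theorem then gives p(rationalist|winning) = (0.6)(0.01) / ((0.6)(0.01) + (0.3)(0.99)) ≈ 0.02, so roughly 98% of the winners you observe are still non-rationalists, even though rationalists win at twice the rate.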
Ack, you’re entirely right. “Mark” is somewhat ambiguous to me without context; I think I had imbued it with some measure of goalness from the GP’s use.
I have a bad habit of uncritically imitating people’s word choices within the scope of a conversation. In this case, it bit me by echoing the GP’s is-ought confusion… yikes!
I wonder about the time scale for winning. After all, a poker player using an optimal strategy can still expect extended periods of losing, and poker is better defined than a lot of life situations.