P1 is mistaken—it doesn’t matter how slow humans are at AGI research, because as soon as we get something that can recursively self-improve, it won’t be humans doing the research anymore.
Also, P2, while correct in spirit, is incorrect as stated—there might be some improvements that are very expensive, and give no benefit, apart from allowing other, more effective improvements in the future.
And I don’t understand why P5 is there—even if they were requirements, someone could just code a stable, self-preserving paperclipper, and it’d be just as unfriendly.
...it doesn’t matter how slow humans are at AGI research...
It matters, because if we only go as far as the moon, if you’ll forgive the space exploration metaphor, and then need thousands of years to reach the next star system, humans will adapt to cope with the long journey.
...apart from allowing other, more effective improvements in the future.
How can you tell in advance which expensive improvement will turn out to be a crane? Rational decision-makers won’t invest a lot of resources just in case doing so might turn out to be useful in the future.
...even if they were requirements, someone could just code a stable, self-preserving paperclipper, and it’d be just as unfriendly.
Yes, but if someone is smart enough to recognize that self-improving agents need stable utility functions and have to be friendly with respect to the values of their lower-level selves, then it is very unlikely that the same person would fail to recognize the need for human-friendliness.
It matters, because if we only go as far as the moon, if you’ll forgive the space exploration metaphor, and then need thousands of years to reach the next star system, humans will adapt to cope with the long journey.
I don’t understand your metaphor. Do you mean that if an AI recursively improves slowly, the changes it causes in the world won’t seem fast to us?
If that’s the correct interpretation, it’s true, but it may not be very likely if the AI is run at many times human speed.
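To put a rough number on the speed point, here is a minimal sketch; the speed-up factors and the 30-year figure are purely illustrative assumptions, not claims about any actual system:

```python
# Minimal sketch: how a subjective speed-up compresses wall-clock research time.
# All numbers are illustrative assumptions, not predictions.

def wall_clock_days(human_equivalent_years: float, speedup: float) -> float:
    """Wall-clock days needed for work that would take an unaided human
    `human_equivalent_years` of research, if the researcher runs
    `speedup` times faster than a human."""
    return human_equivalent_years * 365.25 / speedup

# A hypothetical 30-year research programme at various assumed speed-ups:
for speedup in (10, 100, 1000):
    print(f"{speedup:>4}x speed-up: ~{wall_clock_days(30, speedup):,.0f} wall-clock days")

# Output:
#   10x speed-up: ~1,096 wall-clock days (about 3 years)
#  100x speed-up: ~110 wall-clock days
# 1000x speed-up: ~11 wall-clock days
```

Even if the self-improvement is “slow” in research-effort terms, the resulting changes can still arrive fast in calendar time.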
if someone is smart enough to recognize that self-improving agents need stable utility functions and have to be friendly with respect to the values of their lower-level selves, then it is very unlikely that the same person would fail to recognize the need for human-friendliness.
I’m not disputing the above statement itself, but it implies that you are not counting a friendly AI that quickly improves itself to superintelligence as a FOOM. If you call a friendly FOOM a FOOM, then P6 is irrelevant.
...it doesn’t matter how slow humans are at AGI research...
It matters, because if we only go as far as the moon, if you’ll forgive the space exploration metaphor, and then need thousands of years to reach the next star system, humans will adapt to cope with the long journey.
It seems pretty challenging to envisage humans “adapting” to the existence of superintelligent machines on any realistic timescale—unless you mean finding a way to upload their essences into cyberspace.
It looks like he meant something like, “if it takes 10,000 years to get to AI, then other changes like biological modification, singleton formation, cultural/values drift, stochastic risk of civilization-collapsing war, etc, are the most important areas for affecting humanity’s future.”
It matters, because if we only go as far as the moon, if you’ll forgive the space exploration metaphor, and then need thousands of years to reach the next star system, humans will adapt to cope with the long journey.
We’ll have thousands of years to adapt to the journey, but events might unfold very quickly once we get there.
Rational decision-makers won’t invest a lot of resources just in case doing so might turn out to be useful in the future.
I can’t see any obvious reason why the expected value couldn’t be positive.
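To illustrate with a toy expected-value calculation (every number below is made up for the example), an improvement can have no direct payoff and still be worth buying if it unlocks a sufficiently valuable later improvement often enough:

```python
# Toy expected-value sketch for an "enabling" improvement with no direct payoff.
# All numbers are made-up illustrative assumptions.

cost_of_enabling_step = 5.0   # resources spent now, with no direct benefit
p_unlocks_later_step = 0.3    # assumed chance it makes a later improvement possible
value_of_later_step = 50.0    # assumed payoff of that later improvement

expected_value = p_unlocks_later_step * value_of_later_step - cost_of_enabling_step
print(f"Expected value of the enabling investment: {expected_value:+.1f}")
# 0.3 * 50 - 5 = +10.0, so investing "just in case" can be the rational choice
# whenever p * value exceeds the cost.
```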