Is AI risk assessment too anthropocentric?
Hi everyone,
I’ve recently discovered LessWrong and love it! So first, let me thank you all for fostering such a wonderful community.
I’ve been reading a lot of the AI material and I find myself asking a question you all have surely considered, so I wanted to pose it.
If I believe that human beings are evolutionarily descended from apes, and I ask myself whether apes—if they had control over allowing human evolution to happen—should have allowed it or stopped it, I’m honestly not sure what the answer should be.
It seems like apes would in all likelihood be better off without humans around, so from the perspective of apes, they probably should not have allowed it to happen. However, from a different frame of reference, such as what is good for the earth or for the universe, maybe the evolution of humans from apes was a good thing. Certainly from the perspective of humans, most of us would believe that allowing it to happen was a good thing.
Do we find ourselves in a similar scenario with humans and AI? Are there benefits from other frames of reference besides humanity to allow the development of AI, even if that AI may pose existential threats to human civilization? And if so, are those perspectives being taken into full enough account when we think about AI risk assessment?
Have you seen the Value is Fragile post? It might be helpful in addressing your question.
Thank you for this link, and also for your thorough response, “the gears to ascension”. I have some reading to do! Including “Overcoming Bias”, which I am interested in as the basis for the answer in “Value is Fragile”.
One of the first points Eliezer makes in “Value is Fragile” is that we almost certainly create paperclip generators if we take our hands off the steering wheel. One of the things that is special about humans is that some claim the human brain is the most complex structure in the universe—i.e. the opposite of entropy. Is the pursuit of complexity itself a goal (an “alignment”?) that by definition protects against entropy?
I grant that this may be a naïve thought, but I wonder, if things that are not paperclip generators are so hard to come by, how humans and all of the other complex structures that we know of in the universe arose at all…
I do think so, yeah. Here are some links to stuff I and others have said; it’s taking me enough time to choose this list of links that I’m not going to spend much time summarizing them unless your reply indicates that you’re stuck on understanding them. But feel free to say so if that’s the case and I’ll see what I can do. Note that for all of these, I browsed and skimmed enough to feel that I was linking something relevant, but consider these to be human-grade semantic search results (I used several semantic search engines as a seed, actually), so they’re not necessarily guaranteed to be what you seek.
https://www.lesswrong.com/posts/AGCLZPqtosnd82DmR/call-for-submissions-in-human-values-and-artificial-agency
https://www.lesswrong.com/posts/BzYmJYECAc3xyCTt6/the-plan-2022-update
https://www.lesswrong.com/posts/TrvkWBwYvvJjSqSCj/a-broad-basin-of-attraction-around-human-values
https://www.lesswrong.com/posts/T4Lfw2HZQNFjNX8Ya/have-we-really-forsaken-natural-selection
https://www.lesswrong.com/posts/hti2q9AA8efceAoAb/nature-abhors-an-immutable-replicator-usually
https://www.lesswrong.com/posts/RwYh4grJs4pbJdTh3/motivations-natural-selection-and-curriculum-engineering
https://www.lesswrong.com/posts/LiDKveFnmL59u5Ceq/how-might-an-alignment-attractor-look-like
https://www.lesswrong.com/posts/BuaFZud9BwkiSCGpd/alignment-might-never-be-solved-by-humans-or-ai
Some comments where I claim without evidence things that seem related to your question:
https://www.lesswrong.com/posts/NJYmovr9ZZAyyTBwM/what-i-mean-by-alignment-is-in-large-part-about-making?commentId=JuuuBH9MCRbtixerY
https://www.lesswrong.com/posts/jwETb9nZcSYKCQvFo/aligned-with-what?commentId=AznCPGdhrDWdPkXxJ
https://www.lesswrong.com/posts/rKmojEZ9qKwApjCfX/the-gears-to-ascenscion-s-shortform?commentId=NCwmPwfDXFnxaY5J8
https://www.lesswrong.com/posts/rKmojEZ9qKwApjCfX/the-gears-to-ascenscion-s-shortform?commentId=DxdTmpSzCuyo5PbDb
https://www.lesswrong.com/posts/evtJJeghGM5aAM5W7/will-research-in-ai-risk-jinx-it-consequences-of-training-ai?commentId=HyxBJHpXp3pgf4DN2
This question is incoherent in my ontology.
Morality wrt the perspective of the earth or universe is undefined.
The evolution of humans from apes was good by human values and perhaps bad by ape values.
It has no independent moral value.
AI takeover would be bad by human values.
Strongly disagree: we’re still apes, and as long as we survive, we are a good outcome for the values of our ancestors. Our competing cousin species are probably quite salty at us, but we’ve done well by the standards of our pre-human forebears.
Would the pre-human apes as a class, if somehow given enough intelligence to understand the question, have endorsed their own great^n-grandchildren developing human intelligence? Yes, I think so. I’m pretty sure that their values, extrapolated to more capable intelligence, would generally include greater capability for their own descendants in many ways.
Would they endorse what we have done with it? I’m not even sure whether the question is meaningful, since it depends much more strongly upon what extrapolated values and reasoning processes you sneak into the “somehow given enough intelligence”. There may be plenty of other aspects of human biological and cultural development that they may not have endorsed.
Would animals outside the human roots of ancestry have endorsed any of it at all? I’d guess very likely not, considering what’s happening to most of their descendants. Looked at from the point of view of “good for the Earth”, I’m not even sure what the question means. The Earth itself is a huge ball of inanimate rocks. From the point of view of most species on it, the development of human intelligence has been a catastrophic mass of extinctions. Have we even affected the Universe beyond Earth in any salient manner? What sort of thing qualifies as good (or bad) for the universe anyway?
One thing does seem fairly likely to me: if AI does lead to human extinction, it would very likely also lead to the extinction of every other form of life on Earth, and probably a long way beyond. This makes the question much less human-centric, and seems to address many of your questions.
Possibly the only being(s) that would think it a good thing would be the AI itself/themselves, and it’s not even known whether they will have any values at all. Even if they do, we can have no idea what those values will be. It’s possible to imagine an entity that experiences its own existence with loathing, or suffers in other ways, but can’t or won’t end its own existence for other reasons.