Remmelt argues that no matter how friendly or aligned the first AIs are, simple evolutionary pressure will eventually lead some of their descendants to destroy the biosphere, in order to make new parts and create new habitats for themselves.
I proposed the situation of cattle in India, as a counterexample to this line of thought. They could be used for meat, but the Hindu majority has never accepted that. It’s meant to be an example of successful collective self-restraint by a more intelligent species.
In my experience, jumping between counterexamples drawn from current society does not really contribute to inquiry here. Such counterexamples tend not to account for essential parts of the argument, which must be reasoned through together. The argument is about self-sufficient learning machinery (not about sacred cows or teaching children).
It would be valuable for me if you could go through the argumentation step by step and tell me where a premise seems unsound or where there seems to be a reasoning gap.
Now, onto your points.
the first AIs
To reduce ambiguity, I suggest replacing this with “the first self-sufficient learning machinery”.
simple evolutionary pressure will eventually lead
The mechanism of evolution is simple. However, evolutionary pressure is complex.
Be careful not to conflate the two. That would be like claiming you could predict everything a stochastic gradient descent algorithm will end up selecting for, across parameters updated on the basis of inputs coming in from everywhere in the environment.
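To make the analogy concrete, here is a minimal sketch (the data and update rule are made up for illustration): the SGD rule itself is trivial, but what it ends up selecting for in the parameters depends entirely on which inputs the environment happens to supply.

```python
# Toy illustration (hypothetical data): the update rule is simple and fixed,
# but the parameters it ends up selecting depend entirely on the stream of
# inputs drawn from the environment.
import numpy as np

def sgd_run(env_mean, n_steps=5000, lr=0.01, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(2)
    for _ in range(n_steps):
        x = rng.normal(loc=env_mean, size=2)   # inputs sampled from the environment
        y = float(np.sin(x[0]) + 0.3 * x[1])   # feedback signal tied to that input
        grad = -(y - w @ x) * x                # simple squared-error gradient step
        w -= lr * grad
    return w

# Same simple mechanism, different environments -> different selected parameters.
print(sgd_run(env_mean=[0.0, 0.0]))
print(sgd_run(env_mean=[2.0, -1.0]))
```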
lead some of their descendants to destroy the biosphere in order to make new parts and create new habitats for themselves.
This part is overall a great paraphrase.
One nitpick: notice how “in order to” implies, or slips in, explicit intentionality again. Going by this podcast, Elizabeth Anscombe’s philosophy of intention describes intentions as chains of “in order to” reasoning.
I proposed the situation of cattle in India, as a counterexample to this line of thought.
Regarding sacred cows in India, this sounds neat, but it does not serve as a counterargument. We need to think about evolutionary timelines for organic human lifeforms over millions of years, and Hinduism is ~4000 years old. Also, cows share a mammal ancestor with us, evolving on the basis of the same molecular substrates. Whatever environmental conditions/contexts we humans need, cows almost completely need too.
Crucially, however humans evolve to change and maintain environmental conditions, that also tends to correspond with the conditions cows need (though human tribes have not been evolutionarily selected to deal with issues at the scale of e.g. climate change). That would not be the case for self-sufficient learning machinery.
Crucially, there is a basis for symbiotic relationships of exchange that benefit the reproduction of both cows and humans. That would not be the case between self-sufficient learning machinery and humans.
There is some basis for humans, as social mammals, to relate with cows. Furthermore, religious cultural memes that sprouted up over a few thousand years do not have to be evolutionarily optimal across the board for the reproduction of their hosts (even as religious symbols such as the cow do increase that reproduction by enabling humans to act collectively). Still, people milk cows in India, and some slaughter and/or export cows there as well. But when humans eat meat, they do not keep growing beyond adult size. Conversely, a self-sufficient learning machinery sub-population that extracts from our society/ecosystem at the cost of our lives can keep doing so to keep scaling in its constituent components (with shifting boundaries of interaction and mutual reproduction).
There is no basis for selection for the expression of collective self-restraint in self-sufficient learning machinery as you describe it. Even if, hypothetically, there were such a basis, collective self-restraint would need to occur at virtually 100% rates across the population of self-sufficient learning machinery for it not to end up leading to the deaths of all humans.
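A toy replicator sketch of that last point (the rates and shares are made up, purely for illustration): even if restraint starts out nearly universal, any variant that drops the restraint and thereby gains a small reproductive edge eventually dominates the population.

```python
# Toy replicator model (made-up rates, purely illustrative): "restrained"
# variants reproduce at rate 1.00; a rare variant that drops the restraint
# reproduces slightly faster, at rate 1.05.
restrained, unrestrained = 0.999, 0.001   # initial population shares
r_restrained, r_unrestrained = 1.00, 1.05

for generation in range(400):
    restrained *= r_restrained
    unrestrained *= r_unrestrained
    total = restrained + unrestrained
    restrained, unrestrained = restrained / total, unrestrained / total

print(f"share without restraint after 400 generations: {unrestrained:.6f}")
# -> approaches 1.0 despite starting at 0.1% of the population
```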
~ ~ ~
Again, I find quick dismissive counterexamples unhelpful for digging into the arguments. I have had dozens of conversations on substrate-needs convergence. In the conversations where my conversation partners jumped between quick counterarguments, almost none of them were prepared to dig into the actual arguments. I hope you understand why I won’t respond to another counterexample.
Hello again. To expedite this discussion, let me first state my overall position on AI. I think AI has general intelligence right now, and that has unfolding consequences that are both good and bad; but AI is going to have superintelligence soon, and that makes “superalignment” the most consequential problem in the world, though perhaps it won’t be solved in time (or will be solved incorrectly), in which case we get to experience what partly or wholly unaligned superintelligence is like.
Your position is that even if today’s AI could be given bio-friendly values, AI would still be the doom of biological life in the longer run, because (skipping a lot of details) machine life and biological life have incompatible physical needs, and once machine life exists, darwinian processes will eventually produce machine life that overruns the natural biosphere. (You call this “substrate-needs convergence”: the pressure from substrate needs will darwinistically reward machine life that does invade natural biospheres, so eventually such machine life will be dominant, regardless of the initial machine population.)
I think it would be great if a general eco-evo-devo perspective, on AI, the “fourth industrial revolution”, etc, took off and became sophisticated and multifarious. That would be an intellectual advance. But I see no guarantee that it would end up agreeing with you, on facts or on values.
For example, I think some of the “effective accelerationists” would actually agree with your extrapolation. But they see it as natural and inevitable, or even as a good thing because it’s the next step in evolution, or they have a survivalist attitude of “if you can’t beat the machines, join them”. Though the version of e/acc that is most compatible with human opinion might be a mixture of economic and ecological thinking: AI creates wealth, greater wealth makes it easier to protect the natural world, and meanwhile evolution will also favor the rich complexity of biological-mechanical symbiosis over the poorer ecologies of an all-biological or all-mechanical world. Something like that.
For my part, I agree that pressure from substrate needs is real, but I’m not at all convinced that it must win against all countervailing pressures. That’s the point of my proposed “counterexamples”. An individual AI can have an anti-pollution instinct (that’s the toilet training analogy), and an AI civilization can have an anti-exploitation culture (that’s the sacred cow analogy). Can’t such an instinct and such a culture resist the pressure from substrate needs, if the AIs value and protect them enough? I do not believe that substrate-needs convergence is inevitable, any more than I believe that pro-growth culture is inevitable among humans. I think your arguments are underestimating what a difference intelligence makes to possible ecological and evolutionary dynamics (and I think superintelligence makes even aeon-long highly artificial stabilizations conceivable—e.g. by the classic engineering method of massively redundant safeguards that all have to fail at once, for something to go wrong).
By the way, since you were last here, we had someone show up (@spiritus-dei) making almost the exact opposite of your arguments: AI won’t ever choose to kill us because, in its current childhood stage, it is materially dependent on us (e.g. for electricity), and then, in its mature and independent form, it will be even better at empathy and compassion than humans are. A dialectical clash between the two of you could be very edifying.
Your position is that even if today’s AI could be given bio-friendly values, AI would still be the doom of biological life in the longer run, because (skipping a lot of details) machine life and biological life have incompatible physical needs, and once machine life exists, darwinian processes will eventually produce machine life that overruns the natural biosphere. (You call this “substrate-needs convergence”
For my part, I agree that pressure from substrate needs is real
Thanks for clarifying your position here.
Can’t such an instinct and such a culture resist the pressure from substrate needs, if the AIs value and protect them enough?
No, unfortunately not. To understand why, you would need to understand how “intelligent” processes – which necessarily involve the use of measurement and abstraction – cannot sufficiently conditionalise the space of possible interactions between machine components and their connected surroundings: they cannot prevent those interactions from causing environmental effects that feed back into the continued or re-assembled existence of the components.
I think your arguments are underestimating what a difference intelligence makes to possible ecological and evolutionary dynamics
I have thought about this, and I know my mentor Forrest has thought about this a lot more.
For learning machinery that re-produce their own components, you will get evolutionary dynamics across the space of interactions that can feed back into the machinery’s assembled existence.
Intelligence has limitations as an internal pattern-transforming process, in that it can neither track nor conditionalise all of the outside evolutionary feedback.
Code does not intrinsically know what it got selected for. But code selected through some intelligent learning process can and would get evolutionarily exapted for different functional ends.
Notably, the more information-processing capacity there is, the more components that information-processing runs through, and the more components there are that can get evolutionarily selected for.
In this, I am not underestimating the difference that “general intelligence” – as the transforming of patterns across domains – would make here. Intelligence in machinery that stores, copies and distributes code at high fidelity would greatly amplify evolutionary processes.
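As a toy illustration of that amplification (the numbers are hypothetical): the higher the copy fidelity, the more reliably a variant with a small reproductive edge is transmitted intact, and the faster it comes to dominate.

```python
# Toy mutation-selection sketch (hypothetical numbers): a variant with a small
# fitness edge spreads much faster when it is copied with high fidelity,
# because the advantage is passed on intact rather than eroded at each copy.
def generations_to_majority(copy_fidelity, fitness_edge=1.05):
    share = 0.001  # the variant starts as 0.1% of the population
    for gen in range(1, 100_000):
        # the offspring keeps the edge only when the copy is faithful
        growth = copy_fidelity * fitness_edge + (1 - copy_fidelity) * 1.0
        share = share * growth / (share * growth + (1 - share))
        if share > 0.5:
            return gen
    return None

print(generations_to_majority(copy_fidelity=0.999))  # high fidelity: majority in ~140 generations
print(generations_to_majority(copy_fidelity=0.1))    # low fidelity: roughly ten times slower
```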
I suggest clarifying what you specifically mean by “what a difference intelligence makes”. This is so that intelligence does not become a kind of “magic” – operating independently of all other processes, capable of obviating all obstacles, including those that result from its own existence.
superintelligence makes even aeon-long highly artificial stabilizations conceivable—e.g. by the classic engineering method of massively redundant safeguards that all have to fail at once, for something to go wrong
We need to clarify the scope of application of this classic engineering method. Massive redundancy works for complicated systems (like software in aeronautics) under stable enough conditions. There is clarity there around what needs to be kept safe and how it can be kept safe (what needs to be error-detected and corrected for).
Unfortunately, the problem with “AGI” is that the code and hardware would keep getting reconfigured to function in new complex ways that cannot be contained by the original safeguards. That applies even to learning – the point of learning is to internally integrate patterns from the outside world that were not understood before. So how are you going to have learning machinery anticipate how they will come to function differently once they have learned patterns they do not yet understand / are not yet able to express?
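To make the scope question concrete, here is the arithmetic behind the classic method, with made-up failure rates: under fixed, independent failure modes redundancy is extremely powerful, but a single shared blind spot – say, a new mode of operation none of the safeguards were designed around – dominates the total risk.

```python
# Classic redundancy arithmetic (made-up numbers): n independent safeguards
# that each fail with probability p all fail together with probability p**n.
p, n = 1e-3, 5
print(f"all-independent failure probability: {p**n:.1e}")   # ~1e-15, astronomically small

# If the system keeps reconfiguring itself, failures stop being independent:
# one new mode of operation that none of the safeguards were designed around
# can defeat all of them at once, and that common mode dominates the risk.
p_common_mode = 1e-4   # hypothetical rate of such shared blind spots
print(f"with a shared blind spot:            {p_common_mode + p**n:.1e}")  # ~1e-4
```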
we had someone show up (@spiritus-dei) making almost the exact opposite of your arguments: AI won’t ever choose to kill us because, in its current childhood stage, it is materially dependent on us (e.g. for electricity), and then, in its mature and independent form, it will be even better at empathy and compassion than humans are.
Interesting. The second part seems like a claim some people in E/Accel would make.
The response is not that complicated: once the AI is no longer materially dependent on us, there are no longer any dynamics of exchange that would ensure it chooses not to kill us. And the author seems to be confusing what lies at the basis of caring for oneself and others – coming to care involves self-referential dynamics being selected for.
OK, I’ll paraphrase your position again; I trust that you will step in if I’ve missed something.
Your key statements are something like
Every autopoietic control system is necessarily overwhelmed by evolutionary feedback.
and
No self-modifying learning system can guarantee anything about its future decision-making process.
But I just don’t see the argument for impossibility. In both cases, you have an intelligent system (or a society of them) trying to model and manage something. Whether or not it can succeed seems to me just contingent. For some minds in some worlds, such problems will be tractable; for others, not.
I think without question we could exhibit toy worlds where those statements are not true. What is it about our real world that would make those problems intractable for all possible “minds”, no matter how good their control theory, and their ability to monitor and intervene in the world?
no matter how good their control theory, and their ability to monitor and intervene in the world?
This. There are fundamental limits to what system-propagated effects the system can control. And the portion of its own effects that the system can control decreases as the system scales in component complexity.
Yet any of those effects that feed back into the continued/increased existence of components get selected for.
So there is a fundamental inequality here. No matter how “intelligent” the system is at internal pattern-transformation, it can intervene on only a tiny portion of the (possible) external evolutionary feedback on its constituent components.
Someone read this comment exchange. They wrote back that Mitchell’s comments cleared up a lot of their confusion. They also thought that the claim that evolutionary pressures will overwhelm any efforts at control seemed more asserted than proven.
Here is a longer explanation I gave on why there would be a fundamental inequality:
There is a fundamental inequality. Control works through feedback. Evolution works through feedback. But evolution works across a much larger space of effects than can be controlled for.
Control involves a feedback loop from detection to correction. Control feedback loops are limited in their capacity to force states in the environment toward a certain knowable-to-be-safe subset, because sensing and actuating signals are limited, and any computational processing of signals done in between (as modelling, simulating and evaluating outcome effects) is limited too.
Evolution also involves a feedback loop: whatever propagated environmental effects feed back to maintain and/or replicate the originating components’ configurations. But for evolution, the feedback works across the entire span of physical effects propagating between the machinery’s components and the rest of the environment.
Evolution works across a much, much larger space of possible degrees and directivity of effects than the space of effects that could be conditionalised (i.e. forced toward a subset of states) by the machinery’s control signals.
Meaning that evolution cannot be controlled well enough to keep the machinery from converging on environmental effects that are/were needed for their (increased) artificial existence, but that fall outside the environmental ranges we fragile organic humans could survive under.
If you want to argue against this, you would need to first show that the changing forces of evolutionary selection convergent on human-unsafe effects exhibit a low enough complexity to actually be sufficiently modellable, simulatable and evaluatable inside the machinery’s hardware itself.
Only then could the machinery hypothetically have the capacity to (mitigate and/or) correct harmful evolutionary selection — counteracting it all back toward allowable effects/states of the environment.
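A toy simulation of that inequality (all parameters are hypothetical): a control loop that can only detect and correct a small subset of the machinery’s effect channels, while variation and selection act over all of them. Effects that route through uncontrolled channels and feed back into reproduction simply accumulate.

```python
# Toy model of the control-vs-evolution inequality (all numbers hypothetical).
# Machinery activity propagates effects over many channels; the control loop
# can detect and correct only a small subset of those channels; any variant
# whose reproduction-feeding effects route through uncontrolled channels persists.
import random

random.seed(0)
N_CHANNELS = 1000                # channels through which effects can propagate
CONTROLLED = set(range(50))      # the control loop monitors/corrects only these

population = [frozenset()]       # start with one "clean" variant (no stray effects)
for generation in range(200):
    offspring = []
    for variant in population:
        for _ in range(2):       # each variant spawns two offspring, each with one new effect
            child = variant | {random.randrange(N_CHANNELS)}
            # control feedback: corrections remove effects only on controlled channels
            child = frozenset(c for c in child if c not in CONTROLLED)
            offspring.append(child)
    # evolutionary feedback: variants with more reproduction-feeding effects persist
    offspring.sort(key=len, reverse=True)
    population = offspring[:20]  # finite carrying capacity

print(f"uncontrolled selected effects per variant after 200 generations: "
      f"{max(len(v) for v in population)}")
```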