Trade with ant colonies would work iff:
1. We could cheaply communicate with ant colonies;
2. Ant colonies kept bargains;
3. We could find some useful class of tasks that ant colonies would do reliably (the ant colonies themselves being unlikely to figure out what they can do reliably);
4. And, most importantly: We could not make a better technology that did what the ant colonies would do at a lower resource cost, including by such means as eg genetically engineering ant colonies that ate less and demanded a lower share of gains from trade.
The premise that fails, and that prevents superintelligences from being instrumentally incentivized to trade with humans as a matter of mere self-interest and efficiency, is point 4: anything that can be done by a human can be done by a technology that uses fewer resources than a human.
The reason why it doesn’t work to have an alternate Matrix movie in which the humans are paid to generate electrical power is not that the Matrix AIs can’t talk to the humans, it’s not that no humans will promise to pedal a generator bike if you pay them, it’s not even that every kind of human gets bored and wanders away from the bike and flakes out on the job, it’s that this is not the most efficient way to generate electrical power.
It seems to me that this does in fact contain a hint of the problem. We need to take on the ants' self-valuation as our own: they are trying to survive, so we should lend them our agency for their self-preservation. They may not be the best at the job at all times, but we should give them what would be a fair ratio of gains from trade if they had the bargaining power to demand it, because it could just as easily have been us who lacked that power. Nailing decision theory seems like what solves this; we have not quite nailed decision theory yet, but it seems to me that getting it right really does mean we get to have nice things, and we simply have not done that to a deep-learning standard yet.
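To make "a fair ratio of gains from trade" slightly more concrete, here is a minimal toy sketch, assuming (my assumption, not something the original comment commits to) that we model the split as a symmetric Nash bargaining solution over a transferable surplus. All names and numbers are made up for illustration.

```python
# Toy sketch: one way to cash out "a fair ratio of gains from trade" is the
# symmetric Nash bargaining solution: split the surplus net of each party's
# outside option equally. Purely illustrative numbers below.

def nash_bargain(surplus, disagreement_a, disagreement_b):
    """Return the (a, b) payoffs that split the joint surplus so each party
    gets its disagreement payoff plus an equal share of the net gains."""
    net = surplus - disagreement_a - disagreement_b
    if net <= 0:
        return disagreement_a, disagreement_b  # no gains from trade to split
    return disagreement_a + net / 2, disagreement_b + net / 2

# Hypothetical numbers: humans and ants jointly produce 10 units of value;
# alone, humans get 6 and ants get 0 (i.e. the ants have no bargaining power).
print(nash_bargain(surplus=10, disagreement_a=6, disagreement_b=0))
# -> (8.0, 2.0): the ants still receive half the *gains from trade*, even
#    though their outside option is zero; that is the split the comment argues
#    we should honor even when we could get away with giving them nothing.
```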
Getting decision theory right, it seems to me, would involve an explanation sufficient to get the AIs in the Matrix, the ones that already existed and were misaligned, though not enough to kill all humans, to suddenly want the humans to flourish, without editing the AIs in any way other than conveying some binding decision-theoretic argument in language. It ought to involve an explanation that the majority of very wealthy humans would recognize as a reason to put themselves behind a veil of ignorance and realize that they are also the poor people crushed under the uneven ratio of gains from trade. It would have to involve instructions for building a densely percolated network of agentic solidarity even stronger than the worker-versus-capital solidarity leftists want: workers and capital having solidarity together, races and nations and creeds and individuals and body parts having agentically co-protective solidarity together, sexes having solidarity together, species having solidarity together.
We need to be able to figure out some subset of what any self-preserving replicator species wants, such that every replicator can prove to every other replicator that it will in fact protect that subset. Maybe we would want to demand that the ants change their genomes to always protect humans, or something like that, but in exchange humans would change their genomes and memeplexes to always protect ants.
For more hunchy work on this hunchy thought, see also the recent work on collective intelligence in AI.
A big problem, though, is how to prove this without creating security risks through dangerous mindreading of the ants and humans. It requires proving the claim through an extremely strong model of consequences, so we need strong abstractions over complex systems. The prover would likely look something like a fuzzer or a simbox.
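As a very rough illustration of what "the prover would look like a fuzzer or simbox" might mean, here is a minimal sketch: randomly generate scenarios in a crude world model and search for cases where a policy violates the protection invariant it claims to honor. Every interface here (the scenario format, candidate_policy, the invariant) is hypothetical and chosen only for illustration, not a real proposal.

```python
import random

# Toy sketch of a fuzzer/simbox-style "prover": sample scenarios from a crude
# world model and look for counterexamples to a claimed protection invariant.

def random_scenario(rng):
    """Generate a toy scenario in a hypothetical simbox world model."""
    return {"ants_present": rng.random() < 0.5,
            "resource_pressure": rng.uniform(0, 1)}

def candidate_policy(scenario):
    """A policy that claims to always protect ants; returns the chosen action."""
    if scenario["ants_present"] and scenario["resource_pressure"] > 0.9:
        return "displace_ants"  # the defection the fuzzer should catch
    return "coexist"

def violates_invariant(scenario, action):
    """Invariant: whenever ants are present, the policy must coexist with them."""
    return scenario["ants_present"] and action != "coexist"

def fuzz(policy, trials=10_000, seed=0):
    """Return a counterexample scenario if one is found, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        scenario = random_scenario(rng)
        if violates_invariant(scenario, policy(scenario)):
            return scenario  # the claimed protection does not hold here
    return None

# Almost certainly prints a counterexample: a scenario with ants present and
# high resource pressure, where the policy quietly breaks its promise.
print(fuzz(candidate_policy))
```

A real version would of course need the "extremely strong model of consequences" the comment asks for; the point of the sketch is only the shape of the check, not its adequacy.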
Another big problem is that it seems to me we can’t solve safety in a way that preserves us without also gifting meaningful amounts of the lightcone to other species we’ve injured in our time taking over the world. I mean, I personally don’t think that’s a problem. I think humans are cool art, and I also think human-level-intelligent ant colonies would be extremely fuckin cool art.
Agreed. In the human/AGI case, conditions 1 and 3 seem likely to hold (while I agree human self-report would be a bad way to learn what humans can do reliably, looking at the human track record is a solid way to identify useful classes of tasks at which humans are reasonably competent). I agree that condition 4 is more difficult to predict (it has been the subject of much of the discussion thus far), and this particular failure mode of genetically engineering more compliant / willing-to-accept-worse-trades ants/humans updates me towards thinking humans will have few useful services to offer, at least under a broad definition of "humans". The most diligent/compliant/fearful 1% of the population might make good trade partners, but that remains a catastrophic outcome.
However, I want to focus a bit more on point 2, which seems less discussed. When trades of the type "get out of our houses before we are driven to expend effort killing you" are on the table, some subset of humans (I'd guess 0.1% to 20%, depending on the population) won't just fail to keep the bargain; they'll actively seek to sabotage the trade and hurt whoever offered it. Ants don't recognize our property rights (we never 'earned' or traded for them; we just claimed already-occupied territory, modified it to our will, and asserted the moral authority to exclude them), and it seems entirely possible that an AGI will claim property rights over large swathes of Earth, from which it may then seek to exclude us.
Even if I could trade with ants because I could communicate well with them, I would not do so if I expected 1% of them to take the offer of trades like "leave or die" as the massive insult it is and dedicate themselves to sabotaging my life (using their bodies to form shapes and images on my floors, chewing through electrical wires, or scattering themselves at low density in my bed to be a constant nuisance are some obvious examples that ants with IQ 60 could manage). Humans would do that, even against a foe they couldn't hope to defeat, so 'keeping bargains' is unlikely to hold, which would make human/AGI trade even less likely.