It seems like this does in fact have some hint of the problem. We need to take on the ants' self-valuation for ourselves: they're trying to survive, so we should extend our self-preservation agency to cover them as well. They may not be the best at the job at all times, but we should give them what would be a fair ratio of gains from trade if they had the bargaining power to demand it, because it could just as easily have been us who lacked that bargaining power. It seems like nailing decision theory is what solves this. We haven't quite nailed decision theory yet, but it does seem to me that getting decision theory right means we get to have nice things; we simply haven't done it to a deep learning standard yet.
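To make the "fair ratio of gains from trade" slightly more concrete, here is a minimal sketch, with invented toy numbers, of the Nash bargaining solution, one standard way of picking a split that doesn't depend on who happens to hold the bargaining power. Nothing below is meant as a proposal, just a reminder of what "fair split" can mean formally.

```python
# Toy Nash bargaining sketch: split a surplus between "humans" and "ants"
# by maximizing the product of gains over each party's disagreement point.
# The numbers here are illustrative assumptions, not measurements.

def nash_split(surplus, d_humans, d_ants, steps=10_000):
    """Grid-search the share x given to humans that maximizes
    (u_humans - d_humans) * (u_ants - d_ants)."""
    best_x, best_product = None, float("-inf")
    for i in range(steps + 1):
        x = i / steps                      # humans' share of the surplus
        u_humans = x * surplus
        u_ants = (1 - x) * surplus
        gain_h = u_humans - d_humans
        gain_a = u_ants - d_ants
        if gain_h <= 0 or gain_a <= 0:     # both must do better than no deal
            continue
        product = gain_h * gain_a
        if product > best_product:
            best_product, best_x = product, x
    return best_x

# With symmetric disagreement points the split comes out even, regardless of
# which party could in practice force a worse deal on the other.
print(nash_split(surplus=100.0, d_humans=0.0, d_ants=0.0))  # ~0.5
```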
Getting decision theory right, it seems to me, would involve an explanation sufficient to get the AIs already in the matrix, the ones that existed and were misaligned but not badly enough to kill all humans, to suddenly want the humans to flourish, without the AI having been edited in any way other than being given an explanation, in language, of some decision binding. It seems to me that it ought to involve an explanation that the majority of very wealthy humans would recognize as a reason to put themselves behind a veil of ignorance and realize that they are also the poor people crushed under the uneven ratio of gains from trade. It would have to involve instructions for how to build a densely percolated network of agentic solidarity that is even stronger than the worker-versus-capital solidarity leftists want: it needs to be workers and capital having solidarity together; it needs to be races, nations, creeds, individuals, body parts, sexes, and species having agentically co-protective solidarity together.
We need to be able to figure out a subset of what any self-preserving replicator species wants, such that every replicator can prove to every other replicator that it will in fact protect that subset. Maybe we'd want to demand the ants change their genomes to always protect humans, or something, but in exchange, humans would change their genomes and memeplexes to always protect ants.
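A minimal sketch of what "prove to every other replicator" might look like, in the spirit of program-equilibrium-style conditional commitments. Everything here, the Commitment class, the core-interest sets, the check itself, is an invented toy, not an established protocol; the real version would have to verify the other agent's actual policy rather than a declared set.

```python
# Toy sketch of mutually verifiable protection commitments between two
# replicator species. Each party publishes a commitment that says
# "I protect your core subset, provided you verifiably protect mine."
# All names and sets here are hypothetical illustrations.

from dataclasses import dataclass

@dataclass(frozen=True)
class Commitment:
    holder: str                 # who is making the commitment
    protects: frozenset         # the other party's core interests it pledges to protect
    conditional_on: frozenset   # what it requires protected in return

def mutually_binding(a: Commitment, b: Commitment) -> bool:
    """True iff each party's requirement is covered by the other's pledge."""
    return a.conditional_on <= b.protects and b.conditional_on <= a.protects

human_core = frozenset({"human survival", "human autonomy"})
ant_core = frozenset({"ant colony survival", "ant foraging range"})

humans = Commitment("humans", protects=ant_core, conditional_on=human_core)
ants = Commitment("ants", protects=human_core, conditional_on=ant_core)

print(mutually_binding(humans, ants))  # True: each side's condition is met
```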
For more hunchy work on this hunchy thought, see also the recent work on collective intelligence in AI.
A big problem, though, is how to prove this without creating security risks through dangerous mindreading of the ants and humans. It requires proving things through an extremely strong model of consequences, so we need strong abstraction of complex systems. The prover would probably look something like a fuzzer or a simbox.
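A minimal sketch of the fuzzer/simbox flavor of checking, under heavy assumptions: instead of a formal proof, sample lots of scenarios from a crude world model and look for any case where the committed policy harms the protected subset. The scenario generator, the policy, and the harm representation below are all placeholders I made up for illustration.

```python
import random

# Toy "fuzzer over a simbox": sample many simulated scenarios and verify a
# policy never takes an action that harms the protected set. This yields
# evidence, not a proof; a real prover would need a much stronger model.

PROTECTED = {"ant colony survival"}

def sample_scenario(rng):
    """Crude stand-in for a simbox: a scenario is just a set of pressures."""
    pressures = ["food shortage", "land use conflict", "disease outbreak"]
    return {p for p in pressures if rng.random() < 0.5}

def policy(scenario):
    """Hypothetical committed policy: never choose actions that destroy the
    protected set, even under pressure."""
    if "land use conflict" in scenario:
        return {"action": "relocate construction", "harms": set()}
    return {"action": "do nothing", "harms": set()}

def fuzz_check(n_trials=10_000, seed=0):
    rng = random.Random(seed)
    for _ in range(n_trials):
        outcome = policy(sample_scenario(rng))
        if outcome["harms"] & PROTECTED:
            return False   # found a counterexample to the commitment
    return True            # no violation found in the sampled scenarios

print(fuzz_check())  # True under this toy policy and toy world model
```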
Another big problem is that it seems to me we can’t solve safety in a way that preserves us without also gifting meaningful amounts of the lightcone to other species we’ve injured in our time taking over the world. I mean, I personally don’t think that’s a problem. I think humans are cool art, and I also think human-level-intelligent ant colonies would be extremely fuckin cool art.