I think this is wrong, but I’m having trouble explaining my intuitions. There are a few parts:
You’re not doing Solomonoff right, since you’re meant to condition on all observations. This makes it harder for simple programs to interfere with the outcome.
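To spell out what conditioning on everything buys you (standard notation, nothing specific to this discussion: $U$ is a monotone universal machine, $x_{1:t}$ is the full observation string so far):

$$M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-|p|}, \qquad P(x_{t+1} = b \mid x_{1:t}) \;=\; \frac{M(x_{1:t} b)}{M(x_{1:t})}.$$

A program only contributes to the next-bit prediction if it has already reproduced every bit of $x_{1:t}$; that’s the bar any would-be manipulator has to clear before it gets any say.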
More importantly but harder to explain, you’re making some weird assumptions about the simplicity of meta-programs that I would bet are wrong. There seems to be a computational difficulty here, in that you envision 2^n small worlds trying to manipulate 2^m other worlds, where m > n. That makes it really hard for the simplest program to be one where the meta-program that’s interpreting the pointer to our world is a rational agent, rather than some more powerful but less grounded search procedure. If ‘naturally’ evolved agents are interpreting the information pointing to the situation they might want to interfere with, this limits the complexity of that encoding. If they’re just simulating a lot of things to interfere with as many worlds as possible, they ‘run out of room’, because 2^m ≫ 2^n.
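A back-of-the-envelope version of that counting point (the quantities here are purely illustrative): let $k$ be how many of the $2^m$ target worlds each of the $2^n$ manipulator worlds can afford to single out. Then

$$\frac{\text{targeted worlds}}{\text{all target worlds}} \;\lesssim\; \frac{k \cdot 2^{n}}{2^{m}} \;=\; k \cdot 2^{-(m-n)},$$

which is negligible for $m \gg n$ unless $k$ grows like $2^{m-n}$, i.e. unless the manipulators are brute-force simulating essentially everything — and that’s exactly the ‘running out of room’ problem.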
Your examples almost self-refute, in the sense that if there’s an accurate simulation of you being manipulated at time t+1, it implies that the simulation is not materially interfered with at time t, so even if the vast majority of Solomonoff inductions contain an attempted adversary, most of those adversaries will miss anyway. Hypothetically, superrational agents might still be able to coordinate to manipulate some very small fraction of worlds, but it’d be hard and only relevant to those worlds.
Compute has costs. The most efficient use of compute is almost always to enact your preferences directly, not to manipulate other random worlds with low probability. By the time you can interfere with Solomonoff, you have better options.
To the extent that a program P is manipulating predictions so that another program that is simulating P performs unusually… well, then that’s just how the metaverse is. If the simplest program containing your predictions is an attempt at manipulating you, then the simplest program containing you is probably being manipulated.