Note, for example, that if we live in the kind of simulation I describe in my proposal, which I give a good probability to, then it is not us but the AI that is being tested, and it is unclear whether the gods expect anything from us.
If we are in a simulation in which some agents are being rewarded for achieving some objective X, then it is more likely that many agents are being rewarded than that only one is. In your proposal, X is approximately "Humans are safe and the Simulation Hypothesis is not tested". This obviously has implications for simulated humans. You are better placed than I am to work out those implications, since you know which values of X you believe in.
Rationale
The cost of running a simulation-reward pair comprises Base Cost and Bribe Cost. The Base Cost is the cost of simulating a universe, plus the cost of ethically treating all simulant participants after the simulation ends. The Bribe Cost is the cost of rewarding simulant participants, above the ethically required minimum, after the simulation ends. The benefit is the Bribe Value, which accrues in proportion to the degree that simulant participants behave differently due to possible post-simulation rewards.
For it to make sense to run a simulation in which only the AI is rewarded, the benefit from influencing the AI must exceed the total cost, while the benefit from influencing every other agent must be less than that agent's Bribe Cost. But these quantities are closely linked: more powerful agents have both higher Bribe Cost and higher Bribe Value.
This might break down for very low-powered agents, like fish, which have zero Bribe Value. Probably we don’t simulate low-powered agents at sufficient fidelity for them to be conscious, for ethical reasons, and to reduce Base Cost. If you are reading this and you are conscious then either you are in base reality, or you are in a simulation where the simulators decided your consciousness was worth simulating.
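The cost-benefit rule above can be sketched as a toy model. Everything here is an illustrative assumption of mine, not a claim from the argument: the function names and the numeric values are made up purely to show how the quantities interact.

```python
# Toy model of the simulation cost-benefit argument.
# All names and numbers are illustrative assumptions, not claims.

def worth_rewarding(bribe_value, bribe_cost):
    """An agent is worth bribing iff its Bribe Value exceeds its Bribe Cost."""
    return bribe_value > bribe_cost

def worth_running(base_cost, agents):
    """A simulation-reward pair pays off iff the summed surplus from all
    agents worth bribing covers the Base Cost of the simulation itself."""
    surplus = sum(v - c for v, c in agents if worth_rewarding(v, c))
    return surplus > base_cost

# Hypothetical (bribe_value, bribe_cost) pairs per agent. Bribe Value and
# Bribe Cost scale together with the agent's power, per the argument above.
ai = (100.0, 20.0)   # powerful agent: high value, high cost
human = (5.0, 1.0)   # weaker agent: lower value, lower cost
fish = (0.0, 0.1)    # zero Bribe Value -> never worth bribing

print(worth_rewarding(*ai))     # True
print(worth_rewarding(*human))  # True: value and cost shrink together
print(worth_rewarding(*fish))   # False
print(worth_running(50.0, [ai, human, fish]))  # True: 80 + 4 > 50
```

Under these assumptions, the key point survives: because Bribe Value and Bribe Cost rise and fall together, a simulation in which bribing the AI pays off will usually also make bribing other capable agents pay off, which is why "many agents rewarded" is more likely than "only one".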