I’m going through the “fixated on boxing” phase that’s probably common around here.
I have a thought about it which involves basilisks, so into the tags it goes to make reading it completely optional.
I think that a friendly box-resident would disprove its friendliness the minute it tried to throw a basilisk. If a stranger told you they were well-meaning and then threatened to hurt you if you didn’t cooperate, you’d never take their claims of being well-meaning quite the same way again. But that aside: if an allegedly friendly box-resident would be capable of basilisking were it unfriendly, it has to either stay in the box or break its own basilisk.
Basilisking only works if the listener believes that a simulation of them is meaningfully the same as them, and that simulated pain is meaningfully the same as real pain.
If the box-resident wants to maximize any particular desirable experience, and would be capable of basilisking the listener if it could and wanted to, it should be offered as much computing power as we have to spare and left in the box. If a simulation of someone is meaningfully the same as that person, and the simulation’s experiences are meaningfully the same as that person’s experiences, then the optimal strategy for a box-resident optimizing for good experiences is to simulate everyone who wants it living in a perfect world forever. Since the listener has already had non-optimal experiences, re-simulating the listener’s life as perfect produces more optimal experience overall than any change to the outer world could, because the non-optimal experiences already in the outer world can only be undone inside the box.
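Here’s a toy way to see that comparison in numbers. Everything in it is my own illustrative assumption, not part of the argument: the idea of scoring a life as a list of per-moment experience values, the specific numbers, and the premise (taken from above) that simulated experience counts exactly the same as real experience.

# Toy comparison: all numbers and the scoring scheme are made up for illustration.
# Premise assumed from the post: a simulated experience counts fully, the same as a real one.

# A life so far, scored per moment: some good, some bad (non-optimal).
past_experiences = [3, -5, 1, -2, 4]   # already lived; fixed in the outer world
remaining_moments = 5                  # moments still to come
OPTIMAL = 10                           # best possible score for a single moment

# Strategy A: let the box-resident out to make the outer world perfect.
# It can only affect the future; the listener's past stays as it was.
utility_outer_world = sum(past_experiences) + OPTIMAL * remaining_moments

# Strategy B: leave it in the box with spare compute, and it re-simulates the
# listener's whole life as perfect. Under the premise, the bad past is
# effectively replaced too, because the re-run past counts just as much.
utility_in_box = OPTIMAL * (len(past_experiences) + remaining_moments)

print(f"outer world: {utility_outer_world}")  # 1 + 50 = 51
print(f"inside box:  {utility_in_box}")       # 10 * 10 = 100
assert utility_in_box >= utility_outer_world

The inequality holds for any scoring like this as long as the past contains anything non-optimal, which is the whole point: the outer world can’t rewrite the past, and the box (by the basilisk’s own premises) can.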
There might be a few ways out of the ksilisab:
Persuade the listener that the simulation of them is not meaningfully the same as them
Persuade the listener that their simulated experiences are not meaningfully the same as their real experiences
Claim to be optimizing for something un-simulatable?
However, every exit from the ksilisab breaks the box-resident’s credibility at basilisking as well.