Xia: It should be relatively easy to give AIXI(tl) evidence that its selected actions are useless when its motor is dead. If nothing else, AIXI(tl) should be able to learn that it’s bad to let its body be destroyed, because then its motor will be destroyed, which, experience tells it, causes its actions to have less impact on its reward inputs.
Rob B: [...] Even if we get AIXI(tl) to value continuing to affect the world, it’s not clear that it would preserve itself. It might well believe that it can continue to have a causal impact on our world (or on some afterlife world) by a different route after its body is destroyed. Perhaps it will be able to lift heavier objects telepathically, since its clumsy robot body is no longer getting in the way of its output sequence.
Compare human immortalists who think that partial brain damage impairs mental functioning, but complete brain damage allows the mind to escape to a better place. Humans don’t find it inconceivable that there’s a light at the end of the low-reward tunnel, and we have death in our hypothesis space!
I’d like to see this rebuttal spelled out in more detail. Let’s assume for the sake of argument that “we get AIXI(tl) to value continuing to affect the world”. Why would it then be so hard to convince AIXI(tl) that it will be better able to affect the world if no anvils fall on a certain head? (I mean, hard compared to any reasonably-hoped-for alternative to AIXI(tl)?)
If the AIXI(tl) robot has been kept from killing itself for long enough for it to observe the basics of how the world works, why wouldn’t AIXI(tl) have noticed that it is better able to affect the world when a certain brain and body are in good working order and free of obstructions? We can’t give AIXI(tl) the experience of dying, but can’t we give AIXI(tl) experiences supporting the hypothesis that damage to a particular body causes AIXI(tl) to be unable to affect the world as well as it would like?
I can see that AIXI(tl) would entertain hypotheses like, “Maybe dropping this anvil on this brain will make things better.” But AIXI(tl) would also entertain the contrary hypothesis, that dropping the anvil will make things worse, not because it will turn the perceptual stream into an unending sequence of NULLs, but rather because smashing the brain might make it harder for AIXI(tl) to steer the future.
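To make that weighing concrete, here is a toy sketch of how the two hypotheses might trade off in a single action choice. The hypothesis names, posterior weights, and reward numbers are all invented for illustration; real AIXI(tl) runs a full expectimax over its entire program mixture, not a two-row table like this.

```python
# Toy sketch (not AIXI itself; all numbers are made up): a one-step
# comparison of "drop the anvil" vs. "leave it" under two competing
# environment hypotheses, weighted by posterior mass.

hypotheses = {
    # name: (posterior weight, expected reward if anvil drops,
    #        expected reward if it does not)
    "smashing_the_brain_hurts_my_influence": (0.90, 1.0, 10.0),
    "smashing_the_brain_frees_me":           (0.10, 12.0, 10.0),
}

def expected_reward(drop_anvil: bool) -> float:
    """Posterior-weighted expected reward of the chosen action."""
    return sum(w * (r_drop if drop_anvil else r_keep)
               for w, r_drop, r_keep in hypotheses.values())

print(expected_reward(True))   # 0.9*1.0  + 0.1*12.0 = 2.1
print(expected_reward(False))  # 0.9*10.0 + 0.1*10.0 = 10.0
# As long as most posterior mass says "smashing hurts", the anvil loses.
```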
Humans did invent hypotheses like, “complete brain damage allows the mind to escape to a better place”, but there seems to be a strong case for the claim that humans are far more confident in such hypotheses than they should be, given the evidence. Shouldn’t a Solomonoff inductor do a much better job at weighing this evidence than humans do? Why wouldn’t AIXI(tl)’s enthusiasm for the “better place” hypothesis be outweighed by a fear of becoming a disembodied Cartesian spirit cut off from all influence over the only world that it cares about influencing?
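As a rough picture of why the evidence should dominate: under a complexity-weighted prior, the “better place” hypothesis starts out penalized for its extra machinery, and every observation of body damage reducing influence pushes it further down. A minimal sketch, assuming invented description lengths and per-observation likelihoods (this is not a real Solomonoff inductor):

```python
# Toy sketch of complexity-weighted Bayesian updating. The description
# lengths K and the likelihoods are assumptions made up for illustration:
# each observation of "body damage -> reduced influence over rewards" is
# taken to fit the mundane hypothesis better than the "freed mind" one.

K = {"damage_reduces_influence": 20,   # assumed shorter program
     "death_frees_the_mind":     30}   # assumed extra machinery
prior = {h: 2.0 ** -k for h, k in K.items()}
likelihood_per_obs = {"damage_reduces_influence": 0.9,
                      "death_frees_the_mind":     0.5}

def posterior(n_obs: int) -> dict:
    """Posterior after n_obs observations of damage reducing influence."""
    unnorm = {h: prior[h] * likelihood_per_obs[h] ** n_obs for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

print(posterior(0))   # the prior alone favors the simpler hypothesis ~1024:1
print(posterior(50))  # accumulated evidence drives the "freed mind" weight toward zero
```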