It has >99% probability of wireheading, and in >99% of the remaining outcomes it disassembles itself with its mining claws.
Wireheading is just what reinforcement learning agents are built to do, so it’s not actually a problem. Hurting its own hardware because its anticipation admits quantum suicide is partly the same problem as relying on explicit dependencies. It’s still hard to define reward, that is, to count the worlds that do or don’t include your instance (with given reward-observations), but this has to be solved for UDT-AIXI in any case, and the only way to solve it that I see (one that doesn’t privilege a particular physical implementation of the agent, but accepts it on any substrate within a world-program) again involves looking for ambient dependencies: check whether a dependence is present, and count a world-program only if it is.
So these problems are also automatically taken care of in UDT-AIXI, to the extent they are problematic.
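To make the filtering idea concrete, here is a minimal toy sketch (my own illustration, not anything stated in the comments above) of a UDT-style scoring rule that sums reward only over world-programs in which a dependence on the agent’s policy can be detected. Every name here (detect_dependence, udt_score, the toy world-programs, the reward table) is an assumption for illustration, and the brute-force “re-run with a different policy” check is a stand-in for whatever logical/ambient notion of dependence a real UDT-AIXI would need.

```python
# Toy sketch only: count a world-program toward expected value
# only if the agent's policy actually makes a difference to it.
# All names and the dependence test are illustrative assumptions.

def detect_dependence(world_program, policy_a, policy_b):
    """Crude dependence check: does swapping the policy change the outcome?

    A real account would need a logical notion of dependence rather than
    brute re-simulation; this only conveys the filtering idea.
    """
    return world_program(policy_a) != world_program(policy_b)


def udt_score(policy, alternative_policy, world_programs, reward):
    """Sum reward only over world-programs that depend on the policy."""
    total = 0.0
    for wp in world_programs:
        if detect_dependence(wp, policy, alternative_policy):
            total += reward(wp(policy))
    return total


if __name__ == "__main__":
    # Two toy world-programs: one depends on the policy, one ignores it.
    def wp_dependent(policy):
        return "cooperate" if policy("obs") == 1 else "defect"

    def wp_independent(policy):
        return "fixed outcome"  # no dependence on the agent at all

    reward = lambda outcome: {"cooperate": 1.0, "defect": 0.0}.get(outcome, 0.5)
    policy = lambda obs: 1
    alt_policy = lambda obs: 0

    # Only wp_dependent is counted; wp_independent is filtered out.
    print(udt_score(policy, alt_policy, [wp_dependent, wp_independent], reward))
```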
Wireheading is just what reinforcement learning agents are built to do, so it’s not actually a problem.
This comment led me to the following tangential train of thought: AIXI seems to capture the essence of reinforcement learning, but does not feel pain or pleasure. I do not feel morally compelled to help an AIXI-like agent (as opposed to a human) gain positive reinforcements and avoid negative reinforcements (unless it were part of some trade).
After writing the above, I found this old comment of yours, which seems closely related. But thinking about an AIXI-like agent that has only “wants” and no “likes”, I feel myself being pulled towards what you called the “naive view”. Do you have any further thoughts on this subject?