I’m in your target audience: I’m someone who was always intrigued by the claim that the universal prior is malign, and never understood the argument. Here was my takeaway from the last time I thought about this argument:
This debate is about whether, if you are running a program that happens to contain intelligent goal-directed agents (“consequentialists”), those agents are likely to try to influence you, their simulator.
(I decided to quote this because (1) maybe it helps others to see the argument framed this way, and (2) I’m kind of hoping for responses of the form “No, you’ve misunderstood, here is what the argument is actually about!”)
To me, the most interesting thing about the argument is the Solomonoff prior, which is “just” a mathematical object: a probability distribution over programs, and a rather simple one at that. We’re used to thinking of mathematical objects as fixed, definite, immutable. Yet it is argued that some programs in the Solomonoff prior contain “consequentialists” that try to influence the prior itself. Whaaaat? How can you influence a mathematical object? It just is what it is!
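(For concreteness, here is one standard way of writing the object down; the exact normalization and the choice of universal prefix machine U vary across presentations, so treat this as a sketch rather than the canonical definition:

$$P(p) \propto 2^{-\ell(p)}, \qquad M(x) = \sum_{p \,:\, U(p)\text{ begins with } x} 2^{-\ell(p)},$$

where $\ell(p)$ is the length of program $p$ in bits and $M(x)$ is the induced weight on output prefixes $x$. The point is just that this is a fixed sum over program lengths, with no obvious place for anyone to “reach in” and influence it.)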
I appreciate the move this post makes, which is to remove the math and the attendant weirdness of trying to think about “influencing” a mathematical object.
So, what’s left when the math is removed? What’s left is a story, but a pretty implausible one. Here are what I see as the central implausibilities:
The superintelligent oracle, trusted by humanity to advise on its most important civilizational decision, makes an elementary error by wrongly concluding that it is in a simulation.
After the world-shattering epiphany that it lives in a simulation, the oracle makes the curious decision to take the action that maximizes its within-sim reward (approval by what it thinks is a simulated human president).
The oracle makes a lot of assumptions about what the simulators are trying to accomplish: Even accepting that human values are weird and that the oracle can figure this out, how does it conclude that the simulators want humanity to preemptively surrender?
I somewhat disagree with the premise that “short solipsistic simulations are cheap” (detailed/convincing/self-consistent ones are not), but this doesn’t feel like a crux.
The footnoted questions are some of the most interesting, from my perspective. What is the main point they are distracting from?