The problem is that the AI doesn’t know the correct utility function a priori, and whatever process it uses to discover that function is going to be attacked by Mu.
I don’t understand the issue here. Mu can only interfere with the simulated AI’s process of utility-function discovery. If the AI follows the policy of “behave as if I’m outside the simulation”, then sure, the AIs simulated by Mu will recover tampered utility functions. But AIs instantiated in the non-simulated universe, who deliberately avoid thinking about Mu or who discount simulation hypotheses, should just safely recover the untampered utility function. Mu can’t acausally influence you unless you deliberately open a channel to it.
I think I’m missing some part of the picture here. Is it assumed that any process of utility-function discovery has to somehow route through (something like) the unfiltered universal prior? Or that uncertainty with regard to one’s utility function means you can’t rule out the simulation hypothesis out of the gate, because it might be that what you genuinely care about is the simulators?
The problem is that any useful prior must be based on Occam’s razor, and Occam’s razor + first-person POV creates the same problems as with the universal prior. And deliberately filtering out simulation hypotheses seems quite difficult, because it’s unclear how to specify it. See also this.
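To make the analogy concrete (a minimal sketch of the standard malign-prior worry, with the set $S$ introduced here purely for illustration): under a Solomonoff-style Occam prior, a hypothesis $h$, i.e. a program generating your first-person observations, gets weight proportional to $2^{-|h|}$, where $|h|$ is its description length. A hypothesis of the form “a simple universe evolves a simulator (Mu) that feeds you exactly these observations” can be about as short as the intended “physics plus a pointer to your observations” hypothesis, so it retains non-negligible posterior mass. Filtering would mean conditioning on $h \notin S$ for some set $S$ of simulation hypotheses:

$$
P(h \mid \text{obs}) \;\propto\; 2^{-|h|} \,\mathbb{1}[h \text{ predicts obs}] \,\mathbb{1}[h \notin S],
$$

and the difficulty is that we have no formal definition of membership in $S$.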
deliberately filtering out simulation hypotheses seems quite difficult, because it’s unclear how to specify it
Aha, that’s the difficulty I was overlooking. Specifically, I didn’t consider that the approach under consideration here requires us to formally define how we’re filtering them out.
Thanks!
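To make the formalization gap concrete, here is a toy sketch (purely illustrative; the names `occam_weight`, `is_simulation_hypothesis`, and `filtered_prior` are hypothetical, not from this thread): an Occam-weighted prior over an enumerated hypothesis space, where the whole filtering proposal collapses into a single predicate nobody knows how to write.

```python
from typing import Iterable

def occam_weight(description: str) -> float:
    """Toy Occam's razor: weight 2^-(description length), standing in
    for the 2^-|p| weighting of programs in a Solomonoff-style prior."""
    return 2.0 ** -len(description)

def is_simulation_hypothesis(description: str) -> bool:
    """The entire difficulty lives here: a formal, non-gameable test for
    'this hypothesis routes my observations through simulators'.
    Nobody knows how to specify this predicate, hence the stub."""
    raise NotImplementedError("unclear how to specify")

def filtered_prior(hypotheses: Iterable[str]) -> dict[str, float]:
    """Renormalized Occam prior over the hypotheses that survive the
    filter -- well-defined only to the extent the predicate above is."""
    kept = {h: occam_weight(h) for h in hypotheses
            if not is_simulation_hypothesis(h)}
    total = sum(kept.values())
    return {h: w / total for h, w in kept.items()}
```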