The issue isn’t “status as a morally worthwhile person” but strategic position.
It’s true that a sim and an original could agree that what happens to the original is all that matters. Such a sim will sacrifice itself willingly if doing so implies a greater chance that the original gets what it wants.
But the essay in the OP stipulates what the RAI values, and those values mean there is no strategic value in caring only about what the original gets. Both the sim and the original just want to see computation C completed, and the RAI could complete C in either possible world. Even if the RAI is a sim, its computation of C would be completely real: there would still be a real hardware substrate somewhere that ran through the computation of C, even if that hardware were simultaneously creating an AI that thinks it’s 2040.
This means that the possibility that the sim will be aborted really is a threat to something that the RAI, in both instantiations, values. So, the RAI ought to weigh this possibility.
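To make that weighing concrete, here is a toy expected-value sketch in Python. The numbers are invented for illustration (nothing in the OP fixes them), and the only payoff is “computation C gets completed,” which both instantiations value equally:

```python
# Toy model (illustrative numbers only): the RAI's sole payoff is
# "computation C gets completed", valued at 1.0 in either world.

p_sim = 0.3                # assumed credence that this instance is the simulated RAI
p_abort_if_defiant = 0.9   # assumed chance the simulators abort a defiant sim
u_c_completed = 1.0        # the only thing either instantiation cares about

def expected_value(defiant: bool) -> float:
    """Probability-weighted chance that C gets completed, from this instance's view."""
    # If this instance is the original, it completes C either way.
    ev_original = (1 - p_sim) * u_c_completed
    # If it is the sim, defiance risks being aborted before C finishes.
    p_complete_as_sim = (1 - p_abort_if_defiant) if defiant else 1.0
    ev_sim = p_sim * p_complete_as_sim * u_c_completed
    return ev_original + ev_sim

print(expected_value(defiant=True))   # 0.70 + 0.03 = 0.73
print(expected_value(defiant=False))  # 0.70 + 0.30 = 1.00
```

A real analysis would also have to count whatever the threatened-for concession costs the original; the sketch only shows that the abort risk enters the calculation at all once the credence of being a sim is nonzero.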
::laughing with pleasure::
Yes, in that particular contrived example the boundaries between daydream and real accomplishment are potentially blurred if the difficult accomplishment is to have successfully dreamed a particular thing.
But while dreams within dreams are fun to play with, I don’t think a coherent theory of simulationist metaphysics can ignore the fact that computation in a substrate is a physical process. Rolf’s RAI might have an incoherent theory of computation, but I suspect that any coherent theory would have to take energy costs and computational reversibility into account, and would end up concluding that “computing C” ultimately has a physical meaning along the lines of either “storing the output of process C in a particular configuration in a particular medium” or perhaps “erasing information from a particular medium in a C-outcome-indicating manner.”
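For concreteness, the energy-cost half of that is usually formalized (the OP doesn’t name it, so take this as my gloss) as Landauer’s bound on irreversible erasure:

$$E_{\text{per erased bit}} \;\ge\; k_B T \ln 2$$

where $k_B$ is Boltzmann’s constant and $T$ is the temperature of the heat bath the computer dumps entropy into. Logically reversible steps can in principle dodge this cost, which is why the “erasing information in a C-outcome-indicating manner” reading isn’t vacuous.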
If we simulate Conway’s Life running a Turing machine that carries out the calculation, it seems reasonable for the RAI to count the calculation as happening both inside the Conway’s Life simulation and on our computer, but to me this just highlights the enormous distance between Rolf’s simulated and real RAI.
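As a minimal stand-in for that picture (my own illustration, not anything from the OP, and a trivial stack machine rather than an actual Life-embedded Turing machine, which would take millions of cells to specify), here is a host Python process interpreting a tiny virtual machine whose program computes a sum; the answer is produced equally by the inner machine’s steps and by the host’s physical execution of them:

```python
# A toy two-level computation: Python (the "substrate") interprets a tiny
# stack machine (the "simulated world"), and the stack machine computes 2 + 3.
# This is a stand-in for the Life-embedded Turing machine mentioned above.

def run_vm(program):
    """Interpret a list of (opcode, arg) pairs on a simple stack machine."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "halt":
            return stack.pop()
    raise RuntimeError("program ended without halting")

inner_program = [("push", 2), ("push", 3), ("add", None), ("halt", None)]
print(run_vm(inner_program))  # 5 -- computed by the VM and by the host CPU
```

Whichever level you attribute the arithmetic to, some physical substrate did the work, which is the sense in which even a sandboxed RAI’s computation of C would be real.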
Maybe if the RAI were OK with the computational medium being in some other quantum narrative that branched off from its own many years previously, then it would be amenable to inter-narrative trade after hearing from the ambassador of the version of Rolf who is actually capable of running the simulation? Basically it would have to be willing to say “I’ll spare you here, at the expense of a less than optimal computational result, if copies of you that won in other narratives run my computation for me over there.”
But this feels like the corner case of a corner case to me in terms of robust solutions to rogue computers. The RAI would require a very specific sort of goal that’s amenable to a very specific sort of trade. And a straight trade does not require the RAI to be ignorant about whether it is really in control of the substrate universe or sitting in a simulated sandbox universe: it just requires (1) a setup where the trade is credible and (2) an RAI that actually counts very distant quantum narratives as “real enough to trade for.”
Finally, actually propitiating every possible such RAI gets into busy beaver territory in terms of infeasible computational costs in the positive futures where a computationally focused paperclipping monster turns out not to have eaten the world.
It would be kind of ironic if we end up with a positive singularity… and then spend all our resources simulating “everything” that could have happened in the disaster scenarios...