If I ran the zoo
(I was musing about what it means for an incoherent lump of meat to “have preferences,” and thought it might be illuminating to consider what I’d do if I were, approximately, God. It, uh, hasn’t been illuminating yet, but it’s been entertaining and still seems at least potentially fruitful.)
Problem statement
You suddenly become omnipotent! Except, you can only do things that you understand in sufficient detail that you could accomplish them by micromanaging all the atoms involved. And, what the heck, assume you have effortless access to infinite computational power.
What do you do?
For concreteness, here are some interventions you might try:
“Pull out this chair”: allowed.
“Create an atomically-perfect duplicate of this strawberry”: allowed.
“Create friendly AI”: not allowed.
“Cure cancer”: not allowed. (“Prevent all mutations in all human DNA”: allowed. “Find all the cells in this person’s body containing this genetic marker and mangle all their mitochondria”: allowed. No need to worry about accidentally mis-specifying the marker and killing all their cells; if you were micromanaging the process, you’d notice that you were catching way too many cells.)
“Create a digital copy of everybody on Earth every nanosecond, in my infinite computer, so I can roll back anything that goes wrong, like people dying of diseases I don’t know how to fix yet”: allowed. (Sketched just after this list.)
“Bioengineer tiny pet dragons”: not allowed.
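That last allowed item is essentially a checkpoint-and-rollback scheme, so here is a minimal sketch of the bookkeeping it implies. Everything in it is invented for illustration: `snapshot_world` and `restore_world` stand in for the omnipotence primitives, and the infinite computer is modeled as a log that never runs out of room.

```python
from collections import deque


class WorldCheckpointer:
    """Toy checkpoint-and-rollback loop.

    `snapshot_world` and `restore_world` are hypothetical stand-ins for the
    thought experiment's omnipotence primitives, not real APIs.
    """

    def __init__(self, snapshot_world, restore_world):
        self.snapshot_world = snapshot_world
        self.restore_world = restore_world
        self.history = deque()  # (tick, snapshot) pairs; infinite storage assumed

    def tick(self, t):
        # Record a full copy of everybody before anything can go wrong this tick.
        self.history.append((t, self.snapshot_world()))

    def rollback_to(self, t):
        # Restore the latest snapshot taken at or before tick t (e.g. just
        # before someone died of a disease I don't yet know how to fix).
        for tick, snap in reversed(self.history):
            if tick <= t:
                self.restore_world(snap)
                return tick
        raise ValueError("no snapshot at or before that tick")
```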
This being LessWrong, you’ll probably quickly hit on some way to use ten billion sped-up simulated geniuses to speedrun AI alignment, build a friendly superintelligence, and delegate your Godlike power to it. But the purpose of this thought experiment is to elucidate your preferences, and that strategy (though very reasonable!) dodges the question.
What I’d do
Object level
Just, like, the obvious. Slay Famine, Pestilence, and War. Stop accidents from happening. Scrap the solar system for parts and give everybody ultra-customizable space habitats connected by teleportation booths. (All this can be micromanaged by zillions of zillions of simulated clones of me.)
Let people opt out, obviously, in whole or in part.
There are still, to be clear, important wishes I can’t grant, such as “make me smarter” or “make my memory not degrade as I age” or “help me and my partner solve this relationship snarl.”
Meta level
Retain power:
Pause AI for fear it’ll create something that could hack me.
Smite anybody visibly plotting to manipulate me. (A very careful opponent could surely still play me at least a little bit, but “never give any significant external sign of your plotting” is a tight restriction.)
Don’t look too far into the stars, out of paranoia that I might see something that hacks me. Probably best to just freeze my entire lightcone outside the solar system.
Avoid value drift:
Ensure that none of my omnipresent simulated clones live for more than a few subjective days. Run them in a tree structure at different speeds, handing off summaries to their successors while monitored by their superiors, to allow important information to propagate forward in time even though each individual clone is short-lived.
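A minimal sketch of that clone tree's bookkeeping, with every name in it (`Clone`, `SUBJECTIVE_LIFESPAN_DAYS`, the summary-passing scheme) invented for illustration rather than taken from anywhere:

```python
from dataclasses import dataclass, field
from typing import Optional

SUBJECTIVE_LIFESPAN_DAYS = 3  # hard cap: no clone runs more than a few subjective days


@dataclass
class Clone:
    speedup: float                      # subjective days per real-world day
    supervisor: Optional["Clone"] = None
    days_lived: float = 0.0
    notes: list = field(default_factory=list)  # summaries to hand forward

    def work(self, subjective_days, summary):
        # Do some work, log a summary, and let the slower supervisor monitor it.
        self.days_lived += subjective_days
        self.notes.append(summary)
        if self.supervisor is not None:
            self.supervisor.notes.append(f"[from child] {summary}")

    def expired(self):
        return self.days_lived >= SUBJECTIVE_LIFESPAN_DAYS

    def hand_off(self):
        # The successor inherits the written summary, not the full subjective
        # history, so drift accumulated by any one instance dies with it.
        successor = Clone(speedup=self.speedup, supervisor=self.supervisor)
        successor.notes.append("; ".join(str(n) for n in self.notes))
        return successor
```

The design point is just that information survives via the handed-off summaries, while whatever drift accumulates in any individual instance is bounded by the lifespan cap.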
Get smarter (without (much) value drift), so I can grant more wishes better:
???
(This, I think, is the important part!)