My notes for the “think for yourself” sections. I came up with some of the author’s ideas independently and added a few extras.
#Making a deal with an AI you understand:
Can you see the deal you are making represented inside its mind, for example as some proportion of resources that humans get?
Which actions count as the AI violating the deal? Specifying these actions is pretty much as difficult as friendly AI itself.
If the deal breaks in certain circumstances, how likely are those circumstances to occur (or to be deliberately targeted)?
Can the AI give you what you think you want but that isn’t really what you want?
Are successors similarly bound?
If there is a second AI, how will they interact? If the other is unfriendly, then our TDT “friend” may sacrifice our interests first since we are still “better off than otherwise.” If the other is friendly, then the TDT AI will be fighting to make humans worse off.
Would the AI kill or severely damage the interests of any aliens it finds because it never needed to deal with them? Similarly, would the TDT AI work to (minimally) satisfy its creator at the expense of other humans?
#How an AI can tell if it is in the real world:
The history of how the AI came to exist holds up (no such story exists in Go or Minecraft).
Really big primes are available. Way more computing power in general.
Bugs of the kind that could be found in lower levels don’t exist.
Hack the minds of the simulators like butter