Your idea is a little similar to the one-action AI that I described a while ago. It is a neat way to get goal stability (as long as you trust your magical math intuition module), but it doesn't seem to solve all of decision theory, even if you know the correct theory of physics at the outset.
If humanity eventually builds an AI based on your proposed decision theory, an alien AI built a million years ago may have inferred that fact and precommitted to wage self-destructive war against us unless we surrender 99% of our resources. Your AI will infer that fact at startup and give up the resources immediately. This predictable reaction of your AI caused the alien AI to make its precommitment in the first place. TDT tries to fix this by not yielding to extortion, though we don’t know how to formalize that.
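Here's a toy payoff sketch of why a yield-when-threatened policy invites the precommitment in the first place. The 99% demand is from the scenario above; the war payoff of -50 each and the perfect-predictor assumption are made up for illustration, and none of this is a formalization of TDT:

```python
# Toy payoff sketch of the extortion scenario. The numbers and the
# perfect-predictor assumption are illustrative only.

RESOURCES = 100           # what our AI starts with
DEMAND = 99               # the alien AI demands 99% of our resources
WAR = (-50, -50)          # self-destructive war hurts both sides

def outcome(our_policy, predicted_policy):
    """(our payoff, alien payoff) given our actual policy and the policy
    the alien AI predicted a million years ago when deciding whether to
    precommit to the threat."""
    alien_precommits = (predicted_policy == "yield")   # threaten only if it expects to profit
    if not alien_precommits:
        return (RESOURCES, 0)                          # no threat is ever made
    if our_policy == "yield":
        return (RESOURCES - DEMAND, DEMAND)            # we hand over 99%
    return WAR                                         # threat made and refused

# With a perfect predictor, predicted_policy == our_policy:
print(outcome("yield", "yield"))     # (1, 99)  -> extortion pays, so the threat is made
print(outcome("refuse", "refuse"))   # (100, 0) -> the precommitment is never worth making
```

The point is that the refuser never even faces the threat, which is why "never yield to extortion" looks attractive despite the disastrous war branch; the hard part is saying precisely what counts as extortion.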
If our universe contains other copies of us at different places and times (which is plausible if it's spatially infinite), something weird can happen when copies of your AI interact with each other. UDT tries to fix that by accounting for copies "implicitly", so they all end up cooperating.
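As a rough illustration with standard Prisoner's Dilemma payoffs: the "UDT-style" choice below just hard-codes the observation that an exact copy must output the same action, nothing more, so this is a sketch rather than a real implementation of UDT.

```python
# Why copies of the same decision procedure end up cooperating in a
# Prisoner's Dilemma. Payoffs are the usual toy numbers.

PAYOFF = {   # (my action, copy's action) -> my payoff
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}

def cdt_style():
    # Treats the copy's action as fixed and independent: defecting gives
    # more no matter what the copy does (5 > 3 and 1 > 0), so defect.
    return "D" if all(PAYOFF[("D", b)] > PAYOFF[("C", b)] for b in "CD") else "C"

def udt_style():
    # Knows the copy runs the same code, so only the diagonal outcomes are
    # reachable; pick the action with the best diagonal payoff.
    return max("CD", key=lambda a: PAYOFF[(a, a)])

print(cdt_style())  # 'D' -> both copies defect and get 1 each
print(udt_style())  # 'C' -> both copies cooperate and get 3 each
```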