Okay, thanks for confirming that Newcomb’s problem is a relevant motivating example here.
“I don’t know how to solve this problem, nor do I understand the exact structure of the calculation my computer program will perform in the course of solving this problem, nor can I state a mathematically precise meta-question, but I’m going to rely on the AI solving it for me ’cause it’s supposed to be super-smart,”
I’m not saying that. I’m saying that self-modification solves the problem, assuming the CDT agent moves first, and that it seems simple enough that we can check that a not-very-smart AI solves it correctly on toy examples. If I get around to attempting that, I’ll post to LessWrong.
Assuming the CDT agent moves first seems reasonable. I have no clue whether or when Omega is going to show up, so I feel no need to second-guess the AI about that schedule.
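To make that concrete, here is the kind of toy check I have in mind. It is only a sketch under my own modelling assumptions (Omega predicts by inspecting the agent’s committed policy, and “self-modification” just means committing to a different policy before Omega looks); none of the names come from anywhere but this comment:

    # Toy Newcomb check: a CDT agent that may rewrite its policy *before*
    # Omega predicts does better than one that stays a two-boxer.

    def omega_fill_boxes(policy):
        """Omega predicts by inspecting the agent's committed policy."""
        opaque = 1_000_000 if policy == "one-box" else 0
        transparent = 1_000
        return opaque, transparent

    def payoff(policy, opaque, transparent):
        return opaque if policy == "one-box" else opaque + transparent

    def run(initial_policy, self_modify_first):
        policy = initial_policy
        if self_modify_first:
            # The CDT agent's causal calculation: moving before Omega, the policy
            # it commits to now causally determines Omega's prediction, so it
            # rewrites itself to whichever policy pays more.
            policy = max(["one-box", "two-box"],
                         key=lambda p: payoff(p, *omega_fill_boxes(p)))
        return payoff(policy, *omega_fill_boxes(policy))

    print(run("two-box", self_modify_first=False))  # 1000: stock CDT keeps two-boxing
    print(run("two-box", self_modify_first=True))   # 1000000: it self-modifies to one-box

Checking that a not-very-smart AI reproduces the second result rather than the first is the sort of toy test I mean.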
(Quoting out of order)
This is not only ugly...
As you know, we can define a causal decision theory agent in one line of math. I don’t know a way to do that for TDT. Do you? If TDT could be concisely described, I’d agree that it’s the less ugly alternative.
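For reference, the one line I have in mind is just causal expected-utility maximization; written with Pearl’s do-operator it is roughly (the exact notation varies by author):

    a^* = \operatorname*{argmax}_{A \in \mathcal{A}} \sum_{O \in \mathcal{O}} U(O)\, P\big(O \mid \mathrm{do}(A)\big)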
but also has worse implications for e.g. meeting an alien AI who wants to cooperate with you, or worse, an alien AI that is trying to blackmail you.
I’m failing to suspend disbelief here. Do you have motivating examples for TDT that seem likely to happen before Kurzweil’s schedule for the Singularity causes us to either win or lose the game?
As you know, we can define a causal decision theory agent in one line of math.
If you appreciate simplicity/elegance, I suggest looking into UDT. UDT says that when you’re making a choice, you’re deciding the output of a particular computation, and the consequences of any given choice are just the logical consequences of that computation having that output.
CDT, in contrast, doesn’t answer the question “what am I actually deciding when I make a decision?”, nor does it answer “what are the consequences of any particular choice?”, even in principle. CDT can be described in one line of math only because the answer to the latter question has to be supplied to it as an external parameter.
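Schematically (and glossing over the hard part, which is how those logical-consequence probabilities get computed), UDT’s rule is something like:

    a^* = \operatorname*{argmax}_{a} \; \mathbb{E}\big[\, U \;\big|\; \text{“this computation outputs } a\text{”} \,\big]

where the expectation is taken over the agent’s prior on world-programs and the conditioning is logical rather than causal.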
but also has worse implications for e.g. meeting an alien AI who wants to cooperate with you, or worse, an alien AI that is trying to blackmail you.
I’m failing to suspend disbelief here. Do you have motivating examples for TDT that seem likely to happen before Kurzweil’s schedule for the Singularity causes us to either win or lose the game?
I’m reasonably sure Eliezer meant implications for the would-be friendly AI meeting alien AIs. That could happen at any time in the remaining life span of the universe.
Thanks, I’ll have a look at UDT.
I certainly agree there.
Maybe this one: “Argmax[A in Actions] in Sum[O in Outcomes] (Utility(O) * P(this computation yields A []-> O | rest of universe))”
From this post.
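Reading []-> as the counterfactual conditional, that formula in more conventional notation is roughly:

    \operatorname*{argmax}_{A \in \mathrm{Actions}} \; \sum_{O \in \mathrm{Outcomes}} U(O) \cdot P\big(\text{this computation yields } A \;\Box\!\!\rightarrow\; O \;\big|\; \text{rest of universe}\big)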