I think the expected gain from pursuing FAI is less than that of pursuing other methods. Other methods are less likely to work, but more likely to be implementable.
I assume that by “implementable” you mean that it’s an actionable project which might nevertheless fail to “work”, i.e. to deliver the intended result. I don’t see how “implementability” is a relevant characteristic. What matters is whether something works, i.e. succeeds. If you think that other methods are less likely to work, how are they of greater expected value? I probably parsed some of your terms incorrectly.
Whether the project reached the desired goal, versus whether that goal will actually work. If Nick and Eliezer both agreed on some design, “this is how you build an FAI”, then I expect it would work. However, I don’t think it’s likely that would happen. It’s more likely they will say “this is how you build a proper Oracle AI”, but less likely that the Oracle will end up being safe.
Whether the project reached the desired goal, versus whether that goal will actually work.
Okay, but I still don’t understand how a project with lower probability of “actually working” can be of higher expected value. I’m referring to this statement:
I think the expected gain from pursuing FAI is less than that of pursuing other methods. Other methods are less likely to work...
The argument you seem to be giving in support of the higher expected value of other methods is that they are “more likely to be implementable” (a project reaching its stated goal, even if that goal turns out to be no good), but I don’t see how that is an interesting property.
He didn’t say other architectures would be no good; he said they’re less likely to be safe.
He thinks the distribution P(Outcome | do(complete Oracle AI project)) isn’t as highly peaked at Weirdtopia as P(Outcome | do(complete FAI)); Oracle AI puts more weight on regions like “Lifeless universe”, “Eternal Torture”, “Rainbows and Slow Death”, and “Failed Utopia”.
However, “complete FAI” isn’t an actionable procedure, so he examines the chance of completion conditional on the different actions he can take. “Not worth pursuing because non-implementable” means that the available FAI-supporting actions don’t have a reasonable chance of producing Friendly AI, which discounts the peak in the conditional outcome distribution at valuable futures relative to do(complete FAI). And supposedly he has some other available Oracle-AI-supporting strategy which fares better.
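To make that concrete, here is a minimal numeric sketch (in Python, with completion probabilities, outcome distributions, and outcome values invented purely for illustration) of how a strategy with a worse conditional outcome distribution can still have higher expected value when its completion probability is much higher:

```python
# E[value | strategy] = P(complete | strategy)
#                       * sum over outcomes of P(outcome | complete) * value(outcome)
# All numbers below are made up for illustration only.

OUTCOME_VALUES = {
    "valuable future": 1.0,
    "failed utopia": 0.2,
    "lifeless universe": 0.0,
}

def expected_value(p_complete, outcome_dist):
    """Chance the project finishes, times the expected value of the outcome
    distribution conditional on finishing (the value of not finishing is
    taken to be 0 for simplicity)."""
    conditional = sum(p * OUTCOME_VALUES[name] for name, p in outcome_dist.items())
    return p_complete * conditional

# FAI: conditional outcomes sharply peaked at good futures, but completion is very unlikely.
fai = expected_value(
    p_complete=0.01,
    outcome_dist={"valuable future": 0.90, "failed utopia": 0.05, "lifeless universe": 0.05},
)

# Oracle AI: worse conditional outcomes, but far more likely to be completed.
oracle = expected_value(
    p_complete=0.30,
    outcome_dist={"valuable future": 0.50, "failed utopia": 0.20, "lifeless universe": 0.30},
)

print(f"E[value | pursue FAI]       = {fai:.4f}")     # 0.0091
print(f"E[value | pursue Oracle AI] = {oracle:.4f}")  # 0.1620
```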
Eating a sandwich isn’t as cool as building an interstellar society with wormholes for transportation, but I’m still going to make a sandwich for lunch, because it’s going to work and maybe be okay-ish.