Thanks for the detailed response! I do think the framework can still work with my assumptions. The way I would model it would be something like:
In the first stage, we have G->F_remaining (the research required to reach an AGI->FAI solution) and G_remaining (the research required to reach AGI capabilities sufficient for UFAI). I expect G->F_remaining < G_remaining, and a relatively low leakage ratio.
After we have the AGI->FAI solution, we have F_remaining (the AGI research needed as input to the AGI->FAI solution) and G_remaining (the research required to reach AGI capabilities sufficient for UFAI). I expect F_remaining > G_remaining, and furthermore I expect the leakage ratio to be high enough that we are practically guaranteed to have AGI capabilities sufficient for UFAI before FAI (though I don't know how long before). Hence the strategic importance of developing AGI capabilities in secret, and not having them lying around for too long in too many hands. I don't really see a way of avoiding this: the alternative is to have enough research to create an FAI but not a paperclip maximizer, which seems implausible (though it would be really nice if we could get this state!).
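To make that two-stage picture a little more concrete, here is a toy sketch of the race as I picture it. To be clear, every name and number in it is a placeholder of my own (the rates, thresholds, and leakage fractions are invented for illustration, not estimates from the paper); the only point is the structure: friendly-directed work leaks into public AGI capability, and whichever threshold is crossed first decides the outcome.

```python
# Toy sketch of the two-stage race described above. All names and numbers
# are placeholders invented for illustration; the dynamics are deliberately
# oversimplified.

def simulate(g_to_f_remaining=30.0,   # research left until an AGI->FAI solution exists
             f_remaining=100.0,       # AGI research needed as input to that solution
             g_remaining=60.0,        # AGI research sufficient for a UFAI
             safe_rate=1.0,           # research per step by the safety-focused team
             outside_rate=0.8,        # independent AGI research per step by everyone else
             leak_stage1=0.1,         # fraction of stage-1 work leaking into public AGI progress
             leak_stage2=0.7):        # fraction of stage-2 work leaking (much higher)
    """Run the race until either the UFAI or the FAI threshold is crossed."""
    f_progress = 0.0   # progress toward the friendly side (stage 1, then stage 2)
    g_progress = 0.0   # publicly available AGI capability
    stage = 1
    step = 0
    while True:
        step += 1
        # The safety team works on its current stage; a fraction of that work leaks.
        leak = leak_stage1 if stage == 1 else leak_stage2
        f_progress += safe_rate
        g_progress += outside_rate + leak * safe_rate

        # Finishing stage 1 means the AGI->FAI solution exists; stage 2 then
        # counts progress toward F_remaining from zero.
        if stage == 1 and f_progress >= g_to_f_remaining:
            stage = 2
            f_progress = 0.0

        if g_progress >= g_remaining:
            return ("UFAI threshold reached first", step)
        if stage == 2 and f_progress >= f_remaining:
            return ("FAI completed first", step)

print(simulate())
```

With these made-up numbers, stage 1 finishes well before G_remaining is reached, but the high stage-2 leakage means the UFAI threshold is crossed long before F_remaining, which is exactly the situation I am worried about.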
Also, it seems I had misinterpreted the part about r_g and r_f, sorry about that!
I guess the most controversial, and hopefully false, assumption of this paper is #3:
‘If G_remaining is reached before F_remaining, a UFAI will be created. If after, an FAI will be created.’
This basically is the AI Foom scenario, where the moment an AGI is created, it will either kill us all or bring about utopia (or both).
If this is not the case, and we have a long time to work with the AGI as it develops to make sure it is friendly, then this model isn’t very useful.
If we do accept these assumptions, I would also expect that we will reach G_remaining before F_remaining, or at least that a private organization will end up doing so. However, I am also very skeptical of the power of secrets. I think I find it more likely that we reach F_remaining first than that a private institution reaches G_remaining first but hides it until it later reaches F_remaining, though both probabilities may be very slim. If the US military or a similar group with a huge technological and secrecy advantage were doing this, there could be more of a chance. This definitely seems like a game of optimizing small probabilities.
Either way, I think we would definitely agree here that the organization developing these secrets can strategically choose projects that deliver high amounts of FAI research relative to the amount of AGI research they will have to keep secret. Begin with the easy, non-secretive wins and work from there.
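For what it's worth, here is the trivial version of that prioritization rule, just to pin down what I mean by starting with the easy, non-secretive wins. The project list and numbers are entirely made up on my end; the only content is the ordering criterion (FAI research delivered per unit of AGI research that would have to be kept secret).

```python
# Purely illustrative: order candidate projects by how much FAI-relevant
# progress they buy per unit of AGI capability that must stay secret.
# The projects and numbers below are invented placeholders.
projects = [
    # (name, FAI research delivered, secret AGI research required)
    ("formal goal-stability theory", 5.0, 0.5),
    ("value-learning benchmarks",    3.0, 1.0),
    ("self-improving planner",       8.0, 6.0),
]

# "Easy, non-secretive wins" come first: the highest FAI-per-secret-AGI ratio.
for name, fai, secret in sorted(projects, key=lambda p: p[2] / p[1]):
    print(f"{name}: {fai / secret:.1f} units of FAI research per unit kept secret")
```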
We may need the specific technology to create a paperclip maximizer before we make an FAI, but if we plan correctly, we hopefully will be really close to reaching an FAI by that point.
‘This basically is the AI Foom scenario, where the moment an AGI is created, it will either kill us all or bring about utopia (or both).’
The question is not “if”. The questions are “how quickly” and “to what height”. An AI capable of self-improving to world-destroying levels within moments is plainly unrealistic. An AI capable of self-improving to dangerous levels (viz: levels where it can start making humans do the dangerous work for it) in the weeks, months, or even years it would take a team of human operators to cross-examine the formally unspecified motivation engines for Friendliness is dangerously realistic.