[Edited: replaced Gremaining with Fremaining, which is what I originally meant]
Thanks for the comment, jessicat! I haven’t read those posts yet, but I will do more research on reducing FAI to an AGI problem.
A few responses & clarifications:
Our framework assumes the FAI research would happen before AGI creation. If we can research how to reduce FAI to an AGI problem in a way that would reliably make a future AGI friendly, then that amount of research would be our variable Fremaining. If that is quite easy to do, then that’s fantastic; an AI venture would have an easy time, and the leakage ratio would be low enough not to worry about. Any additional required capabilities we later find we need would be added to Fremaining.
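(As a rough way to make “low enough not to worry about” concrete, under the simplifying assumption that the venture’s own leaked research is the only AGI progress that matters: if each unit of progress toward Fremaining leaks L units of progress toward Gremaining, then finishing the FAI work leaks L * Fremaining units of AGI research in total. The leaked research alone closes the AGI gap before the FAI gap only when L * Fremaining >= Gremaining, so the leakage ratio is “safe” in this narrow sense roughly when L < Gremaining / Fremaining. This ignores outside AGI research entirely and is purely illustrative.)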
“I think the post fails to accurately model these difficulties.”
→ This post doesn’t attempt to model the individual challenges to understand how large Fremaining actually is. That’s probably a more important question than the one we addressed, but it’s one for a different model.
“The right answer here is to get AGI researchers to develop (and not publish anything about) enough AGI capabilities for FAI without running a UFAI in the meantime, even though the capabilities to run it exist.”
→ This paper definitely advocates that AGI researchers pursue FAI research while not publishing much AGI research. I agree that some internal AGI research will probably be necessary, but I hope it won’t be a whole lot. If the tools to create an AGI were figured out, even if they were kept secret by an FAI research group, I would be very scared. Those would be the most important and dangerous secrets of all time, and I doubt they could be kept secret for very long (20 years at most?).
“In this case, the model in the post seems to be mostly accurate, except that it neglects the fact that serial advances might be important (so we get diminishing marginal progress towards FAI or AGI per additional researcher in a given year).”
→ This paper purposefully didn’t model research effort, but rather abstract units of research significance.
“the numbers of rg and rf don’t perfectly correlate with the difficulty to reach them. It may be that we have diminishing marginal returns with our current levels of rg, so similar levels of rf will be easier to reach.”
→ A model that also took into account the effort required would need a few more assumptions and added complexity. I prefer to start simple and work from there, so we at least know what people do agree on before adding that complexity.
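For concreteness, one simple (purely illustrative) way such a model could look: suppose cumulative research significance grows like the square root of cumulative effort, r = sqrt(E), so E = r^2. Then moving rg from 10 to 11 costs 121 - 100 = 21 units of effort, while moving rf from 2 to 3 costs only 9 - 4 = 5. That captures the point that similar-sized steps in rf may be much easier to reach than the steps already taken in rg, but it adds exactly the kind of extra assumption I’d rather leave out of the first version.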
Thanks for the detailed response! I do think the framework can still work with my assumptions. The way I would model it would be something like:
In the first stage, we have G->Fremaining (the research to an AGI->FAI solution) and Gremaining (the research to enough AGI for UFAI). I expect G->Fremaining < Gremaining, and a relatively low leakage ratio.
After we have AGI->FAI, we have Fremaining (the research needed for the AGI to feed into the AGI->FAI solution) and Gremaining (the research to enough AGI for UFAI). I expect Fremaining > Gremaining, and furthermore I expect the leakage ratio to be high enough that we are practically guaranteed to have enough AGI capabilities for UFAI before FAI (though I don’t know how long before). Hence the strategic importance of developing AGI capabilities in secret, and not having them lying around for too long in too many hands. I don’t really see a way of avoiding this: the alternative is to have enough research to create an FAI but not a paperclip maximizer, which seems implausible (though it would be really nice if we could get to this state!).
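To make that two-stage picture concrete, here is a rough sketch of the race with made-up yearly rates and leakage ratios (none of these numbers are meant seriously; they just show how the two thresholds interact):

```python
# Rough sketch of the two-stage race described above.
# All rates and thresholds are made-up, illustrative numbers.

def race(f_remaining, g_remaining, fai_rate, outside_agi_rate, leakage_ratio, max_years=200):
    """Return which threshold is crossed first under constant yearly rates."""
    f_done = 0.0  # progress toward the friendly-side threshold
    g_done = 0.0  # progress toward enough AGI for UFAI
    for year in range(1, max_years + 1):
        f_done += fai_rate
        # AGI-relevant research accumulates from outside work plus leakage from the FAI effort.
        g_done += outside_agi_rate + leakage_ratio * fai_rate
        if f_done >= f_remaining:
            return f"friendly-side threshold first (year {year})"
        if g_done >= g_remaining:
            return f"UFAI-capable threshold first (year {year})"
    return "neither threshold reached"

# Stage 1: working out the AGI->FAI reduction (G->Fremaining), with low leakage.
print(race(f_remaining=30, g_remaining=100, fai_rate=2,
           outside_agi_rate=1, leakage_ratio=0.1))
# Stage 2: building the AGI to feed into AGI->FAI, with high leakage.
print(race(f_remaining=120, g_remaining=100, fai_rate=2,
           outside_agi_rate=1, leakage_ratio=0.8))
```

Under these made-up numbers the first stage resolves safely and the second doesn’t, which is just the qualitative claim above restated.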
Also, it seems I had misinterpreted the part about rg and rf, sorry about that!
I guess the most controversial, and hopefully false, assumption of this paper is #3:
‘If Gremaining is reached before Fremaining, a UFAI will be created. If after, an FAI will be created.’
This basically is the AI Foom scenario, where the moment an AGI is created, it will either kill us all or bring about utopia (or both).
If this is not the case, and we have a long time to work with the AGI as it develops to make sure it is friendly, then this model isn’t very useful.
If we do accept these assumptions, I would also expect that we will reach Gremaining before Fremaining, or at least that a private organization will end up doing so. However, I am also very skeptical of the power of secrets. I think I find us reaching Fremaining first more likely than a private institution reaching Gremaining first and then hiding it until it later reaches Fremaining, though both probabilities may be very slim. If the US military or a similar group with a huge technological and secrecy advantage were doing this, there could be more of a chance. This definitely seems like a game of optimizing small probabilities.
Either way, I think we would definitely agree here that the organization developing these secrets can strategically choose projects that deliver high amounts of FAI research relative to the amount of AGI research they will have to keep secret. Begin with the easy, non-secretive wins and work from there.
We may need the specific technology to create a paperclip maximizer before we make an FAI, but if we plan correctly, we hopefully will be really close to reaching an FAI by that point.
“This basically is the AI Foom scenario, where the moment an AGI is created, it will either kill us all or bring about utopia (or both).”
The question is not “if”. The questions are “how quickly” and “to what height”. An AI capable of self-improving to world-destroying levels within moments is plainly unrealistic. An AI capable of self-improving to dangerous levels (viz: levels where it can start making humans do the dangerous work for it) in the weeks, months, or even years it would take a team of human operators to cross-examine the formally unspecified motivation engines for Friendliness is dangerously realistic.