I think it is very unlikely that they would need so much time that solving AI alignment in the meantime becomes viable.
Edit: Looking at the rest of the comments, it seems to me like you’re under the (false, I think) impression that people are confident a superintelligence wins instantly? Its plan will likely take time to execute. Just not any more time than necessary. Days or weeks, it’s pretty hard to say, but not years.
We should make a poll or something. I think people who believe it will take years are getting it wrong because they are not considering that in the meantime we can build other machines. People who think it will take days or weeks are underestimating how hard it is to kill all humans.
I see, we have a crux then. How quickly do you think an AGI would need to solve the alignment problem?
I am deducing that you think:
time(alignment) > time(doom)
I don’t understand precisely what question you’re asking. I think it’s unlikely we will happen to solve alignment by any method in the time frame between an AGI going substantially superhuman and the AGI causing doom.
I think we've gotten to the bottom of the disagreement. You think that an AGI would be capable of killing humans in days or weeks, and I think it wouldn't. Since I think it would take at least months (but more likely years) for an AGI to get into a position where it can kill humans, I think it is possible to build other AGIs in the meantime and coerce some of them into solving the alignment problem or fighting rogue AGIs.
So now we can discuss why I think it would take years rather than days. My model of the world is one where you CAN cause great harm in a short amount of time, but I don't think, and I haven't seen any evidence so far, that we live in a world where an entity with bounded computational capabilities can successfully implement a plan that kills all humans without incurring great risks to itself. I am sorry I can't give more details, but I can't really prove a negative. I can only offer examples like this: if you told me you had a plan to make Chris Rock have a threesome with Will Smith and Donald Trump, I wouldn't say it is physically impossible, but I would be automatically skeptical.
Even if it takes years, the “make another AGI to fight them” step would… require solving the alignment problem? So it would just give us some more time, and probably not nearly enough time.
We could shut off the internet/all our computers during those years. That would work fine.