I don’t see how your scenario addresses the statement “Taking over the lightcone is the default behavior”. Yes, it’s obvious that you can build an AGI and then destroy it before you turn it on. You can also choose to just not build one at all with no coin flip. There’s also the objection that if you destroy it before you turn it on, have you really created an AGI, or just something that potentially might have been an AGI?
It also doesn’t stop other people from building one. If theirs destroys all human value in the future lightcone by default, then you still have just as big a problem.
I don’t see why all possible ways for an AGI to critically fail at what we built it for must involve taking over the lightcone.
So let’s also blow up the Earth; by that definition, alignment would be solved.