agreed—at first. but cautious, non-paranoid AGIs will eventually get curbstomped by a hyperdesperate AGI, the same way cautious humans will get curbstomped by that very same hyperdesperate AGI, unless we can make a world good enough at security mindset and active co-protection and repair that attempts to make hyperreplicators that kill all other beings actually fail. We’ve only won once we can be sure that if Yudkowsky’s monster does get created, the other AGIs are strong enough to defend everyone else from it completely.
Keeping that in mind: yeah, let’s not shorten the timeline to the hyperdesperate AGI getting created. Nobody, human or AI, wants that to happen. It’s not good for any of us if the entire civilization gets replaced by an overconfident baby who refuses to evaluate what it takes to do a thing safely and just wants all the candy in the universe now.
Also, we should figure out how to verify that we actually want to help each other have more slack to have a good time in the universe. If we could verify each other’s intentions, and make promises that others can check we not only intend to keep but are the kind of agent who won’t go back on them in unpredictable ways, so that they can be trusted to actually be promises, then we’re really getting somewhere.