True, they’re naturally rare in general. The lottery is a good analogy for the kind of game they prefer: a consolidation, from many to few, and they can play games like that wherever they are.
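To make that concrete with a toy example (assuming a convex utility like u(x) = x², which is not specified in the original exchange): in a winner-take-all lottery where N players each stake one unit and a single winner collects all N, each player's expected utility is

$$\mathbb{E}[u] \;=\; \tfrac{1}{N}\,u(N) + \tfrac{N-1}{N}\,u(0) \;=\; \tfrac{N^{2}}{N} \;=\; N \;>\; 1 \;=\; u(1),$$

so every convex player wants in, even though the game as a whole only moves resources from many hands into one.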
I can’t as easily think of a general argument against a misaligned AI ending up convex though.
Well, if you make a convex misaligned AI, it will play the (metaphorical) lottery over and over until, with probability 99.9999%+, it has no power or resources left whatsoever. The smarter it is, the faster and more efficiently it will reach that outcome.
So unless the RNG gods are truly out to get you, in the long run you are exceedingly unlikely to actually encounter a convex misaligned AI that has accumulated any real amount of power.
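To put toy numbers on that (assuming the agent keeps staking everything on fair double-or-nothing bets, each of which a convex utility like u(x) = x² scores as strictly worth taking): the chance it still holds anything after n such bets is

$$\Pr[\text{anything left after } n \text{ bets}] \;=\; \left(\tfrac{1}{2}\right)^{n}, \qquad \left(\tfrac{1}{2}\right)^{20} \approx 9.5 \times 10^{-7},$$

so each bet doubles its expected utility, yet within about twenty bets it has already crossed the 99.9999% mark for total ruin.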
Mm, on reflection, the Holdout story glossed over the part where the agent had to trade off risk against time to first intersolar launch (in the story, the launch had already happened). I guess they’re unlikely to make it through that stage.

Accelerating cosmological expansion means that we lose, iirc, 6 stars every day we wait before setting out. The convex AGI knows this, so even in its earliest days it’s plotting, trying to find some way to risk it all to get out one second sooner. So I guess what this looks like is the AI saying something totally feverish to its operators to radicalize them as quickly and energetically as possible, messages that’ll tend to get a “what the fuck, this is extremely creepy” reaction 99% of the time.
But I guess I’m still not convinced this is true with such generality that we can stop preparing for that scenario. Situations where you can create an opportunity to gain a lot by risking your life might not be overwhelmingly common, given the inherent tension between the two (usually, safeguarding your life is an instrumental goal), and given that genuinely risking your life is hard to do once you’re a lone superintelligence with many replicas.
Perhaps most goals humans want you to achieve require concave-agent-like behaviors?