Thankfully, almost all of the time the convex agents end up destroying themselves by taking insane risks to concentrate their resources into infinitesimally likely worlds, so you will almost never have to barter with a powerful one.
(why not just call them risk seeking / risk averse agents instead of convex/concave?)
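A quick gloss on why those labels line up, at least in the simplest framing (this is just Jensen's inequality, not anything specific to this thread): for any lottery $X$ over resources,

$$\mathbb{E}[u(X)] \;\ge\; u(\mathbb{E}[X]) \quad \text{for convex } u, \qquad \mathbb{E}[u(X)] \;\le\; u(\mathbb{E}[X]) \quad \text{for concave } u,$$

so a convex agent weakly prefers any gamble to receiving its expected value for sure (risk seeking), and a concave agent weakly prefers the sure thing (risk averse). The reply below is about why actual convex-agent behavior is messier than this simple picture suggests.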
I see the intuition here, but I think the actual answer on how convex agents behave is pretty messy and complicated for a few reasons:
Convex agents might nonetheless act as though resources are bounded. This could be because they assign sufficiently high probability to literally bounded universes, or because they think that value should be related to some sort of (bounded) measure. (A tiny worked version of this point is sketched just after this list.)
More generally, you can usually only play lotteries if there is some agent to play a lottery with. If the agent can secure all the resources that would otherwise be owned by all other agents, then there isn’t any need for (further) lotteries. (And you might expect convex agents to maximize the probability of this sort of outcome.)
Convex agents (and even linear agents) might be dominated by some possibility of novel physics or a similarly large breakthrough opening up massive amounts of resources. In this case, it’s possible that the optimal move is something like securing enough R&D potential (e.g. a few galaxies of resources) that you’re well past diminishing returns on hitting this, but you don’t necessarily otherwise play lotteries. (It’s a bit messier if there is competition for the resources opened up by novel physics.)
Infinite ethics: if infinite resources are on the table at all, expected-utility comparisons for unbounded (convex or linear) utilities stop being well behaved.
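A tiny worked version of the bounded-resources point above, with illustrative numbers rather than anything from the original discussion: take a convex utility $u(x) = x^2$, current holdings $w$, and a hard cap $M$ on how much the agent could ever hold. A gamble that pays $M$ with probability $p$ and $0$ otherwise beats standing pat only when

$$p \cdot M^2 > w^2 \quad\Longleftrightarrow\quad p > (w/M)^2,$$

so once the agent holds any appreciable fraction of the cap, it stops chasing arbitrarily small probabilities. Without a bound, every $p > 0$ is worth chasing for a large enough payoff, which is what drives the “concentrate everything into infinitesimally likely worlds” behavior described above.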
True, they’re naturally rare in general. The lottery game is a good analogy for the kinds of games they prefer: a consolidation from many to few, and they can play these sorts of games wherever they are.
I can’t as easily think of a general argument against a misaligned AI ending up convex though.
Well, if you make a convex misaligned AI, it will play the (metaphorical) lottery over and over again until, 99.9999%+ of the time, it has no power or resources left whatsoever. The smarter it is, the faster and more efficiently it will achieve this outcome.
So unless the RNG gods are truly out to get you, in the long run you are exceedingly unlikely to actually encounter a convex misaligned AI that has accumulated any real amount of power.
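Here’s a minimal simulation of that dynamic, as a sketch only (the quadratic utility and fair double-or-nothing bets are illustrative assumptions, not anything specified in the thread):

```python
import random

# Toy model: an agent with convex utility u(x) = x^2 is offered fair
# double-or-nothing bets on its entire stake. Since
# 0.5 * u(2x) = 2 * x^2 > x^2 = u(x), it accepts every time.

def utility(x):
    return x ** 2

def accepts_bet(wealth):
    # Expected utility of betting everything vs. standing pat.
    expected_if_bet = 0.5 * utility(2 * wealth) + 0.5 * utility(0)
    return expected_if_bet > utility(wealth)

def run_agent(initial_wealth=1.0, rounds=20):
    wealth = initial_wealth
    for _ in range(rounds):
        if wealth == 0 or not accepts_bet(wealth):
            break
        wealth = 2 * wealth if random.random() < 0.5 else 0.0
    return wealth

if __name__ == "__main__":
    trials = 1_000_000
    survivors = sum(1 for _ in range(trials) if run_agent() > 0)
    print(f"kept any resources after 20 rounds: {survivors}/{trials}")
```

With 20 rounds the survival probability is $2^{-20}$, roughly one in a million, which is where a “99.9999%+” figure comes from in this toy setting.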
Mm on reflection, the Holdout story glossed over the part where the agent had to trade off risk against time to first intersolar launch (launch had already happened). I guess they’re unlikely to make it through that stage.
Accelerating cosmological expansion means that we lose, iirc, 6 stars every day we wait before setting out. The convex AGI knows this, so even in its earliest days it’s plotting and trying to find some way to risk it all to get out one second sooner. So I guess what this looks like is that it says something totally feverish to its operators to radicalize them as quickly and energetically as possible, messages that’ll tend to result in a “what the fuck, this is extremely creepy” reaction 99% of the time.
But I guess I’m still not convinced this is true with such generality that we can stop preparing for that scenario. Situations where you can create an opportunity to gain a lot by risking your life might not be overwhelmingly common, given the inherent tension between those things (usually, safeguarding your life is an instrumental goal), and given that risking your life is difficult to do once you’re a lone superintelligence with many replicas.
Perhaps most goals humans want you to achieve require concave-agent-like behavior?
Any SBF enjoyers?