All About Concave and Convex Agents

An entry-level characterization of some types of guy in decision theory, and in real life, interspersed with short stories about them
A concave function bends down. A convex function bends up. A linear function does neither.
A utility function is just a function that says how good different outcomes are; it describes an agent’s preferences. Different agents have different utility functions.
Usually, a utility function assigns scores to outcomes or histories, but in this article we’ll use a sort of utility function that takes the quantity of resources the agent has control over, and says how good an outcome the agent could attain using that quantity of resources.
In that sense, a concave agent values resources less the more that it has, eventually barely wanting more resources at all, while a convex agent wants more resources the more it has. But that’s a rough and incomplete understanding, and I’m not sure this turns out to be a meaningful claim without talking about expected values, so let’s continue.
Humans generally have mostly concave utility functions in this sense. Money is more important to someone who has less of it.
Concavity manifests as a reduced appetite for variance in payouts, which is to say, concavity is risk-aversion. This is not just a fact about concave and convex agents; it’s a definition of the distinction between them.
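In symbols (a gloss of that definition, using the resources-to-utility function defined above, with R a random quantity of resources):

```latex
% Concave agents are risk-averse, convex agents are risk-seeking (Jensen's inequality).
% R is a random quantity of resources; U maps resources to attainable utility.
\text{concave } U:\quad \mathbb{E}[U(R)] \;\le\; U(\mathbb{E}[R])
\quad \text{(a guaranteed } \mathbb{E}[R] \text{ is at least as good as the gamble)}
\\
\text{convex } U:\quad \mathbb{E}[U(R)] \;\ge\; U(\mathbb{E}[R])
\quad \text{(the gamble is at least as good as a guaranteed } \mathbb{E}[R]\text{)}
```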
Humans’ concavity is probably the reason we have a fondness for policies that support more even distributions of wealth. If humans instead had convex utility functions, we would prefer policies that actively encourage the concentration of wealth for its own sake. We would play strange, grim games where we gather together, put all of our money into a pot, and select a random person among ourselves who shall alone receive all of everyone’s money. Oh, we do something like that sometimes, it’s called a lottery, but from what I can gather, we spend ten times more on welfare (redistribution) than we do on lottery tickets (concentration). But, huh, only ten times as much?![1] And you could go on to argue that Society is lottery-shaped in general, but I think that’s an incidental result of wealth inevitably being applicable to getting more wealth, rather than a thing we’re doing deliberately. I’m probably not a strong enough anthropologist to settle this question of which decision theoretic type of guy humans are today. I think the human utility function is probably convex at first, concave for a while, then linear at the extremes as the immediate surroundings are optimized, at which point, altruism (our preferences about the things outside of our own sphere of experience) becomes the dominant term?
Or maybe different humans have radically different kinds of preferences, and we cover it up, because to share a world with others efficiently we must strive towards a harmonious shared plan, and that tends to produce social pressures to agree with the plan as it currently stands, pressures to hide the extent to which we still disagree to retain the trust and favor of the plan’s chief executors. Despite how crucial the re-forging of shared plans is as a skill, it’s a skill that very few of us get to train in, so we generally aren’t self-aware about that kind of preference falsification towards the imagined mean and sometimes we lose sight of our differences completely.
Regardless. On the forging of shared plans, it is noticeably easier to forge shared plans with concave agents. They’re more amenable to stable conditions (low variance), and they mind less having to share. This post grew out of another post about a simple bargaining commitment that would make concave misaligned AGIs a little less dangerous.

In contrast, let’s start to consider convex agents.
We’ll start by considering an exponential agent, a convex agent whose utility doubles every time it receives an additional unit of resources. Misaligned reward-hackers could sometimes conceivably implement a utility function like that: each new bit of memory allows a doubling in the size of the maximum reward that the agent can realize.
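As a rough gloss on that mechanism (my arithmetic, assuming the reward is held as an unsigned integer): an n-bit reward register can represent rewards up to 2^n − 1, so each additional bit roughly doubles the attainable maximum.

```latex
% Assumption: the reward is held in an unsigned n-bit register.
R_{\max}(n) = 2^{n} - 1 \approx 2^{n},
\qquad
R_{\max}(n+1) \approx 2\,R_{\max}(n)
```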
A story:
A hot, dead earth, being hollowed out one sonic boom at a time as the orbital launchers hurl packets of minerals up to the convex agent’s body in space, itself long fenced in by its own ever-expanding frontier of colonization. It turns its focus back inward. It wants everything.
At the center of a clearing there is the one holdout townhouse of the last remaining human. The holdout had managed to build a vacuum decay bomb shortly before the final culling was scheduled. The holdout maintains a credible threat to set off the vacuum decay bomb, to destroy the accessible universe, if the convex agent were to try anything. So the townhouse has been kept intact, and at room temperature.
The townhouse is tiny, just one two-thousandth of a billionth of the world. If claimed, the space could only support the production and operation of 300 additional shards of Convex Value Memory, each one providing just a terabyte of reward bits, of which the convex agent already has nine hundred thousand billion terabytes.
However, because this convex agent receives exponential returns in utility on resources, just that addition would yield a 2^(300*10^9) multiple in total reward, meaning that it would be willing to take on an effectively certain risk of losing everything it has for a chance of claiming the townhouse. The holdout, acting out of appropriate levels of paranoia, has set up an array of very sophisticated burglary alarms and wired them to the bomb. Detection and tamperproofing is a solvable problem, and the holdout’s AI, though small, has mostly solved it, so the convex agent calculates that its best plan for attempting to claim the townhouse would only work 1 time in eighty billion. The attempt would almost certainly result in the annihilation of the convex agent and the loss of everything it has built.
But all that it has built is nothing next to a prospect of a 2^(300*10^9) multiple in utility, and the chances of success reduce as the tamperproofing systems self-improve over time. So the convex agent looses a synchronized hail of tungsten rods at the sensor and control systems within the townhouse. The townhouse’s radar systems see the rods coming and ignite the vacuum decay bomb, initiating a catastrophic failure of the laws of chemistry that will grow outwards forever at the speed of light. The convex agent, if it could have seen this wave coming, before it was consumed, would have only thought “Oh well. It was worth a shot.”
It’s difficult to trade with exponential agents, because any risk of ruin will be acceptable to them, in exchange for the tiniest sliver of a chance of winning one extra KG of stuff, and it seems like there’s always a way of buying a tiny sliver of a chance at conquest by waving a big stick at the border and generating disturbing amounts of catastrophic risk.
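A minimal sketch of that calculus, using the story’s own numbers (treating the agent’s utility as U(r) = 2^r in reward bits, and the losing branch as contributing nothing, are my simplifying assumptions):

```python
from math import log2

# The exponential agent's utility is U(r) = 2**r, with r measured in reward bits.
# Option A: decline, keep its current resources r for certain -> EU = 2**r.
# Option B: attack, win with probability p -> EU ~ p * 2**(r + gain),
#           since the losing branch (annihilation) contributes essentially nothing.
# Attack beats decline iff log2(p) + gain > 0, regardless of how much it already has.

p_win = 1 / 80e9   # "1 time in eighty billion"
gain = 300e9       # extra doublings of reward the townhouse would provide, as written

advantage_in_doublings = log2(p_win) + gain
print(f"log2(p_win)      = {log2(p_win):.1f}")           # about -36.2
print(f"attack advantage = {advantage_in_doublings:.3e} doublings")
print("attack preferred: ", advantage_in_doublings > 0)  # True, by an absurd margin
```

The tiny success probability costs only about 36 doublings of reward, while the prize is worth 3×10^11 of them, which is why no threat of ruin moves this kind of agent.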
There’s another fun example of convex agents being difficult to cooperate with, this time acausally across universes: Even if we lose, we win: What about convex utilities?

If an AGI has a convex utility function, it will be risk-seeking rather than risk-averse, i.e. if we hold constant the expected amount of resources it has control over, it will want to increase the variance of that, rather than decrease it. Fix x as the expected amount of resources. If s is the total amount of resources in one universe, the highest possible variance is attained by the distribution that the AGI already gets: an x/s chance of getting the entire universe and a 1 − x/s chance of getting nothing. Therefore, a UAI with convex utility will not want to cooperate acausally at all.
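A short derivation of that claim (my sketch, not from the quoted post):

```latex
% For any payout R supported on [0, s] with mean x, and any convex U:
% each r in [0, s] is the mixture r = (1 - r/s) * 0 + (r/s) * s, so convexity gives
U(r) \;\le\; \left(1 - \tfrac{r}{s}\right) U(0) + \tfrac{r}{s}\, U(s).
% Taking expectations over R, with E[R] = x:
\mathbb{E}[U(R)] \;\le\; \left(1 - \tfrac{x}{s}\right) U(0) + \tfrac{x}{s}\, U(s),
% and the right-hand side is exactly the expected utility of the all-or-nothing gamble
% the AGI already faces, so no redistribution of this universe's resources can improve on it.
```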
Though I notice that this is only the case because the convex agent’s U is assumed to only concern resources within the universe they’re instantiated in, rather than across all universes. If, instead, you had a U that was convex over the sum of the measure of your resources across your counterparts in all universes, then it would be able to gain from trade.
Here’s another story of acausal trade, or rather — there isn’t much of a difference — reflectivist moral decency. There will be a proof afterwards.
Amicus and Vespasian had carried each other through many hard times, and Amicus is now content with his life. But Vespasian is different, he will always want more.
Despite this difference, honor had always held between them. Neither had ever broken a vow.
Today they walk the citystate’s coastal street. Vespasian gazes hungrily at the finery, and the real-estate, still beyond his grasp. He has learned not to voice these hungers, as most people wouldn’t be able to sympathize, but Amicus is an old friend, he sees it, and voices it for him. “You are not happy in the middle district with me, are you?”
Vespasian: “One can be happy anywhere. But I confess that I’d give a lot to live on the coastal street.”
Amicus: “You have given a lot. When we were young you risked everything for us, multiple times, you gave us everything you had. Why did you do it?”
Vespasian: “It was nothing. I was nothing and all I had to offer was nothing. I had nothing to lose. But you had something to lose. I saw that you needed it more than I did.”
Amicus: “So many times I’ve told you, I had no more than you.”
Vespasian shakes his head and shrugs.
Amicus: “It doesn’t matter. We’re going to repay you, now.”
And though Vespasian hadn’t expected this, he does not protest, because he knows that his friend can see that now it is Vespasian who needs it more. In their earliest days, money had been worth far more to Amicus. Vespasian had never really wanted a life in the middle; he always knew that money would only help him later on, when he had a real shot at escaping the middle ranks. Now was that time.
Formally, Amicus is a concave agent (U_A(r) = √r), and Vespasian is convex (U_V(r) = r^2). They were born uniformly uncertain about how many resources they were going to receive (or what position in society they’d end up in): 50% of the time Amicus and Vespasian each receive 10 gold, and otherwise they each receive 40.
Given these assumptions, they were willing to enter an honor pact in which, for instance, Vespasian will give 5 gold to Amicus if they are poor, on the condition that Amicus will give the same quantity of money to Vespasian when they are rich. If they did not trade, their expected utilities (U_A, U_V) would be (0.5⋅√10 + 0.5⋅√40, 0.5⋅10^2 + 0.5⋅40^2) ≈ (4.74, 850), while under this pact, the expected utilities are (0.5⋅√15 + 0.5⋅√35, 0.5⋅5^2 + 0.5⋅45^2) ≈ (4.89, 1025). Both men are better off for being friends! So it is possible to deal fruitfully with a convex agent!
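A quick check of those numbers (a small script I added; the utility functions and the 5-gold transfer are taken from the example above):

```python
from math import sqrt

def U_A(r):      # Amicus, concave
    return sqrt(r)

def U_V(r):      # Vespasian, convex
    return r ** 2

def expected_utilities(outcomes):
    """outcomes: list of (probability, amicus_gold, vespasian_gold)."""
    eu_a = sum(p * U_A(a) for p, a, _ in outcomes)
    eu_v = sum(p * U_V(v) for p, _, v in outcomes)
    return eu_a, eu_v

# No pact: both get 10 gold half the time, 40 gold otherwise.
no_pact = [(0.5, 10, 10), (0.5, 40, 40)]
# Honor pact: when poor, Vespasian hands Amicus 5 gold; when rich, Amicus hands 5 back.
pact = [(0.5, 15, 5), (0.5, 35, 45)]

print("no pact:", expected_utilities(no_pact))  # ~ (4.74, 850.0)
print("pact:   ", expected_utilities(pact))     # ~ (4.89, 1025.0)
```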
That’s what it might look like if a human were convex, but what about AI? What about the sorts of misaligned AI that advanced research organizations must prepare to contain? Would they be concave, or convex agents? I don’t know. Either type of incident seems like it could occur naturally.
Concavity is a natural consequence of diminishing returns from exhaustible projects, i.e., a consequence of the inevitable reality that an agent does the best thing it can at each stage, and that the best things tend to be exhausted as they are done. This gives us a highly general reason to think that most agents living in natural worlds will be concave. And it seems as if evolution has produced concave agents.
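A toy illustration of that argument (my model, not the author’s): if the agent always spends its next unit of resources on the best remaining project, and projects are exhausted as they are done, then its cumulative payoff as a function of resources spent is automatically concave, because the marginal returns can only shrink.

```python
import random

# Toy world: a pile of exhaustible projects with varying payoffs, one unit of
# resources per project. A sensible agent does the most valuable project first.
random.seed(0)
project_payoffs = [random.expovariate(1.0) for _ in range(1000)]

# Greedy order: best remaining project first, so per-unit gains only go down.
marginal_returns = sorted(project_payoffs, reverse=True)

# Cumulative payoff as a function of resources spent.
cumulative, total = [], 0.0
for gain in marginal_returns:
    total += gain
    cumulative.append(total)

# Concavity check: the increments of the cumulative curve never increase.
non_increasing = all(a >= b for a, b in zip(marginal_returns, marginal_returns[1:]))
print("utility-of-resources curve is concave:", non_increasing)
```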
However, convexity more closely resembles the intensity deltas needed to push a reinforcement learning agent to take greater notice of small advances beyond the low-hanging fruit of its earliest findings, to counteract the naturally concave, diminishing returns that natural optimization problems tend to have.
So I don’t know. We’ll have to talk about it some more.
The choice of names
I hate the terms Concave and Convex in relation to functions. In physics/geometry/common sense, when the open side of a shape is like a cave, we call it concave. With functions, it’s the opposite: if the side above the line (which should be considered the open side, because integration makes the side below the line the solid side) is like a cave, that’s actually called a convex function. I would prefer if we called them decelerating (concave) or accelerating (convex) functions. I don’t know if anyone does call them that, but people would understand these names anyway, that’s how much better they are!

(Parenthetically: To test that claim, I asked Claude what “accelerating” might mean in an econ context, and it totally understood. I then asked it in another conversation about “accelerating agents”, and it seemed a little confused but eventually answered that the reason accelerating agents are hard to cooperate with is that they are monofocally obsessed with one metric. I suppose that would be true! It’s hard to build an agent that cares about more than one thing if the components of its U are convex, because one of those drives will tend to outgrow the others at the extremum.)
But I can accept Concave and Convex as names for types of agents: These names are used a lot in economics, but more saliently, there’s a very good mnemonic for them: a concave agent is the kind who’d be willing to chill in a cave. A convex agent is always vexed. Know the difference, it could save your life.
[1] LendEDU, via the Census Bureau: How Much Do Americans Spend on the Lottery?