Worked Examples of Shapley Values
Three times in the past month, I’ve run across occasions where Shapley values were mentioned or would have been useful. There are a couple of good explainers of Shapley value already on the internet, but what most of them lack is a bunch of worked examples. I find that many times, the limiting factor in my ability to understand a concept is whether or not I’ve been exposed to multiple examples—so this post is where I’ll be posting some of the examples I worked through to understand how Shapley value works.
I’ll start with a brief overview of Shapley value, but first, a caveat. I think it’s fine to skim most of the math in the overview. We won’t really be using it that much, and I try to give an intuitive sense of what the equations mean before I present them. You could probably skip the entire overview and just go straight to the examples and not miss much.
Overview of Shapley Value
Suppose you have a group of people who work together to produce some sort of value (traditionally profit). How should you divide the credit for that value (or the actual profit) up among the individuals involved? One immediate suggestion might be “equally”, but that doesn’t necessarily satisfy intuitive notions of fairness. If a doctor performs life-saving surgery, should they receive equal credit for saving someone’s life as the random person holding the door open for the doctor at the end of the surgery? If you’re running a restaurant, and three servers spend all night running around serving people, and one server spends two hours on a smoke break, do all four deserve an equal amount of pay?
Shapley value provides a different way of computing what a fair division of value should be. Why use Shapley value? Well, like splitting things equally, it (a) divides all of the gains among all of the participants, (b) splits things equally between participants who contribute the same value, and (c) when two completely independent value-producing processes (“games”) are combined, assigns each participant the sum of what they would be assigned in each game separately[1]. Unlike splitting things equally, it also (d) assigns no value to anyone who always contributes nothing, and in fact, it is the only assignment rule which satisfies all four of those constraints.
The formulas for computing the Shapley values can be found on its Wikipedia page.
The simplest way of computing Shapley value, the one we’ll be using for most of this post, is to consider the synergy of a group. (Here we use synergy in the original sense, not in the buzzword-ified sense. If you’ve never heard it used as anything besides a buzzword, it means “the value of a group that is greater than the sum of its parts”.) We’ll follow Wikipedia’s convention and use v(S) to notate the value produced by a set S, and w(S) to notate the synergy of some set of participants S. I’ll break with Wikipedia’s convention and use V(i) to indicate the Shapley value assigned to participant i (or subgroup R).
(Slightly obscure mathematical notation explainer footnote: [2])
The equation for synergy is

$$w(S) = \sum_{R \subseteq S} (-1)^{|S| - |R|}\, v(R)$$

but I think in practice, it’s easy for many questions to simply determine the synergy of a group by asking, “what value does this group provide that isn’t already accounted for by smaller groups contained within this one?”
Once we have the synergy, we can find the Shapley value of each participant by considering the synergy of each group and dividing the synergy equally among participants. (In a way, we’ve gone one level up; instead of dividing the value produced equally, we divide the synergy equally.) Here it is, written in equation form, with N standing for the set of all participants:

$$V(i) = \sum_{R \subseteq N} 1_R(i)\, \frac{w(R)}{|R|}$$

but I think that as we get into the examples, it will become much clearer.
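To make the recipe concrete, here is a minimal Python sketch of the synergy approach. It assumes the recursive reading of synergy (a group's value minus the synergy already credited to its strictly smaller subgroups); the function names and the three-worker toy factory are illustrative choices, not taken from the author's program.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(players):
    """Yield every non-empty subset of players as a frozenset, smallest first."""
    players = list(players)
    for size in range(1, len(players) + 1):
        for combo in combinations(players, size):
            yield frozenset(combo)


def synergies(players, v):
    """Synergy w(S): the value of S not already credited to smaller subsets."""
    w = {}
    for s in nonempty_subsets(players):  # smaller sets are always filled in first
        already_credited = sum(w[r] for r in nonempty_subsets(s) if r != s)
        w[s] = v(s) - already_credited
    return w


def shapley_from_synergy(players, v):
    """Split each group's synergy equally among that group's members."""
    w = synergies(players, v)
    return {
        i: sum(Fraction(w_s, len(s)) for s, w_s in w.items() if i in s)
        for i in players
    }


# Toy game (hypothetical numbers): one owner, three workers, $500 per worker,
# and nothing is produced unless the owner is in the group.
players = ["owner", "worker1", "worker2", "worker3"]


def v(group):
    return 500 * (len(group) - 1) if "owner" in group else 0


values = shapley_from_synergy(players, v)
for name, value in values.items():
    print(name, float(value))
```

This enumerates every subset of every subset, so it is only practical for a handful of participants; it is meant to mirror the definition rather than to be fast.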
Examples
Simple Factory Business
The most basic example, lifted straight from the Wikipedia page, is the “business” example. It works like this: suppose a group of workers and a factory owner want to start a company. The factory owner can provide a factory, but sitting empty, it produces nothing. In equation form,

$$v(\{\text{owner}\}) = 0$$

Likewise, a group of workers on their own, with no factory to work in, can make nothing. For any group of k workers,

$$v(\{\text{worker}_1, \ldots, \text{worker}_k\}) = 0$$

However, when the workers and the owner work together, each worker can use the factory to make some amount of profit p:

$$v(\{\text{owner}, \text{worker}_1, \ldots, \text{worker}_k\}) = k\,p$$

If we think about the synergy of this arrangement, the workers have no synergy with each other; a group of workers doesn’t provide anything more than exactly that many workers working alone, either with or without the owner involved. But there is synergy between each worker and the owner. That synergy is the profit each worker can make by working in the factory.

$$w(\{\text{owner}, \text{worker}_i\}) = p$$
(The synergy of all other subsets is 0.)
From this, we can determine the Shapley value for each worker is half of the profit they produce, because we equally split the synergy p above between the owner and the worker. The Shapley value for the owner is half the profit of each worker, times the number of workers.
Plugging in some numbers, let’s say concretely that there are 10 workers, and each one can produce $500 of profit a day by working in the factory.
In that case, each worker keeps $250 per day, and the owner keeps $2,500 each day.
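As a cross-check on the $250 and $2,500 figures, the sketch below computes the same split without using synergy at all. It uses the standard weighted-marginal-contribution form of the Shapley value (the one given on the Wikipedia page); the `shapley` helper and the player names are illustrative, not the author's program.

```python
from itertools import combinations
from math import factorial


def shapley(players, v):
    """Shapley value via weighted marginal contributions over all coalitions."""
    n = len(players)
    result = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):
            # Probability that exactly this coalition precedes i in a random ordering.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                before = frozenset(coalition)
                total += weight * (v(before | {i}) - v(before))
        result[i] = total
    return result


N_WORKERS = 10
PROFIT = 500  # dollars per worker per day

players = ["owner"] + [f"worker{k}" for k in range(N_WORKERS)]


def v(group):
    """Profit of a group: $500 per worker, but only if the owner is present."""
    return PROFIT * (len(group) - 1) if "owner" in group else 0


values = shapley(players, v)
print(round(values["owner"], 2))    # owner's share
print(round(values["worker0"], 2))  # each worker's share
```

Both routes should agree, since a worker adds $500 exactly when the owner is already in the group, which happens in half of all orderings.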
A lot of the examples of Shapley value I’ve seen online stop there, or go into more obscure applications. But I want to dig deeper, and un-sphereify this cow. The above example makes some strange assumptions, which can go unnoticed at first glance.
For example, it supposes that the owner would rather make no money than work at their own factory. It’s reasonable to suppose that once the owner makes $2,500 a day from the capital, they wouldn’t bother laboring as well, but that’s not a reasonable assumption if no one else is working at the factory.[3]
Additionally, the above states that each worker has no other available options for producing any profit, but in most cases, the workers would have other options—they could go work a worse job. Once again, the details of the other jobs available affect the Shapley value, just like the owner’s option to work in the factory does.
Workers Have Other Jobs
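If we suppose instead that each worker can produce x value each day by working alone, doing something that doesn’t require any capital, that massively changes the synergy equation. Now, instead of

$$w(\{\text{worker}_i\}) = 0, \qquad w(\{\text{owner}, \text{worker}_i\}) = p$$

we have

$$w(\{\text{worker}_i\}) = x, \qquad w(\{\text{owner}, \text{worker}_i\}) = p - x$$

(The value of w({owner, worker}) changes because there’s less synergy, even though the value produced is the same.)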
Now, when we add up the synergy for each worker, they keep ALL of the synergy from working alone, plus half of the synergy of working at the factory, for a total of x + (p-x)/2 = p/2 + x/2. This is true even if x is less than half of p; even though few workers would choose to make less, just having it as an option changes the Shapley value of the participant.
It also drives the owner’s total take down, from np/2 to n(p-x)/2. This effectively creates a sliding scale: if the value of a worker’s output at the factory stays the same, then the more each worker could make without the factory, the larger the share of that output the worker is allocated (at least, according to Shapley value).
Plugging in numbers from a few possible scenarios, using the same baseline of $500 profit produced per worker at the factory and 10 total workers, as above:
| Solo Worker Value | Worker at Factory Shapley Value | Owner at Factory Shapley Value |
| --- | --- | --- |
| $10 | $255 | $2450 |
| $100 | $300 | $2000 |
| $200 | $350 | $1500 |
| $300 | $400 | $1000 |
| $400 | $450 | $500 |
| ~$409.09 | ~$454.55 | ~$454.55 |
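The table follows directly from the two closed forms above, p/2 + x/2 for a worker and n(p-x)/2 for the owner. The sketch below recomputes each row and also solves for the solo value at which a worker and the owner receive the same amount, which is where the ~$409.09 row comes from; the function names are illustrative.

```python
N_WORKERS = 10
PROFIT = 500.0  # p: profit per worker at the factory


def worker_share(p, x):
    """All of the solo synergy x, plus half of the factory synergy (p - x)."""
    return x + (p - x) / 2


def owner_share(n, p, x):
    """Half of the factory synergy (p - x) from each of the n workers."""
    return n * (p - x) / 2


def break_even_solo_value(n, p):
    """Solo value x at which worker_share equals owner_share."""
    # x + (p - x)/2 = n(p - x)/2  =>  x = p(n - 1)/(n + 1)
    return p * (n - 1) / (n + 1)


rows = []
for x in [10, 100, 200, 300, 400, break_even_solo_value(N_WORKERS, PROFIT)]:
    rows.append((x, worker_share(PROFIT, x), owner_share(N_WORKERS, PROFIT, x)))
    print(f"${x:.2f}  worker ${rows[-1][1]:.2f}  owner ${rows[-1][2]:.2f}")
```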
There are some more examples I want to explore, but they start to get complicated enough that I don’t trust myself to get the algebra correct. Instead, I’m going to switch to using a computer program to compute the Shapley value for the participants. You can see and run the program I’ve written here, but it’s not required to understand what happens in the scenarios below.
Minimum Number of Workers
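What happens if we say that the factory requires a minimum of two workers before any profit is generated, but besides that everything stays the same as the initial problem? In this case, we have

$$v(S) = \begin{cases} k\,p & \text{if the owner is in } S \text{ and } k \ge 2 \\ 0 & \text{otherwise} \end{cases} \qquad \text{where } k \text{ is the number of workers in } S$$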
for the value function. Using the concrete values of 10 workers and $500 profit per worker, we get $2454.55 in value for the owner and $254.55 for each worker.
What happens as we scale up the minimum number of workers needed to work?
| Minimum Number of Workers Needed for Operation | Shapley Value of Worker | Shapley Value of Owner |
| --- | --- | --- |
| 1 | $250 | $2500 |
| 2 | $254.55 | $2454.55 |
| 3 | $263.64 | $2363.64 |
| 4 | $277.27 | $2227.27 |
| 5 | $295.45 | $2045.45 |
| 6 | $318.18 | $1818.18 |
| 7 | $345.45 | $1545.45 |
| 8 | $377.27 | $1227.27 |
| 9 | $413.64 | $863.64 |
| 10 | $454.55 | $454.55 |
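These numbers can be reproduced without enumerating coalitions, using a symmetry shortcut. The sketch below assumes 10 workers and $500 per worker (the values that match the table) and relies on two facts: the owner is equally likely to be preceded by 0 through 10 workers in a random ordering, and the interchangeable workers split whatever is left equally. The function names are illustrative, not the author's program.

```python
N_WORKERS = 10
PROFIT = 500.0


def factory_value(workers_present, minimum):
    """Profit when the owner is present; zero below the minimum headcount."""
    return PROFIT * workers_present if workers_present >= minimum else 0.0


def owner_share(minimum):
    # In a random ordering, the number of workers ahead of the owner is equally
    # likely to be 0, 1, ..., N_WORKERS, and no group earns anything without the
    # owner, so the owner's marginal contribution is the factory value at that count.
    counts = range(N_WORKERS + 1)
    return sum(factory_value(k, minimum) for k in counts) / (N_WORKERS + 1)


def worker_share(minimum):
    # The workers are interchangeable, so they split what is left equally.
    total = factory_value(N_WORKERS, minimum)
    return (total - owner_share(minimum)) / N_WORKERS


table = {m: (worker_share(m), owner_share(m)) for m in range(1, N_WORKERS + 1)}
for m, (worker, owner) in table.items():
    print(f"{m:2d}  worker ${worker:.2f}  owner ${owner:.2f}")
```

Seen this way, raising the minimum simply zeroes out the orderings in which the owner arrives before enough workers are present, which is why the owner's share falls as the threshold rises.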
Manager or CEO
What about management? Suppose we have a manager/CEO in addition to the owner and the workers. Without the CEO, the factory runs normally, producing $500 of profit per worker. With the CEO, the factory is more efficient and produces $600 of profit per worker instead. Plugging this into our program, we can see that the owner receives $2833.33 of the profit, the CEO receives $333.33 of the profit, and each worker receives $283.33 of the profit.
This is sort of like the “workers have other jobs available” scenario, in that we can adjust the amount added by the presence of the CEO to see how wages adjust:
| CEO Value Add | Worker Shapley Value | Owner Shapley Value | CEO Shapley Value |
| --- | --- | --- | --- |
| $0 | $250 | $2500 | $0 |
| $10 | $253.33 | $2533.33 | $33.33 |
| $100 | $283.33 | $2833.33 | $333.33 |
| $500 | $416.67 | $4166.67 | $1666.67 |
| $1000 | $583.33 | $5833.33 | $3333.33 |
Interestingly, under this model, there’s no value for which the CEO generates more Shapley value than the Owner, even if you posit that the presence of the CEO increases the profit per worker by 100 times.
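The table can be reproduced from synergies alone, under the assumption (mine, though it matches the table's numbers) that the CEO adds value only when the owner is also present. Then each worker has a $500 synergy with the owner, split two ways, and an extra synergy equal to the CEO's value add with the owner and the CEO together, split three ways. The sketch below uses exact fractions; `shares` is an illustrative name.

```python
from fractions import Fraction

N_WORKERS = 10
PROFIT = 500  # profit per worker without the CEO


def shares(ceo_add):
    """Shapley values (worker, owner, CEO) for a given CEO value add per worker."""
    pair = Fraction(PROFIT, 2)    # owner+worker synergy, split two ways
    trio = Fraction(ceo_add, 3)   # owner+CEO+worker synergy, split three ways
    worker = pair + trio
    owner = N_WORKERS * (pair + trio)
    ceo = N_WORKERS * trio
    return worker, owner, ceo


results = {add: shares(add) for add in [0, 10, 100, 500, 1000]}
for add, (worker, owner, ceo) in results.items():
    print(f"${add}: worker ${float(worker):.2f}, "
          f"owner ${float(owner):.2f}, CEO ${float(ceo):.2f}")
```

Under this assumption the owner's share minus the CEO's share is always 10 × $250 = $2,500, whatever the value add, which is why the CEO never overtakes the owner.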
Maximum Number of Workers
What if we posit a maximum number of workers? Suppose that there are 10 assembly stations in the factory, but 11 available workers (and everything else is equivalent to the initial scenario). In this case, there’s no clear way to determine which workers get to work and which don’t have a job, but if we compute the Shapley value, we can see that the addition of an unemployable worker causes the owner’s Shapley value to rise to $2708.33 (from $2500), and the worker’s Shapley values to drop to $208.33 (from $250).
This is strange to me, in the way that something just on the edge of the familiar is strange. It rhymes with the notion that in a free market, an increase in supply lowers the price (demand being constant), but in this case, we have a worker who does not work, who contributes no actual value but who still is assigned Shapley value due to counterfactual value. It also rhymes with unemployment welfare, in the sense that even those who cannot work (due to a lack of jobs) are given some money, but in this case, they’re given the same pay as their employed peers, and the money comes primarily from their peers’ pay and causes an increase in the owner’s pay.
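The $2708.33 and $208.33 figures can be reproduced with a short symmetry argument: the owner is equally likely to arrive after any number of workers, and only the first 10 of those workers add profit. A minimal sketch, with illustrative names and the post's numbers (10 stations, $500 per worker):

```python
STATIONS = 10
PROFIT = 500.0


def factory_value(workers_present):
    """Profit with the owner present: only STATIONS workers can be productive."""
    return PROFIT * min(workers_present, STATIONS)


def shapley_shares(n_workers):
    """Return (worker_share, owner_share) for n_workers interchangeable workers."""
    # The owner is equally likely to arrive after 0..n_workers workers, and
    # worker-only groups earn nothing, so average the factory value over counts.
    owner = sum(factory_value(k) for k in range(n_workers + 1)) / (n_workers + 1)
    # Every worker, employed or not, gets an equal slice of the remainder.
    worker = (factory_value(n_workers) - owner) / n_workers
    return worker, owner


worker_10, owner_10 = shapley_shares(10)
worker_11, owner_11 = shapley_shares(11)
print(f"10 workers: worker ${worker_10:.2f}, owner ${owner_10:.2f}")
print(f"11 workers: worker ${worker_11:.2f}, owner ${owner_11:.2f}")
```

The extra worker raises the owner's average because there is now one more ordering position in which the owner arrives to a fully staffed factory, while the same $5,000 total has to be shared among eleven workers instead of ten.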
Takeaways
There are a couple of other examples I might run through my program at some other time, and feel free to comment and request the Shapley values for a particular arrangement. But for now, I think I’ve gotten a good enough feel for how Shapley value changes in some common circumstances.
What practical takeaways am I getting from this write-up? Well, if we interpret the Shapley value as the way people should be paid when working as companies, it implies:
- When your company hires someone in your management chain, if they’re paying them more than the previous manager (or adding a new manager), you also deserve a pay raise. (And the reverse is true!)
- If you’re trying to hire two people for the same position, and you can’t decide who to hire, you should pay them both the same amount, even if you only hire one person.
- Even if you’re in the best-paying job available to you, new opportunities below your current salary should result in you being paid more at your current job.
Or, perhaps the takeaway is that Shapley value is not a great way of actually distributing value in a corporate environment. One of the weird things about the above experiments is how there is some sort of spooky action at a distance in Shapley value; changing circumstances that don’t change the total value generated can change how that value is distributed. It’s entirely possible that measuring the Shapley value of employees at a corporation is significantly different than measuring the Shapley value of the same person inside of the greater economy.
[1] This is a bit of a simplification: it’s really just linearity, which implies a few additional things, but I think this gets at the core of why we want linearity.
[2] $\sum_{R \subseteq S} f(R)$ means “for each subset of S, which we name R, compute f(R), and sum together the results”. $1_R(x)$ is notation for the indicator function, which returns 1 if x ∈ R, and 0 otherwise. $|S|$ is the notation for the cardinality of (the number of elements in) a set.
[3] The scenario where the owner always works is much simpler, and not super interesting, but is a good exercise to do to make sure you’ve understood the algebra, so I’ll refrain from doing that one. Solution: The owner gets to keep all of the extra profit from their own work, and it doesn’t impact anyone else.
I think the unemployed worker example is the following: If I’m an unemployed worker, I tell the owner I’ll work for $X-1 dollars rather than the current worker who is working for $X dollars. The owner accepts, of course. So to keep that from happening, the worker offers me $Y to not work and let him keep the job. I go around to all the other workers and make the same argument, getting Y from each. At the end of the day, voilà: I’m making 10Y dollars to not work and all the workers are making X − Y dollars, and I’ll refuse any Y up until 10Y = X − Y and we make the same amount (otherwise I’ll undercut them).
That’s not enough to replicate the result since it doesn’t prescribe the value the owner gets.
Presumably in real life we can somehow include the negative term for the downsides of employment and the unemployed will accept a lower pay? Also, it’s interesting to think of unemployment benefits as “paying people to not compete with me in the job market” but I guess that kind of makes sense.
I think it’s not necessarily the case that free-market pairwise bargaining always leads to the Shapley value. 10Y = X − Y has an infinite number of solutions, and the only principled ways I know of for choosing a solution are either Shapley value or the fact that in this scenario, since there are no other jobs, the owner should be able to negotiate X and Y down to epsilon.
It looks like Shapley values satisfy an equilibrium property that should take into account more than just pairwise bargaining. Specifically, there is no subset of participants that can gain more than the Shapley values by excluding the others (assuming that v satisfies [superadditivity](https://en.wikipedia.org/wiki/Shapley_value#Stand-alone_test), i.e. a group is always at least as valuable as its subsets individually added together). We can prove this:
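First, by induction see that $\sum_{R \subset S} w(R) = v(S)$ for any $S$. And by superadditivity, $w(S) \ge 0$ for all $S$. Then we can do:

$$\sum_{i \in R} v_S(i) = \sum_{i \in R} \sum_{R' \subset S} 1_{R'}(i)\, \frac{w(R')}{|R'|} \ge \sum_{i \in R} \sum_{R' \subset R} 1_{R'}(i)\, \frac{w(R')}{|R'|}$$

So then $\sum_{i \in R} v_S(i) \ge \sum_{R' \subset R} w(R') = v(R)$.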
That means that the total value produced by the subset R is going to be less than (or equal to) the total of the Shapley values they obtain from participating in the whole group. Therefore, they can’t possibly all profit by excluding anyone since there’s not enough profit to go around. Presumably this is well known and has a name. It’s basically a direct extension of the ‘stand-alone test’ that Wikipedia lists, so maybe it’s the ‘stand-together test’?
So that makes me think Shapley values are what you might get after multi-party bargaining arrives at equilibrium. This is a pretty amazing topic, and a great selection of examples to explore!
Superadditivity seems rare in practice. For instance, workers should have subadditive contributions after some point. This is certainly true in the unemployment example in the post.
Perhaps there is a different scheme for dividing gains from coöperation which satisfies some of the things we want but not superadditivity, but I’m unfamiliar with one. Please let me know if you find anything in that vein, I’d love to read about some alternatives to Shapley Value.
“the addition of an unemployable worker causes … the worker’s Shapley values to drop to $208.33 (from $250).”
I would emphasize here that the “workers’” includes the unemployed one. It was not obvious to me, until about halfway through the next paragraph, and I think the next paragraph would read better with that in mind from the start.
Great explanation! I’d love to see what happens if you take into account e.g. all the construction workers whose labor built the factory, all the law enforcement officers whose labor protects the unnatural farce that is private ownership of the means of production, etc. The share of value rightfully belonging to the “owner” would get smaller and smaller if you factor in all the realities of the situation.
The owner already fully owns the factory in the beginning of these scenarios, so the results of any previous distribution with construction companies are already factored in.
If you start including various other services the factory requires to operate, those also obviously reduce the value share going to workers.
If you have one position and two potential hires, Shapley value doesn’t say that you, the employer, have to pay them the same. You hire one, and another business hires the other. Together, you and the other business “should” pay the Shapley values.
If you apply Shapley value to a situation where there are more workers than jobs, regardless of how many businesses the jobs are split between, people who can’t get jobs still have a nonzero Shapley value based on their counterfactual contribution to the enterprises they could be working at.
Sure. That seems very sensible to me. I don’t see the problem?