I… am confused?
Like, just because there is any chance of a variable becoming disconnected in a principal-agent problem doesn’t mean that it’s always a bad idea to incentivize intermediary metrics. I am not fully sure how to understand your point as anything besides “never incentivize any lead-metrics whatsoever, only ever incentivize successful output”, which seems like a recipe for sparse reward landscapes, and also not a common practice in almost any domain in which humans deal with principal-agent problems.
Your employer pays you if you show up for work, not only if you successfully get work done (at least on the day-to-day or month-to-month level). You pay your plumber if they show up, not only if they successfully fix your toilet.
Like, if you see a friend taking an action that you know and they know has a 50% chance of making $10 for you and your friend (let’s say for a communal club) and a 50% chance of losing $5, and it then turns out they lose $5, then it seems better to still reward your friend for taking that action, instead of punishing them, given that you know the action had positive expected value.
(assuming you have mostly linear value of money at these stakes)
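To make the arithmetic behind "positive expected value" concrete, here is a minimal sketch. The function name `expected_value` and the list-of-pairs representation are my own illustrative choices, and it assumes linear value of money, as stated above.

```python
def expected_value(outcomes):
    """Expected payoff of a gamble given (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

# The club gamble from the example: 50% chance of +$10, 50% chance of -$5.
club_gamble = [(0.5, 10.0), (0.5, -5.0)]

ev = expected_value(club_gamble)
print(ev)  # 2.5
```

So the action is worth +$2.50 in expectation, even in the world where it happens to lose $5.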
If they think the odds are 90% $10 and 10% −$5, and you think the odds are 10% $10 and 90% −$5, should you reward them for trying to benefit the club, or punish them for having wrong beliefs that materially matter?
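For reference, the same gamble evaluated under each set of beliefs in that question can be sketched as follows. The helper `expected_value` and the variable names are hypothetical, and linear value of money is again assumed.

```python
def expected_value(outcomes):
    """Expected payoff of a gamble given (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

# Same payoffs (+$10 / -$5), two different beliefs about the odds.
friend_beliefs = [(0.9, 10.0), (0.1, -5.0)]
your_beliefs = [(0.1, 10.0), (0.9, -5.0)]

ev_friend = expected_value(friend_beliefs)  # about +8.5
ev_you = expected_value(your_beliefs)       # about -3.5

# The action looks clearly good to the friend and clearly bad to you.
print(ev_friend > 0, ev_you < 0)
```

The disagreement is entirely about the probabilities, not the payoffs, which is what makes the reward-or-punish question hard.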
You should punish your friend for the loss, and reward them (twice as much) for a win. This creates the correct incentives.
No, because humans are risk-averse, at least in money terms, but also in most other currencies. If you do this, you increase the total risk for your friend, for no particular gain.
Punishment is also usually net-negative, whereas rewards tend to be zero-sum, so by adding a bunch of worlds where you added punishments, you destroyed a bunch of value, with no gain (in the world where you both have certainty about the payoff matrix).
One model here is that humans have diminishing returns on money, so in order to reward someone 2x with dollars, you have to pay more than 2x the dollar amount, so your total cost is higher.
A scenario with zero-sum actions and net-negative actions can only go downhill. This would seem to imply that if you have an opportunity to give feedback or not give feedback you should opt to get a guaranteed zero rather than risk destroying value.
Could you elaborate on this? I’m not at all sure what this is referring to.
Rewards are usually a transfer of resources (e.g. me giving you money), which tend to preserve total wealth (or status, or whatever other resource you are thinking about).
Unilateral punishments are usually not transfers of resources; they are usually one party imposing a cost on another party (like hitting them with a stick and injuring them), in a way that does not preserve total wealth (or health, or whatever other resource applies to the situation).
You certainly shouldn’t hit your friend with a stick if he loses $5 of your club’s money. I think this is fairly obvious, and it seems quite improbable that you were assuming that I was suggesting any such thing. So, given that we can’t possibly be talking about injuring anyone, or doing any such thing, how can your point about net-negative punishment apply? The more sensible assumption is that the punishment is of the same kind as the reward.
I think social punishments usually have the same form: rewards tend to be more of a transfer of status, and punishments more of a destruction of status (two people can destroy each other’s reputation with repeated social punishments).
There is also the bandwidth cost of punishment, as well as the simple fact that giving people praise usually comes with a positive emotional component for the receiver (in addition to the status and the reputation), whereas punishments usually come with an addition of stress and discomfort that reduces total output for a while.
In either case, I think the simpler case is made by simply looking at the assumption of diminishing returns in resources and realizing that the cost of giving someone a reward they care 2x about is usually larger than the cost of giving the reward twice, meaning that there is an inherent cost to high-variance reward landscapes.
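A rough numerical sketch of the diminishing-returns point: the square-root utility function below is purely an assumption chosen for illustration (any concave function gives the same qualitative result), and the names `utility` and `dollars_needed` are hypothetical.

```python
import math

def utility(dollars):
    # Assumed concave utility of a reward: square root of the dollar amount.
    return math.sqrt(dollars)

def dollars_needed(target_utility):
    # Inverse of utility(): dollars required to reach a given utility.
    return target_utility ** 2

base_reward = 100.0
base_utility = utility(base_reward)             # 10.0
double_cost = dollars_needed(2 * base_utility)  # 400.0
print(double_cost / base_reward)                # 4.0: 2x the utility costs 4x the dollars

# Same expected dollars ($100), paid flat versus as a high-variance scheme.
flat_expected_utility = utility(100.0)
risky_expected_utility = 0.5 * utility(200.0) + 0.5 * utility(0.0)
print(flat_expected_utility > risky_expected_utility)  # True
```

Under this assumed utility function, the high-variance scheme costs the payer the same in expectation but is worth less to the receiver, which is the inherent cost referred to above.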
If you show up, but don’t get work done, you get fired. (How quickly that happens varies from workplace to workplace, of course—but in many places it happens very quickly indeed.)
Yeah, but the fact that it takes a while, and that we have monthly wages instead of all being contractors who are paid by the piece, is kind of my point. Most of the economy does not pay for completed output, but for intermediary metrics that allow a much higher level of stability.
But note that even if you don’t get fired immediately for failing to produce satisfactory work, you are likely to receive a dressing-down from your boss, poor evaluations, etc., or even something so simple as your team leader being visibly disappointed with you, even if they take no immediate action.
Now consider what that analogizes to, in the case at hand. Is a downvote, or a critical comment, more like being fired, or more like your boss telling you that your work isn’t up to par and that you should really try to do better?
I think it’s sort of like your boss telling you your work isn’t good, when your boss also isn’t paying you and you’re there as a volunteer.
If your boss isn’t paying you, then what’s the point of the employment analogy? That’s not employment at all, is it?
… what? Of course you only pay your plumber if they successfully fix your toilet!
My experience is definitely the opposite. A random Quora question also suggests that it’s common practice in plumbing to pay someone for the attempt, not for the solution. As someone who recently hired plumbers and electricians to fix a bunch of stuff in a new house we rented, this also matches my experience. Not sure where your experience comes from.
In general, most contractors bill by the hour, not for completed output, and definitely not “output that the client thinks is worth it”, at least in my experience (there are obviously exceptions, though I found them relatively rare).
My experience comes from the same sort of thing: having, on many occasions, hired various people to do various sorts of work; and also from having worked for several years at a computer store that specialized in on-the-premises repair/service.
The Quora answer you linked doesn’t really support your point, as it’s quite clear about the prerequisite being an informed, explicit agreement between plumber and customer that the latter will pay the former regardless of outcome. (And even with that caveat, some of what the answer-giver says is suspect, and is not consistent with my experience.)
I do not know of any industry in which contractor agreements with variable payments that are dependent on the quality of the output are common practice. There is often an agreement on what it means to “complete the work”, but in almost every case both your downside and your upside are limited by a guaranteed upfront payment and a conditional final payment. It’s almost never the case that you can get 2x the money depending on the quality of your output, which seems like a necessary requirement for some of the incentive schemes you outlined.
What does this have to do with anything? You originally said:
You pay your plumber if they show up, not only if they successfully fix your toilet.
I don’t see the connection between “should you pay your plumber even if they don’t actually fix your toilet” and “should you pay your plumber twice as much if they fix your toilet twice as well”; the latter seems like a nonsensical question, and unrelated to the former.