My point is there’s a very tenuous jump from us making decisions to how/whether to enforce our preferences on others.
I think the big link I would point to is “politics/economics.” The spherical cows in a vacuum model of a modern democracy might be something like “a bunch of agents with different goals, that use voting as a consensus-building and standardization mechanism to decide what rules they want enforced, and contribute resources towards the costs of that enforcement.”
When it comes to notions of fairness, I think we agree that there is no single standard which applies in all domains in all places. I would frame it as an XKCD 927 situation, where there are multiple standards being applied in different jurisdictions, and within the same jurisdiction when it comes to different domains. (E.g. restitution vs damages.)
When it comes to a fungible resource like money or pie, I believe Yudkowsky’s take is “a fair split is an equal split of the resource itself.” One third each for three people deciding how to split a pie. There are well-defined extensions for different types of non-fungibility, and the type of “fairness” achieved seems to be domain-specific.
There are also results in game theory regarding “what does a good outcome for bargaining games look like?” These are also well-defined, and requiring different axioms leads to different bargaining solutions. My current favorite way of defining “fairness” for a bargaining game is the Kalai-Smorodinsky bargaining solution. At the meta-level I’m more confident about the attractive qualities of Yudkowsky’s probabilistic rejection model, which include working pretty well even when participants disagree about how to define “fairness”, and not giving anyone an incentive to exaggerate what they think is fair for them to receive. (Source might contain spoilers for Project Lawful, but Yudkowsky describes the probabilistic rejection model here, and I discuss it more here.)
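For concreteness, here is a rough sketch of how the Kalai-Smorodinsky solution can be computed for a two-player game. It assumes the disagreement point is (0, 0) and that the Pareto frontier is given as a decreasing function from player 1's payoff to player 2's best achievable payoff. The function name `kalai_smorodinsky` and its parameters are made up for illustration; the solution itself is the frontier point where each player gets the same fraction of their best-case payoff.

```python
def kalai_smorodinsky(frontier, x_max, tol=1e-9):
    """Approximate the Kalai-Smorodinsky point by bisection.

    Assumptions (illustrative, not from the thread):
      - disagreement point is (0, 0)
      - frontier(x) is player 2's best payoff when player 1 gets x,
        and it is decreasing on [0, x_max]
      - frontier(0) > 0 and x_max > 0
    """
    y_max = frontier(0.0)  # player 2's best-case payoff
    lo, hi = 0.0, x_max
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # If player 2 still gets a larger fraction of their ideal than
        # player 1 does, player 1's share can grow.
        if frontier(mid) / y_max > mid / x_max:
            lo = mid
        else:
            hi = mid
    x = (lo + hi) / 2
    return x, frontier(x)


# Splitting $100 with linear utilities: the symmetric case.
print(kalai_smorodinsky(lambda x: 100 - x, 100))

# An asymmetric frontier where player 2 can get at most 50.
print(kalai_smorodinsky(lambda x: 50 - x / 2, 100))
```

In the symmetric case this lands on an even split, and in the asymmetric case each player ends up with half of their own best-case payoff, which is the proportionality property the solution is built around.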
Applying Yudkowsky’s Algorithm to the labor scenario you described might look like having more fairness-oriented negotiations about “under what circumstances a worker can be fired”, “what compensation fired workers can expect to receive”, and “how much additional work can other workers be expected to perform without an increase in marginal compensation rate.” That negotiation might happen at the level of individual workers, unions, labor regulations, or a convoluted patchwork of those and more. I think historically we’ve made significant gains in defining and enforcing standards for things like fair wages and adequate safety.
I love the probabilistic rejection idea; it’s clever and fun. But it depends a LOT on communication or repetition-with-identity, so that the offerer has any clue that’s the algorithm in play. And in that case, the probabilistic element is unnecessary: simple precommitment is enough (and, in strictly-controlled games without repetition, allowing the responder to publicly and enforceably precommit just reverses the positions).
I think our main disagreement is on what to do when one or more participants in one-shot (or fixed-length) games are truly selfish, and the payouts listed are fully correct in utility, after accounting for any empathy or desire for fairness. Taboo “fair”, and substitute “optimizing for self”. Shapley values are a good indicator of bargaining power for some kinds of game, but the assumption of symmetry is hard to justify.
Totally! One of the most impressive results I’ve seen for one-shot games is the Robust Cooperation paper studying the open-source Prisoners’ Dilemma, where each player delegates their decision to a program that will learn the exact source code of the other delegate at runtime. Even utterly selfish agents have an incentive to delegate their decision to a program like FairBot or PrudentBot.
I think the probabilistic element helps to preserve expected utility in cases where the demands from each negotiator exceed the total amount of resources being bargained over. If each precommits to demand $60 when splitting $100, deterministic rejection leads to ($0, $0) with 100% probability, whereas probabilistic rejection calls for the evaluator to accept with probability slightly less than $40/$60 ≈ 66.67%. Accepting leads to a payoff of ($60, $40), for an expected joint utility of slightly less than ($40, $26.67).
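The arithmetic in that example can be sketched in a few lines of Python. This is just my reading of the rule, not a canonical implementation: the function names, the `epsilon` parameter (standing in for “slightly less than”), and its default value are all assumptions made for illustration.

```python
def acceptance_probability(total, proposer_demand, responder_fair_share,
                           epsilon=0.01):
    """Probability that the responder accepts the proposer's demand.

    The responder believes `responder_fair_share` of `total` is fair for
    themselves, so what they consider fair for the proposer is the rest.
    `epsilon` is an illustrative stand-in for "slightly less than".
    """
    proposer_fair = max(0.0, total - responder_fair_share)
    if proposer_demand <= proposer_fair:
        return 1.0  # the demand is fair (or generous): always accept
    # Accept just rarely enough that demanding more than the fair amount
    # has a lower expected payoff than demanding the fair amount.
    return max(0.0, proposer_fair / proposer_demand - epsilon)


def expected_payoffs(total, proposer_demand, responder_fair_share,
                     epsilon=0.01):
    """Expected (proposer, responder) payoffs; rejection gives (0, 0)."""
    p = acceptance_probability(total, proposer_demand,
                               responder_fair_share, epsilon)
    return p * proposer_demand, p * (total - proposer_demand)


# Both sides think $60 of the $100 is fair for themselves.
p = acceptance_probability(100, 60, 60)
proposer_ev, responder_ev = expected_payoffs(100, 60, 60)
print(f"accept with p = {p:.4f}")
print(f"expected payoffs = ({proposer_ev:.2f}, {responder_ev:.2f})")
```

With any positive `epsilon`, the proposer’s expected payoff from demanding $60 comes out just under the $40 they would get for sure by demanding what the responder considers fair, which is the property that removes the incentive to exaggerate.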
I think there are also definitely situations where the asymmetrical power dynamics you’re talking about mean that one agent gets to dictate terms and the other gets what they get, such as “Alice gets to unilaterally decide how $100 will be split, and Bob gets whatever Alice gives him.” In the one-shot version of this with selfish players, Alice just takes the $100 and Bob gets $0. Any hope for getting a selfish Alice to do anything else is going to come from incentives beyond this one interaction.