This reminds me of the winner’s curse. When the blackmailer is optimizing for outrageousness, the outrage caused by their blackmail is predictably too much.
Winner’s Curse doesn’t seem like the right effect to me—it seems more like an orthogonality/Goodhart effect, where optimizing for outrageousness decreases the fitness w/r/t social welfare (on the margin). It’s always in the blackmailer’s interest to make the outrageousness greater, so they’re not (selfishly) sad when they overshoot.
My model: for each issue (for example, rape or homosexuality) there is an ideal amount of outrage, somewhere between “none” and “burn them at the stake”. (“Ideal” meaning the amount that best achieves human goals, or something similar.) A given culture might approximate these amounts, but with error. Sometimes it will have more outrage than ideal, and sometimes it will have less.
A blackmailer is trying to maximize potential outrage. (They have limited resources and can only blackmail so many people. If Alice’s secret would make people avoid her if it got out, and Bob’s secret would make people murder him, then the blackmailer will blackmail Bob.) The blackmailer can form a relatively accurate model of what issues their culture is most outraged about, so they will maximize outrage rather well.
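By analogy with the winner’s curse: if x is the issue that maximizes outrage, i.e. x = argmax over issues of outrage(issue), then the culture has probably overestimated the necessary amount of outrage for x. The culture’s outrage is the ideal amount plus some error, so picking the issue with the most outrage tends to pick one where the error is positive.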
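
To make the selection effect concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration, not part of the model above: the ideal outrage per issue and the culture's error are both drawn from normal distributions, the function name `simulate` and its parameters are made up, and the blackmailer is assumed to see the culture's outrage perfectly and pick the single largest one.

```python
import random


def simulate(n_issues=50, n_trials=2000, noise_sd=1.0, seed=0):
    """Return (mean error at the argmax issue, mean error across all issues).

    Error = culture's actual outrage minus the ideal outrage for that issue.
    All distributions here are illustrative assumptions.
    """
    rng = random.Random(seed)
    error_at_max = 0.0
    error_overall = 0.0
    for _ in range(n_trials):
        # Assumed: ideal outrage per issue is standard normal.
        ideal = [rng.gauss(0.0, 1.0) for _ in range(n_issues)]
        # Assumed: the culture approximates the ideal with unbiased noise.
        actual = [v + rng.gauss(0.0, noise_sd) for v in ideal]
        # The blackmailer targets the issue with the most actual outrage.
        x = max(range(n_issues), key=lambda i: actual[i])
        error_at_max += actual[x] - ideal[x]
        error_overall += sum(a - v for a, v in zip(actual, ideal)) / n_issues
    return error_at_max / n_trials, error_overall / n_trials


if __name__ == "__main__":
    at_max, overall = simulate()
    print(f"mean error at argmax issue: {at_max:.2f}")
    print(f"mean error over all issues: {overall:.2f}")
```

Under these assumptions the culture's error averages out to about zero across all issues, but is clearly positive at the issue the blackmailer selects. That is the same selection effect as in the winner's curse, and it says nothing about whether the Goodhart framing is the better description of the blackmailer's incentives.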