The perfect Bayesian consequentialist, however, would look at the decision, estimate the chances of the decision-maker being irrational (their credibility), and promptly revise their probability estimate of ‘bad idea is actually dangerous’ upwards, enough to approve of censorship.
There are two things going on here, and you’re missing the other, important one. When a Bayesian consequentialist sees someone break a rule, they perform two operations: reduce the credibility of the rule-breaker in proportion to the damage done, and increase the probability that the rule-breaking was justified in proportion to the credibility of the rule-breaker. It’s generally a good idea to do the credibility reduction first.
Keep in mind that credibility is constructed out of actions (and, to a lesser extent, words), and that people make mistakes. This sounds like captainitis, not wisdom.
Aside:
Why would it matter?
You have three options, since you have two adjustments to do and you can use old or new values for each (but only three because you can’t use new values for both).* Adjusting credibility first (i.e. using the old value of the rule’s importance to determine the new credibility, then the new value of credibility to determine the new value of the rule’s importance) is the defensive play, and it’s generally a good idea to behave defensively.
For example, let’s say your neighbor Tim (credibility .5) tells you that there are aliens out to get him (prior probability 1e-10, say). If you adjust both using the old values, you get that Tim’s credibility has dropped massively, but your belief that aliens are out to get Tim has risen massively. If you adjust the action first (where the ‘rule’ is “don’t believe in aliens having practical effects”), your belief that aliens are out to get Tim rises massively- and then your estimate of Tim’s credibility drops only slightly. If you adjust Tim’s credibility first, you find that his credibility has dropped massively, and thus when you update the probability that aliens are out to get Tim it only bumps up slightly.
*You could iterate this a bunch of times, but that seems silly.
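For concreteness, here’s a small Python sketch of the three orderings. The 99%/1% and 50/50 likelihoods are borrowed from the numerical example further down the thread, and the particular update formulas and the 1e-10 prior are just one way of cashing out the two operations, not anything the comment specifies.

```python
# Toy model of the "three orderings" for updating on Tim's claim.
# Likelihoods (99%/1% for a reliable Tim, 50/50 for an unreliable one)
# come from the worked example later in the thread; the update formulas
# themselves are my own sketch, not the author's math.

def update_claim(p_claim, cred):
    """P(claim | Tim asserts it), treating cred as P(Tim is reliable)."""
    like_true = cred * 0.99 + (1 - cred) * 0.5   # P(assert | claim true)
    like_false = cred * 0.01 + (1 - cred) * 0.5  # P(assert | claim false)
    return p_claim * like_true / (p_claim * like_true + (1 - p_claim) * like_false)

def update_cred(cred, p_claim):
    """P(Tim is reliable | Tim asserts the claim), given P(claim) = p_claim."""
    like_rel = p_claim * 0.99 + (1 - p_claim) * 0.01  # P(assert | reliable)
    like_unrel = 0.5                                   # P(assert | unreliable)
    return cred * like_rel / (cred * like_rel + (1 - cred) * like_unrel)

c0, p0 = 0.5, 1e-10  # Tim's credibility; prior on "aliens are out to get him"

# Option 1: both updates use the old values.
both_old = (update_cred(c0, p0), update_claim(p0, c0))

# Option 2: credibility first, then use the *new* credibility for the claim.
c1 = update_cred(c0, p0)
cred_first = (c1, update_claim(p0, c1))

# Option 3: claim first, then use the *new* claim probability for credibility.
p1 = update_claim(p0, c0)
claim_first = (update_cred(c0, p1), p1)

for name, (c, p) in [("both old", both_old), ("credibility first", cred_first),
                     ("claim first", claim_first)]:
    print(f"{name:18s} credibility={c:.4f}  P(aliens)={p:.3e}")
```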
Er, any update that doesn’t use the old values for both is just wrong. If you use new values you’re double-counting the evidence.
I suppose that could be the case- I’m trying to unpack what exactly I’m thinking of when I think of ‘credibility.’ I can see strong arguments for either approach, depending on what ‘credibility’ is. Originally I was thinking of something along the lines of “prior probability that a statement they make will be correct,” but as soon as you know the content of the statement, that’s not really relevant- and so now I’m imagining something along the lines of “how much I weight unlikely statements made by them,” or, more likely for a real person, “how much effort I put into checking their statements.”
And so for the first one, it doesn’t make sense to update the credibility- if someone previously trustworthy tells you something bizarre, you weight it highly. But for the second one, it does make sense to update the credibility first- if someone previously trustworthy tells you something bizarre, you should immediately become more skeptical of that statement and subsequent ones.
But no more skeptical than is warranted by your prior probability.
Let’s say that if aliens exist, a reliable Tim has a 99% probability of saying they do. If they don’t, he has a 1% probability of saying they do.
An unreliable Tim has a 50⁄50 shot in either situation.
My priors were 50/50 on reliable/unreliable and 1,000,000/1 that they don’t exist, so the prior weights are:

reliable, exist: 1
unreliable, exist: 1
reliable, don’t exist: 1,000,000
unreliable, don’t exist: 1,000,000

Updated weights after he says they do exist:

reliable, exist: 0.99
unreliable, exist: 0.5
reliable, don’t exist: 10,000
unreliable, don’t exist: 500,000

So we now believe approximately 50 to 1 that he’s unreliable, and 510,000 to 1.49, or about 342,000 to 1, that they don’t exist.

This is what you get if you compute each new value from the old values.
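For anyone who wants to check the arithmetic, here is the same joint update in a few lines of Python; the numbers are exactly the ones above, nothing new is assumed.

```python
# Joint update over the four hypotheses, using the weights from the comment above.
priors = {
    ("reliable", "exist"): 1,
    ("unreliable", "exist"): 1,
    ("reliable", "dont_exist"): 1_000_000,
    ("unreliable", "dont_exist"): 1_000_000,
}

# P(Tim says "aliens exist" | hypothesis)
likelihood = {
    ("reliable", "exist"): 0.99,
    ("unreliable", "exist"): 0.5,
    ("reliable", "dont_exist"): 0.01,
    ("unreliable", "dont_exist"): 0.5,
}

posterior = {h: priors[h] * likelihood[h] for h in priors}

reliable = sum(w for (r, _), w in posterior.items() if r == "reliable")
unreliable = sum(w for (r, _), w in posterior.items() if r == "unreliable")
exist = sum(w for (_, e), w in posterior.items() if e == "exist")
dont = sum(w for (_, e), w in posterior.items() if e == "dont_exist")

print(posterior)             # 0.99, 0.5, 10000.0, 500000.0
print(unreliable / reliable) # ~50: odds that Tim is unreliable
print(dont / exist)          # ~342,000: odds that aliens don't exist
```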
Thanks for working that out- that made clearer to me what I think I was confused about before. What I was imagining by “update credibility based on their statement” was configuring your credibility estimate to the statement in question- but rather than ‘updating’ that’s just doing a lookup to figure out what Tim’s credibility is for this class of statements.
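A crude way to picture the distinction between updating a single credibility number and just looking credibility up per class of statement; the topics and numbers below are made up purely for illustration.

```python
# Two pictures of "credibility": a single scalar you revise after every
# statement, versus a per-topic table you simply look up. The topics and
# numbers are invented for illustration.
from collections import defaultdict

# Picture 1: one scalar, revised after each surprising statement.
tim_credibility = 0.5
tim_credibility *= 0.4  # he just said something bizarre, so discount him

# Picture 2: a lookup table keyed by the class of statement.
tim_by_topic = defaultdict(lambda: 0.5)
tim_by_topic["local gossip"] = 0.9
tim_by_topic["aliens having practical effects"] = 0.05

# "Updating" in the second picture is mostly just choosing the right row.
print(tim_by_topic["aliens having practical effects"])
```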
Looking at shokwave’s comment again with a clearer mind:

The perfect Bayesian consequentialist, however, would look at the decision, estimate the chances of the decision-maker being irrational (their credibility), and promptly revise their probability estimate of ‘bad idea is actually dangerous’ upwards, enough to approve of censorship. Nothing strange there. You appear to be downgrading SIAI’s credibility because it takes an idea seriously that you don’t; I don’t think you have enough evidence to conclude that they are reasoning imperfectly.

When you estimate the chances that the decision-maker is irrational, I feel you need to include the fact that you disagree with them now (my original position of playing defensively), instead of just looking at your past.
Why? Because it reduces the chances you get stuck in a trap- if you agree with Tim on propositions 1-10 and disagree on proposition 11, you might say “well, Tim might know something I don’t, I’ll change my position to agree with his.” Then, when you disagree on proposition 12, you look back at your history and see that you agree with Tim on everything else, so maybe he knows something you don’t. Now, even though you changed your position on proposition 11, you probably did decrease Tim’s credibility- maybe you have stored “we agreed on 10 (or 10.5 or whatever) of 11 propositions.”
So, when we ask “does SIAI censor rationally?” it seems like we should take the current incident into account before we decide whether or not to take their word on their censorship. It’s also rather helpful to ask that narrower question, instead of “is SIAI rational?”, because general rationality does not translate to competence in narrow situations.
This is a subtle part of Bayesian updating. The question “does SIAI censor rationally?” is different to “was SIAI’s decision to censor this case made rationally?” (it is different because in the second case we have some weak evidence that it was not—ie, that we as rationalists would not have made the decision they did). We used our prior for “SIAI acts rationally” to determine or derive the probability of “SIAI censors rationally” (as you astutely pointed out, general rationality is not perfectly transitive), and then used “SIAI censors rationally” as our prior for the calculation of “did SIAI censor rationally in this case”.
After our calculation, “did SIAI censor rationally in this case” is necessarily going to be lower in probability than our prior “SIAI censors rationally.” Then, we can re-assess “SIAI censors rationally” in light of the fact that one of the cases of rational censorship has a higher level of uncertainty (now, our resolved disagreement is weaker evidence that SIAI does not censor rationally). That will revise “SIAI censors rationally” downwards—but not down to the level of “did SIAI censor rationally in this case”.
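A minimal sketch of that chain of updates, with placeholder numbers: the priors, the likelihoods, and especially the crude feedback step at the end are all my own inventions, chosen only to show the qualitative shape described above, not the commenter’s math.

```python
# Sketch of the chained update: general rationality -> censors rationally ->
# this particular censorship decision -> back up to the general proposition.
# All numbers are placeholders.

p_acts_rationally = 0.9           # prior: "SIAI acts rationally" in general
p_censors_rationally = 0.8 * p_acts_rationally + 0.3 * (1 - p_acts_rationally)
# ^ general rationality only partly carries over to the narrow skill of censoring well

# Weak evidence about this case: we would not have made the decision they did.
# Suppose that observation is twice as likely if the decision was irrational.
like_if_rational, like_if_irrational = 0.4, 0.8
p_this_case = (p_censors_rationally * like_if_rational) / (
    p_censors_rationally * like_if_rational
    + (1 - p_censors_rationally) * like_if_irrational
)

# The case-level posterior is lower than the prior it started from...
assert p_this_case < p_censors_rationally

# ...and feeding the case back in drags "SIAI censors rationally" down a bit,
# but not all the way to the case-level number. (A deliberately crude stand-in
# for the proper hierarchical update: one observation averaged into the estimate.)
p_censors_rationally_after = 0.9 * p_censors_rationally + 0.1 * p_this_case
assert p_this_case < p_censors_rationally_after < p_censors_rationally

print(p_censors_rationally, p_this_case, p_censors_rationally_after)
```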
To use your example of Tim’s propositions, you would want your estimation of proposition 12 to depend not only on how much you disagreed with him on prop 11, but also on how much you agreed with him on props 1-10.
Perfect-Bayesian-Aumann-agreeing isn’t binary about agreement; it would continue to increase the probability of “stuff Tim knows that you don’t” until it becomes easier to reduce the probability of “Tim is a perfect Bayesian reasoner about aliens”. In other words, at about prop 13-14 the hypothesis “Tim is stupid with respect to aliens existing” would occur to you, and by prop 20 “Tim is stupid WRT aliens” and “Tim knows something I don’t WRT aliens” would be equally likely.
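A toy version of that race between explanations, updated once per disagreement. The priors and likelihoods are invented, tuned only so the crossover lands near prop 20 as described above.

```python
# Three competing explanations for repeated disagreement with Tim, updated
# once per disagreeing proposition. Numbers are invented for illustration.
weights = {
    "Tim is reliable and has nothing extra": 0.79,
    "Tim knows something I don't (re aliens)": 0.20,
    "Tim is unreliable re aliens": 0.01,
}
# P(Tim contradicts me on a given proposition | hypothesis)
likelihood = {
    "Tim is reliable and has nothing extra": 0.05,
    "Tim knows something I don't (re aliens)": 0.70,
    "Tim is unreliable re aliens": 0.95,
}

for prop in range(11, 25):
    for h in weights:
        weights[h] *= likelihood[h]
    if weights["Tim is unreliable re aliens"] >= weights["Tim knows something I don't (re aliens)"]:
        print(f"by proposition {prop}, 'Tim is unreliable' overtakes 'Tim knows something'")
        break
```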