Yeah, I thought about that as well. Trying to suppress it made it much more popular and gave it a lot of credibility. If they decided to act in such a way deliberately, that would be fascinating. But that sounds like one crazy conspiracy theory to me.
I don’t think it gave it a lot of credibility. Everyone I can think of who isn’t an AI researcher or LW regular who’s read it has immediately thought “that’s ridiculous. You’re seriously concerned about this as a likely consequence? Have you even heard of the Old Testament, or Harlan Ellison? Do you think your AI will avoid reading either?” Note, not the idea itself, but that SIAI took the idea so seriously it suppressed it and keeps trying to. This does not make SIAI look more credible, but less, because it looks strange.
These are the people running a site about refining the art of rationality; that makes discussion of this apparent spectacular multi-level failure directly on-topic. It’s also become a defining moment in the history of LessWrong and will be in every history of the site forever. Perhaps there’s some Xanatos retcon by which this can be made to work.
I just have a hard time believing that people who write essays like this could be so wrong. That’s why I allow for the possibility that they are right and that I simply do not understand the issue. Can you rule out that possibility? And if that were the case, what would it mean to spread it even further? You see, that’s my problem.
Indeed. On the other hand, humans frequently use intelligence to do much stupider things than they could have done without that degree of intelligence. Previous brilliance means that future strange ideas should be taken seriously, but not that the future ideas must be even more brilliant because they look so stupid. Ray Kurzweil is an excellent example—an undoubted genius of real achievements, but also now undoubtedly completely off the rails and well into pseudoscience. (Alkaline water!)
Ray on alkaline water:
http://glowing-health.com/alkaline-water/ray-kurzweil-alkaine-water.html
See, RationalWiki is a silly wiki full of rude people. But one thing we know a lot about is woo. That reads like a parody of woo.
Scary.
I don’t think that’s credible. Eliezer has focused much of his intelligence on avoiding “brilliant stupidity”, orders of magnitude more so than any Kurzweil-esque example.
So the thing to do in this situation is to ask them: “excuse me wtf are you doin?” And this has been done.
So far there’s been no explanation, nor even acknowledgement of how profoundly stupid this looks. This does nothing to make them look smarter.
Of course, as I noted, a truly amazing Xanatos retcon is indeed not impossible.
There is no problem.
If you observe an action (A) that you judge so absurd that it casts doubt on the agent’s (G) rationality, then your confidence (C1) in G’s rationality should decrease. If C1 was previously high, then your confidence (C2) in your judgment of A’s absurdity should decrease.
So if someone you strongly trust to be rational does something you strongly suspect to be absurd, the end result ought to be that your trust and your suspicions are both weakened. Then you can ask yourself whether, after that modification, you still trust G’s rationality enough to believe that there exist good reasons for A.
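A minimal sketch of that joint update, with invented numbers purely for illustration: treat “G is rational” and “A is actually justified” as two uncertain propositions, and condition both on the observation that G chose to do A.

```python
# Toy joint update: "G is rational" (R) and "A is actually justified" (J),
# conditioned on observing G choose A. All numbers are invented.
p_rational = 0.9      # prior trust in G's rationality (assumed)
p_justified = 0.05    # prior belief that A is a good idea (A looks absurd)

# Assumed likelihood of G choosing A under each combination of hypotheses.
likelihood = {
    ("rational", "justified"): 0.90,    # rational agents mostly do justified things
    ("rational", "unjustified"): 0.05,  # ...and rarely do genuinely absurd ones
    ("irrational", "justified"): 0.30,  # irrational agents act at some base rate
    ("irrational", "unjustified"): 0.30,
}

# Joint prior, assuming R and J are independent before the observation.
prior = {
    ("rational", "justified"): p_rational * p_justified,
    ("rational", "unjustified"): p_rational * (1 - p_justified),
    ("irrational", "justified"): (1 - p_rational) * p_justified,
    ("irrational", "unjustified"): (1 - p_rational) * (1 - p_justified),
}

unnormalized = {h: prior[h] * likelihood[h] for h in prior}
total = sum(unnormalized.values())
posterior = {h: w / total for h, w in unnormalized.items()}

p_rational_post = sum(p for (r, _), p in posterior.items() if r == "rational")
p_justified_post = sum(p for (_, j), p in posterior.items() if j == "justified")

print(f"P(G is rational):  {p_rational:.2f} -> {p_rational_post:.2f}")   # ~0.90 -> ~0.74
print(f"P(A is justified): {p_justified:.2f} -> {p_justified_post:.2f}") # ~0.05 -> ~0.37
```

Both numbers move: trust in G drops, and confidence that A is absurd drops too, which is the mutual weakening described above.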
The only reason it feels like a problem is that human brains aren’t good at this. It sometimes helps to write it all down on paper, but mostly it’s just something to practice until it gets easier.
In the meantime, what I would recommend is giving some careful thought to why you trust G, and why you think A is absurd, independent of each other. That is: what’s your evidence? Are C1 and C2 at all calibrated to observed events?
If you conclude at the end of it that one or the other is unjustified, your problem dissolves and you know which way to jump. No problem.
If you conclude that they are both justified, then your best bet is probably to assume the existence of either evidence or arguments that you’re unaware of (more or less as you’re doing now)… not because “you can’t rule out the possibility” but because it seems more likely than the alternatives. Again, no problem.
And the fact that other people don’t end up in the same place simply reflects that their prior confidences were different, presumably because their experiences were different and they don’t have perfect trust in everyone’s perfect Bayesianness. Again, no problem… you simply disagree.
Working out where you stand can be a useful exercise. In my own experience, I find it significantly diminishes my impulse to argue the point past where anything new is being said, which generally makes me happier.
This comment is also relevant.
Another thing: rationality is best expressed as a percentage, not a binary. I might look at the virtues and say “wow, I bet this guy only makes mistakes 10% of the time! That’s fantastic!”- but then when I see something that looks like a mistake, I’m not afraid to call it that. I just expect to see fewer of them.
What issue? The forbidden one? You are not even supposed to be thinking about that! For penance, go and say 30 “Hail Yudkowskys”!
You could make a similar comment about cryonics. “Everyone I can think of who isn’t a cryonics project member or LW regular who’s read [hypothetical cryonics proposal] has immediately thought ‘that’s ridiculous. You’re seriously considering this possibility?’” “People think it’s ridiculous” is not always a good argument against it.
Consider that whoever made the decision probably made it according to consequentialist ethics; the consequences of people taking the idea seriously would be worse than the consequences of censorship. As many consequentialist decisions tend to, it failed to take into account the full consequences of breaking with deontological ethics (“no censorship” is a pretty strong injunction). But LessWrong is maybe the one place on the internet you could expect not to suffer for breaking from deontological ethics.
Again, strange from a deontologist’s perspective. If you’re a deontologist, okay, your objection to the practice has been noted.
The perfect Bayesian consequentialist, however, would look at the decision, estimate the chances of the decision-maker being irrational (their credibility), and promptly revise their probability estimate of ‘bad idea is actually dangerous’ upwards, enough to approve of censorship. Nothing strange there. You appear to be downgrading SIAI’s credibility because it takes an idea seriously that you don’t—I don’t think you have enough evidence to conclude that they are reasoning imperfectly.
I’m speaking of convincing people who don’t already agree with them. SIAI and LW look silly now in ways they didn’t before.
There may be, as you posit, a good and convincing explanation for the apparently really stupid behaviour. However, to convince said outsiders (who are the ones with the currencies of money and attention), the explanation has to actually be made to said outsiders in an examinable step-by-step fashion. Otherwise they’re well within their rights, in reasonable discussion, not to be convinced. There are a lot of cranks vying for attention and money, and an organisation has to clearly show itself as better than that to avoid losing.
By the time a person can grasp the chain of inference, and by the time they are consequentialist and Aumann-agreement-savvy enough for it to work on them, they probably wouldn’t be considered outsiders. I don’t know if there’s a way around that. It is unfortunate.
To generalise your answer: “the inferential distance is too great to show people why we’re actually right.” This does indeed suck, but is indeed not reasonably avoidable.
The approach I would personally try is furiously seeding memes that make the ideas that will help close the inferential distance more plausible. See selling ideas in this excellent post.
For what it’s worth, I gather from various comments he’s made in earlier posts that EY sees the whole enterprise of LW as precisely this “furiously seeding memes” strategy.
Or at least that this is how he saw it when he started; I realize that time has passed and people change their minds.
That is, I think he believes (or believed) that understanding this particular issue depends on understanding FAI theory, which depends on understanding cognition (or at least on dissolving common misunderstandings about cognition) and rationality, and that this site (and the book he’s working on) are the best way he knows of to spread the memes that lead to the first step on that chain.
I don’t claim here that he’s right to see it that way, merely that I think he does. That is, I think he’s trying to implement the approach you’re suggesting, given his understanding of the problem.
Well, yes. (I noted it as my approach, but I can’t see another one to approach it with.) Which is why throwing LW’s intellectual integrity under the trolley like this is itself remarkable.
Well, there’s integrity, and then there’s reputation, and they’re different.
For example, my own on-three-minutes-thought proposed approach is similar to Kaminsky’s, though less urgent. (As is, I think, appropriate… more people are working on hacking internet security than on, um, whatever endeavor it is that would lead one to independently discover dangerous ideas about AI. To put it mildly.)
I think that approach has integrity, but it won’t address the issues of reputation: adopting that approach for a threat that most people consider absurd won’t make me seem any less absurd to those people.
However, discussion of the chain of reasoning is on-topic for LessWrong (discussing a spectacularly failed local chain of reasoning and how and why it failed), and continued removal of bits of the discussion does constitute throwing LessWrong’s integrity in front of the trolley.
There are two things going on here, and you’re missing the other, important one. When a Bayesian consequentialist sees someone break a rule, they perform two operations- reduce the credibility of the person breaking the rule by the damage done, and increase the probability that the rule-breaking was justified by the credibility of the rule-breaker. It’s generally a good idea to do the credibility-reduction first.
Keep in mind that credibility is constructed out of actions (and, to a lesser extent, words), and that people make mistakes. This sounds like captainitis, not wisdom.
Aside:
Why would it matter?
You have three options, since you have two adjustments to do and you can use old or new values for each (but only three because you can’t use new values for both).* Adjusting credibility first (i.e. using the old value of the rule’s importance to determine the new credibility, then the new value of credibility to determine the new value of the credibility’s importance) is the defensive play, and it’s generally a good idea to behave defensively.
For example, let’s say your neighbor Tim (credibility .5) tells you that there are aliens out to get him (prior probability 1e-10, say). If you adjust both using the old values, you get that Tim’s credibility has dropped massively, but your belief that aliens are out to get Tim has risen massively. If you adjust the action first (where the ‘rule’ is “don’t believe in aliens having practical effects”), your belief that aliens are out to get Tim rises massively- and then your estimate of Tim’s credibility drops only slightly. If you adjust Tim’s credibility first, you find that his credibility has dropped massively, and thus when you update the probability that aliens are out to get Tim it only bumps up slightly.
*You could iterate this a bunch of times, but that seems silly.
Er, any update that doesn’t use the old values for both is just wrong. If you use new values you’re double-counting the evidence.
I suppose that could be the case- I’m trying to unpack what exactly I’m thinking of when I think of ‘credibility.’ I can see strong arguments for either approach, depending on what ‘credibility’ is. Originally I was thinking of something along the lines of “prior probability a statement they make will be correct” but as soon as you know the content of the statement, that’s not really relevant- and so now I’m imagining something along the lines of “how much I weight unlikely statements made by them,” or, more likely for a real person, “how much effort I put into checking their statements.”
And so for the first one, it doesn’t make sense to update the credibility- if someone previously trustworthy tells you something bizarre, you weight it highly. But for the second one, it does make sense to update the credibility first- if someone previously trustworthy tells you something bizarre, you should immediately become more skeptical of that statement and subsequent ones.
But no more skeptical than is warranted by your prior probability.
Let’s say that if aliens exist, a reliable Tim has a 99% probability of saying they do. If they don’t, he has a 1% probability of saying they do.
An unreliable Tim has a 50⁄50 shot in either situation.
My priors were 50⁄50 reliable/unreliable and 1,000,000⁄1 don’t exist/exist, so the prior weights are:
reliable, exist: 1
unreliable, exist: 1
reliable, don’t exist: 1,000,000
unreliable, don’t exist: 1,000,000
Updates after he says they do:
reliable, exist: .99
unreliable, exist: .5
reliable, don’t exist: 10,000
unreliable, don’t exist: 500,000
So we now believe approximately 50 to 1 that he’s unreliable, and 510,000 to 1.49, or roughly 342,000 to 1, that they don’t exist.
This is what you get if you decide each of the new based on the old.
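For anyone who wants to check it, here is the same arithmetic in Python, using the prior weights and likelihoods stated above:

```python
# The joint update above: (reliable?, aliens exist?) after Tim says "they exist".
prior_weights = {
    ("reliable", "exist"): 1,
    ("unreliable", "exist"): 1,
    ("reliable", "don't exist"): 1_000_000,
    ("unreliable", "don't exist"): 1_000_000,
}

# P(Tim says "aliens exist" | hypothesis), as stated above.
p_says_exist = {
    ("reliable", "exist"): 0.99,
    ("reliable", "don't exist"): 0.01,
    ("unreliable", "exist"): 0.50,
    ("unreliable", "don't exist"): 0.50,
}

posterior = {h: prior_weights[h] * p_says_exist[h] for h in prior_weights}

reliable = sum(w for (r, _), w in posterior.items() if r == "reliable")
unreliable = sum(w for (r, _), w in posterior.items() if r == "unreliable")
exist = sum(w for (_, e), w in posterior.items() if e == "exist")
dont_exist = sum(w for (_, e), w in posterior.items() if e == "don't exist")

print(f"unreliable : reliable  ~ {unreliable / reliable:,.0f} to 1")  # ~50 to 1
print(f"don't exist : exist    ~ {dont_exist / exist:,.0f} to 1")     # ~342,282 to 1
```

Which reproduces the roughly 50 to 1 and 342,000 to 1 figures.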
Thanks for working that out- that made clearer to me what I think I was confused about before. What I was imagining by “update credibility based on their statement” was configuring your credibility estimate to the statement in question- but rather than ‘updating’ that’s just doing a lookup to figure out what Tim’s credibility is for this class of statements.
Looking at shokwave’s comment again with a clearer mind:
When you estimate the chances that the decision-maker is irrational, I feel you need to include the fact that you disagree with them now (my original position of playing defensively), instead of just looking at your past.
Why? Because it reduces the chances you get stuck in a trap- if you agree with Tim on propositions 1-10 and disagree on proposition 11, you might say “well, Tim might know something I don’t, I’ll change my position to agree with his.” Then, when you disagree on proposition 12, you look back at your history and see that you agree with Tim on everything else, so maybe he knows something you don’t. Now, even though you changed your position on proposition 11, you probably did decrease Tim’s credibility- maybe you have stored “we agreed on 10 (or 10.5 or whatever) of 11 propositions.”
So, when we ask “does SIAI censor rationally?” it seems like we should take the current incident into account before we decide whether or not to take their word on their censorship. It’s also rather helpful to ask that narrower question, instead of “is SIAI rational?”, because general rationality does not translate to competence in narrow situations.
This is a subtle part of Bayesian updating. The question “does SIAI censor rationally?” is different to “was SIAI’s decision to censor this case made rationally?” (it is different because in the second case we have some weak evidence that it was not—ie, that we as rationalists would not have made the decision they did). We used our prior for “SIAI acts rationally” to determine or derive the probability of “SIAI censors rationally” (as you astutely pointed out, general rationality is not perfectly transitive), and then used “SIAI censors rationally” as our prior for the calculation of “did SIAI censor rationally in this case”.
After our calculation, “did SIAI censor rationally in this case” is necessarily going to be lower in probability than our prior “SIAI censors rationally.” Then, we can re-assess “SIAI censors rationally” in light of the fact that one of the cases of rational censorship has a higher level of uncertainty (now, our resolved disagreement is weaker evidence that SIAI does not censor rationally). That will revise “SIAI censors rationally” downwards—but not down to the level of “did SIAI censor rationally in this case”.
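Here’s a toy version of that two-level update in Python, with invented numbers just to show the shape of the relationship (none of these probabilities are anyone’s actual estimates):

```python
# Two-level update, all numbers invented for illustration.
# H = "SIAI censors rationally" (general), C = "this censorship decision was rational",
# E = the weak evidence against it (we would not have made the decision ourselves).
p_H = 0.90              # assumed prior on the general hypothesis
p_C_given_H = 0.95      # even a generally rational censor occasionally errs
p_C_given_not_H = 0.50  # an unreliable censor is roughly a coin flip
p_E_given_C = 0.20      # we might disagree even with a rational decision
p_E_given_not_C = 0.80  # we would usually disagree with an irrational one

# Prior on the particular case, derived from the general hypothesis.
p_C = p_H * p_C_given_H + (1 - p_H) * p_C_given_not_H

# Posterior on the particular case, given the disagreement E.
p_E = p_C * p_E_given_C + (1 - p_C) * p_E_given_not_C
p_C_post = p_C * p_E_given_C / p_E

# Posterior on the general hypothesis, given the same E.
p_E_given_H = p_C_given_H * p_E_given_C + (1 - p_C_given_H) * p_E_given_not_C
p_E_given_not_H = p_C_given_not_H * p_E_given_C + (1 - p_C_given_not_H) * p_E_given_not_C
p_H_post = p_H * p_E_given_H / (p_H * p_E_given_H + (1 - p_H) * p_E_given_not_H)

print(f"this case was rational:  {p_C:.2f} -> {p_C_post:.2f}")   # drops to ~0.70
print(f"SIAI censors rationally: {p_H:.2f} -> {p_H_post:.2f}")   # drops to ~0.81, a smaller hit
```

The particular decision ends up less probable than the general hypothesis, and the general hypothesis is revised downwards but not all the way down to the case-level number, which is the relationship described above.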
To use your example of Tim’s propositions, you would want your estimation of proposition 12 to depend not only on how much you disagreed with him on prop 11, but also on how much you agreed with him on props 1-10.
Perfect-Bayesian-Aumann-agreeing isn’t binary about agreement; it would continue to increase the value of “stuff Tim knows that you don’t” until it’s easier to reduce the value of “Tim is a perfect Bayesian reasoner about aliens”—in other words, at about prop 13-14 the hypothesis “Tim is stupid with respect to aliens existing” would occur to you, and at prop 20 “Tim is stupid WRT aliens” and “Tim knows something I don’t WRT aliens” would be equally likely.
It was left up for ages before the censorship. The Streisand effect is well known. Yes, this is a crazy kind of marketing stunt—but also one that shows Yu’El’s compassion for the tender and unprotected minds of his flock—his power over the other participants—and one that adds to the community folklore.