Somewhat scattered reactions:

- I am really interested to see the result of this experiment.
- I think the underlying models are extremely plausible, with the next bullet point as a possible exception.
- I am aesthetically very skeptical of phrases like “absolutely reliable” (in Problem 4). I don’t think it’s possible for something to be absolutely reliable, and it seems dangerous/brittle to commit to achieving something unachievable. However, this may be primarily an aesthetic issue, since I think the solution presented in Problem 3 is very sensible.
- I don’t buy claim 4, “It does actually require a tyrant”. I agree that it isn’t always possible to achieve consensus, but I don’t think that hierarchical authority is the only way to solve that problem. Democratic Centralism is a well-tested alternative, for instance.
- I find the code of conduct worrisome, at least as presented. The rules seem likely to encourage hypocrisy and dishonesty, since they make psychologically implausible demands which are in many cases undetectable at the time of infraction. This could potentially be mitigated by norms encouraging confession/absolution for sins, but otherwise I expect it to have corrosive effects.
- I am totally uninterested in joining the experiment, despite my interest in its outcome. I would likely be interested in substantially more time-boxed activities with similar expectations.
“norms encouraging confession/absolution for sins” is a somewhat … connotation-laden … phrase, but that’s a big part of it. For instance, one of the norms I want to build is something surrounding rewarding the admission of a mistake (the cliff there is people starting to get off on making mistakes to get rewarded, but I think we can dodge it), and a MAJOR part of the regular check-ins and circles and pair debugs will be a focus on minimizing the pain and guilt of having slipped up, plus high-status people leading the way by making visible their own flaws and failings.
+1 for noticing and concern. Do you have any concrete tweaks or other suggestions that you think might mitigate?
Also: “absolute” is probably the wrong word, yeah. What I’m gesturing toward is the qualitative difference between 99% and 99.99%.
I am aesthetically very skeptical of phrases like “absolutely reliable” (in Problem 4). I don’t think it’s possible for something to be absolutely reliable, and it seems dangerous/brittle to commit to achieving something unachievable. However, this may be primarily an aesthetic issue, since I think the solution presented in Problem 3 is very sensible.
[…]
Also: “absolute” is probably the wrong word, yeah. What I’m gesturing toward is the qualitative difference between 99% and 99.99%.
There’s definitely a qualitative shift for me when something moves from “This is very likely to happen” to “This is a fact in the future and I’ll stop wondering whether it’ll happen.”
While I think it’s good to remember that 0 and 1 are not probabilities, I also think it’s worthwhile to remember that in a human being they can be implemented as something kind of like probabilities. (Otherwise Eliezer’s post wouldn’t have been needed!) Even if in a Bayesian framework we’re just moving the probability beyond some threshold (like Duncan’s 99.99%), it feels to me like a discrete shift to dropping the question about whether it’ll happen.
I think that’s a fine time to use a word like “absolute”, even if only aesthetically.
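(A quick way to see that threshold picture concretely: on a log-odds scale, which is the framing behind Eliezer’s post, 99% and 99.99% are ordinary finite points, while 1 is infinitely far away. A minimal sketch in Python, not from the original discussion:)

```python
import math

def log_odds(p):
    """Base-10 log-odds of probability p."""
    return math.log10(p / (1 - p))

for p in (0.99, 0.9999, 0.999999):
    print(f"p = {p}: log-odds = {log_odds(p):.2f}")
# p = 0.99     -> 2.00
# p = 0.9999   -> 4.00
# p = 0.999999 -> 6.00
# p = 1.0 would divide by zero: infinitely far away on this scale,
# which is the sense in which 0 and 1 are not probabilities.
```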
Yeah, there’s some switch from “am maintaining uncertainty” to “am willing to be certain and absorb the cost of an unpleasant surprise.” Or from “would not be surprised by failure” to “have decided to be surprised by failure.”
Those sound like good ideas for mitigating the corrosive effects I’m worried about.
My personal aesthetic vastly prefers opportunity framings over obligation framings, so my hypothetical version of the dragon army would present things as ideals to aspire to, rather than a code that must not be violated. (Eliezer’s Twelve Virtues of Rationality might be a reasonable model.) I think this would have less chance of being corrosive in the way I’m concerned about. However, for the same reason, it would likely have less force.
Re: absolute. I agree that there can be a qualitative difference between 99% and 99.99%. However, I’m skeptical of systems that require 99.99% reliability to work. Heuristically, I expect complex systems to be stable only if they are highly fault-tolerant and degrade gracefully. (Again, this may still be just an aesthetic difference, since your proposed system does seem to have fault-tolerance and graceful degradation built in.)
However, I’m skeptical of systems that require 99.99% reliability to work. Heuristically, I expect complex systems to be stable only if they are highly fault-tolerant and degrade gracefully.
On the other hand… look at what happens when you simply demand that level of reliability, put in the effort, and get it. From my engineering perspective, that difference looks huge. And it doesn’t stop at 99.99%; the next couple nines are useful too! The level of complexity and usefulness you can build from those components is breathtaking. It’s what makes the 21st century work.
I’d be really curious to see what happens when that same level of uncompromising reliability is demanded of social systems. Maybe it doesn’t work, maybe the analogy fails. But I want to see the answer!
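(To make “the next couple nines” concrete, here’s the standard availability arithmetic sketched in Python; this is just the textbook mapping from nines to allowed downtime per year, not anything from the proposal itself:)

```python
# Allowed downtime per year at each "number of nines" of reliability.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # ~525,960

for nines in range(2, 7):
    failure_rate = 10 ** (-nines)
    downtime_min = MINUTES_PER_YEAR * failure_rate
    print(f"{nines} nines ({1 - failure_rate:.6f}): "
          f"{downtime_min:,.1f} minutes of downtime/year")
# 2 nines (99%):     ~5,260 minutes (~3.7 days)
# 4 nines (99.99%):     ~53 minutes
# 6 nines (99.9999%):  ~0.5 minutes
```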
to see what happens when that same level of uncompromising reliability is demanded of social systems
Who exactly will be doing the demanding, and what would the price be for not delivering?

Authoritarian systems are often capable of delivering short-term reliability by demanding the head of everyone who fails (“making the trains run on time”). Of course, pretty soon they are left without any competent professionals.
Do you have examples of systems that reach this kind of reliability internally?
Most high-9 systems work by taking lots of low-9 components and relying on not all of them failing at the same time. E.g. if you have ten 95%-reliable systems that fail completely independently, and you only need one of them to work, the combined failure probability is 0.05^10 ≈ 10^-13, i.e. about thirteen nines of reliability.
Expecting a person to be 99% reliable is ridiculous. That’s like two sick days per year, ignoring all other possible causes of failing to complete a task. Instead you should build systems and organisations that have slack, so that one person failing at a particular point in time doesn’t make a project/org fail.
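(A minimal sketch of that redundancy arithmetic, using the illustrative numbers from this comment:)

```python
# Failure of a redundant group of n independent components, where any
# single working component is enough, requires all n to fail at once.
def combined_failure(p_component_fail: float, n: int) -> float:
    return p_component_fail ** n

# Ten independent 95%-reliable systems, only one needed to work:
p_fail = combined_failure(0.05, 10)
print(f"combined failure probability: {p_fail:.1e}")  # ~9.8e-14, ~13 nines

# Versus a single 99%-reliable person over ~250 working days:
expected_misses = 0.01 * 250
print(f"expected missed days/year: {expected_misses:.1f}")  # ~2.5
```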
Well, in general, I’d say achieving that reliability through redundant means is totally reasonable, whether in engineering or people-based systems.
At a component level? Lots of structural components, for example. Airplane wings stay attached with fairly high reliability, and my impression is that while there is plenty of margin in the strength of the attachment, it’s not as though the underlying bolts fail and need replacing with any regularity.
I remember an aerospace discussion about a component (a pressure switch, I think?). NASA wanted documentation for 6 9s of reliability, and expected some sort of very careful fault tree analysis and testing plan. The contractor instead used an automotive component (brake system, I think?), and produced documentation of field reliability at a level high enough to meet the requirements. Definitely an example where working to get the underlying component that reliable was probably better than building complex redundancy on top of an unreliable component.
Yeah. I’ve got a couple brilliant and highly capable friends/allies/advisors who also STRONGLY prefer opportunity framings over obligation framings. I think that’s one of the things where the pendulum has overcorrected, though—I think the rationality community as a whole is rather correctly allergic to obligation framings, because of bad experiences with badly made obligations in the past, but I think we’re missing out on an important piece of the puzzle. You can run a successful thing that’s, like, “we’ll do this every week for twelve weeks, show up as much as you like!” and you can run a successful thing that’s, like, “we’ll do this if we get enough people to commit for twelve weeks!” and I think the two styles overlap but there’s a LOT of non-overlap, and the Bay Area rationalists are missing half of that.
“we’ll do this if we get enough people to commit for twelve weeks!”
I actually totally buy this. There are some things where you just have to commit, and accept the obligations that come with that.
My hesitation primarily comes from the fact that the code of conduct seems intended to be pervasive. It even has requirements that happen entirely inside your own mind. These seem like bad features for an obligation-based system.
My model is that obligation-based systems work best when they’re concrete and specific, and limited to specific times and circumstances. “Commit to performing specified activities twice a week for twelve weeks” seems good, while “never have a mental lapse of type x” seems bad.
That makes sense, yeah. I’m hoping the cure comes from the culture-of-gentleness we referenced above, from the above-board “Yep, we’re trying to restructure our thinking here,” and from people choosing intelligently whether to opt in or opt out.
Good place to keep an eye out for problems, though. Yellow flag.
Edit: also, it’s fair to note that the bits that go on inside someone’s head often aren’t so much “you have to think X” as they are “you can’t act on ~X if that’s what you’re thinking.” Like, the agreement that, however frustrated you might FEEL about the fact that people were keeping you up, you’re in a social contract not to VENT at them, if you didn’t first ask them to stop. Similarly, maybe you don’t have the emotional resources to take the outside view/calm down when triggered, but you’re aware that everyone else will act like you should, and that your socially-accepted options are somewhat constrained. You can still do what feels right in the moment, but it’s not endorsed on a broad scale, and may cost.
it’s fair to note that the bits that go on inside someone’s head often aren’t so much “you have to think X” as they are “you can’t act on ~X if that’s what you’re thinking.”
This framing does bother me less, so that is a fair clarification. However, I don’t think it applies to some of them, particularly:
will not form negative models of other Dragons without giving those Dragons a chance to hear about and interact with them
True. Updated the wording on that one to reflect the real causality (notice negative model --> share it); will look at the others with this lens again soon. Thanks.