This seems exactly backwards: if someone makes uncorrelated errors, they are probably unintentional mistakes. If someone makes correlated errors, they are better explained as part of a strategy.
I mean, there is a word for correlated errors, and that word is “bias”; so you seem to be essentially claiming that people are unbiased? I’m guessing that’s probably not what you’re trying to claim, but that is what I am concluding? Regardless, I’m saying people are biased towards this mistake.
Or really, what I’m saying is that it’s the same sort of phenomenon that Eliezer discusses here. So it could indeed be construed as a strategy, as you say; but it would not be a strategy on the part of the conscious agent, but rather a strategy on the part of the “corrupted hardware” itself. Or something like that—sorry, that’s not a great way of putting it, but I don’t really have a better one, and I hope that conveys what I’m getting at.
Like, I think you’re assuming too much awareness/agency of people. A person who makes correlated errors, and is aware of what they are doing, is executing a deliberate strategy. But lots of people who make correlated errors are just biased, or the errors are part of a built-in strategy they’re executing, not deliberately but by default, without thinking about it; a strategy that requires effort not to execute.
We should expect someone calling themselves a rationalist to be better, obviously, but, IDK, sometimes things go bad?
I can imagine, after reading the sequences, continuing to have this bias in my own thoughts, but I don’t see how I could have been so confused as to refer to it in conversation as a valid principle of epistemology.
I mean people don’t necessarily fully internalize everything they read, and in some people the “hold on what am I doing?” can be weak? <shrug>
I mean I certainly don’t want to rule out deliberate malice like you’re talking about, but neither do I think this one snippet is enough to strongly conclude it.
In most cases it seems intentional but not deliberate. People will resist pressure to change the pattern, or find new ways to execute it if the specific way they were engaged in this bias is effectively discouraged, but don’t consciously represent to themselves their intent to do it or engage in explicit means-ends reasoning about it.
Yeah, that sounds about right to me. I’m not saying that you should assume such people are harmless or anything! Just that, like, you might want to try giving them a kick first—“hey, constant vigilance, remember?” :P—and see how they respond before giving up and treating them as hostile.