I mean, that’s what the approach of ruling out rather than actively adding is meant to achieve.
Sometimes I won’t be able to say the right words for A and B simultaneously, but I can usually draw a boundary and say “look, it’s within this boundary” while conclusively (or at least convincingly) ruling out [bad thing A doesn’t like] and [bad thing B doesn’t like].
There’s that bit of advice about how you don’t really understand a thing unless you can explain it to a ten-year-old. I’d guess there’s something analogous here: if you can’t speak to both audiences at once, then either a) you’re actually holding a position that’s genuinely contra one of those groups, and just haven’t come to terms with that, or b) you just need more practice.
If you want to give a real example of a position you have a hard time expressing, I could try my hand at a genuine attempt to explain it while dodging the failure modes.
It might be that you’re assuming more shared context between the groups than I am. In my case, I’m usually thinking of LessWrongers and ML researchers as the two groups. One example would be that the word “reward” is interpreted very differently by the two.
To LessWrongers, there’s a difference between “reward” and “utility”, where “reward” refers to a signal that is computed over observations and is subject to wireheading, and “utility” refers to a function that is computed over states and is not subject to wireheading. (See e.g. this paper, though I don’t remember if it actually uses the terminology, or if that came later).
Whereas ML researchers don’t usually make this sort of distinction; to them, “reward” is roughly that-which-you-optimize. In most situations that ML researchers consider, “does the reward lead to wireheading?” does not have a well-defined answer.
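To make the contrast concrete, here’s a minimal sketch of the distinction as I understand the LessWrong usage (all names are hypothetical, purely for illustration):

```python
from dataclasses import dataclass

@dataclass
class Observation:
    sensor_reading: float  # what the agent perceives; tamperable

@dataclass
class State:
    true_paperclips: int  # how the world actually is

def reward(obs: Observation) -> float:
    # Computed over observations: an agent that can tamper with its
    # own sensors can drive this up without changing the world, which
    # is the wireheading failure mode.
    return obs.sensor_reading

def utility(state: State) -> float:
    # Computed over states: tampering with sensors doesn't change the
    # true state, so this isn't subject to wireheading the same way.
    return float(state.true_paperclips)
```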
In the ML context, I might say something like “reward learning is one way that we can avoid the problem of specification gaming”. When (some) LessWrongers read this, they think I am saying that the thing-that-leads-to-wireheading is a good solution to specification gaming, which they obviously disagree with. But in the context I was operating in, I meant “we should learn, rather than hardcode, that-which-you-optimize”, without making claims about whether the learned thing is of the type-that-leads-to-wireheading or not.
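For concreteness, one common version of reward learning fits a reward model to human preference comparisons. A toy Bradley-Terry-style sketch (hypothetical code, not any particular system’s API):

```python
import math
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def learn_reward(preference_pairs, dim, steps=2000, lr=0.1):
    """Fit a linear reward r(x) = w . x from pairs (preferred, rejected)
    of trajectory feature vectors, by gradient ascent on the
    Bradley-Terry log-likelihood of the human's choices."""
    w = [0.0] * dim
    for _ in range(steps):
        preferred, rejected = random.choice(preference_pairs)
        # P(human prefers `preferred`) under the Bradley-Terry model.
        p = 1.0 / (1.0 + math.exp(dot(w, rejected) - dot(w, preferred)))
        for i in range(dim):
            w[i] += lr * (1.0 - p) * (preferred[i] - rejected[i])
    return w

# Toy usage: two-feature trajectories; the human always prefers the
# one with the larger first feature, so that weight should come out
# positive. The learned w is then that-which-you-optimize.
pairs = [([1.0, 0.2], [0.1, 0.9]), ([0.8, 0.5], [0.3, 0.4])]
print(learn_reward(pairs, dim=2))
```

Whether the learned reward is of the wireheading-prone type is exactly the question this sketch stays silent on.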
I definitely could write all of this out every time I want to use the word “reward”, but this would be (a) incredibly tedious for me and (b) off-putting to my intended readers, if I’m spending all this time talking about some other interpretation that is completely alien to them.
All of that makes sense.
My claims in response:
It being incredibly tedious for you is part of the territory; situations in which you get to say your piece but not have to do a tedious thing are fabricated options.
You’re almost certainly correct that it’s nonzero/substantially off-putting to your readers, but I would bet at 5:1 odds that it’s still less costly than the otherwise-inevitable-according-to-your-models misunderstandings, or just not communicating your points at all. Perhaps there’s value to be found in spending a full, concentrated, effortful hour looking for a new, short, memorable phrase that hits the right note for both groups, which you could then consistently re-use?
I’m not disagreeing with you that this is hard. I think the thing that both Fabricated Options and the above essay were trying to point at is “yeah, sometimes the really hard-looking thing is nevertheless the actual best option.”
Yeah, I figured that was probably the case. Still seemed worth checking.
“You’re almost certainly correct that it’s nonzero/substantially off-putting to your readers, but I would bet at 5:1 odds that it’s still less costly than the otherwise-inevitable-according-to-your-models misunderstandings”
I’m not entirely sure what the claim that you’re putting odds on is, but usually, in my situation:
I write different pieces for different audiences
I promote the writing to the audience that I wrote for
I predict, but haven’t checked, that a significantly smaller fraction of the intended audience would read it if I couldn’t simply use that audience’s accepted jargon and instead had to explain it / rule out everything else
I find that the audience I didn’t promote it to mostly doesn’t read it (limiting the number of misunderstandings).
So I think I’m in the position where it makes sense to take cheap actions to avoid misunderstandings, but not expensive ones. I also feel constrained by the two groups having very different (effective) norms: in ML, it’s a lot more important to be concise, and it’s a lot weirder (though maybe not bad?) to propose new short phrases for existing concepts.
One benefit of blog posts is the ability to footnote terms that might be contentious. Saying “reward[1] …” and then, in footnote 1, “for Less Wrong visitors, ‘reward’ in this context means …” clarifies for anyone who needs it or might want to respond, while letting the intended audience gloss over the note and read your point with the benefit of jargon.
True! I might try that strategy more deliberately in the future.