The first time I read this, I think my top-level personal takeaway was: ‘Woah, this is complicated. I can barely follow the structure of some of these sentences in section 7, and I definitely don’t feel like I’ve spent enough time meditating on my counterfactual selves’ preferences or cultivating a wizard’s metacognitive habits to be able to apply this framework in a really principled way. This hard-to-discuss topic seems like even more of a minefield now.’
My takeaways are different on a second read:
1. Practicing thinking and talking like a wizard seems really, really valuable. (See Honesty: Beyond Internal Truth.) Bells and whistles like “coming up with an iron-clad approach to Glomarization” seem much less important, and shouldn’t get in the way of core skill-building. It really does seem to make me healthier when I’m in the mindset of treating “always speak literal truth” as my fallback, complicated meta-honesty schemes aside.
2. It’s possible to do easier modified versions of the thing Eliezer’s talking about. E.g., if I’m worried that I’m not experienced or fast enough on my feet to have a meta-honesty talk in real time, there’s nothing stopping me from opting out of those conversations and requesting that people take in-person queries about honesty to email, where I can spend more time thinking through my responses. This post is a useful thing I can link to when explaining why a normal person who’s trying to do the right thing might want to be really careful about how they discuss this issue.
3. The name Eliezer rejected, “Bayesian honesty”, seems closer to the aspect of this proposal that’s useful to me to focus on. In my head, the “meta-honesty” concept currently feels sort of finicky / defensive / rules-lawyery, maybe because of the particular vivid examples Eliezer used in his post. I’m finding it more useful to keep my focus on the central idea of ‘in cases where I don’t want to improve others’ models, at least try not to give others Bayesian evidence for falsehoods’. It then naturally falls out of this focus that I’ll want to share information about myself that will give people an accurate prior about which kinds of hypotheses my statements provide evidence for. It’s generally useful and prosocial to give allies and people in my community that sort of background model of how I tend to think and act, whether under the “meta-honesty” frame or other frames.
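To make the “Bayesian honesty” framing concrete, here’s a minimal sketch (the setup and the numbers are mine, not Eliezer’s). A listener who hears my response $R$ updates their odds on a hypothesis $H$ via the odds form of Bayes’ theorem:

$$\frac{P(H \mid R)}{P(\neg H \mid R)} = \frac{P(R \mid H)}{P(R \mid \neg H)} \cdot \frac{P(H)}{P(\neg H)}$$

My response is evidence exactly when the likelihood ratio differs from 1. If, hypothetically, I only Glomarize when there’s a secret to protect, say $P(\text{glomar} \mid H) = 0.9$ versus $P(\text{glomar} \mid \neg H) = 0.1$, then saying “I can’t confirm or deny” multiplies the listener’s odds on $H$ by 9: strong evidence, even though I never uttered a falsehood. If instead I Glomarize at the same rate whether or not $H$ holds, the ratio is 1 and my response carries no evidence either way. That’s the sense in which sharing an accurate background model of my habits determines what my individual statements actually reveal.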