(Treating this as non-rhetorical, and making an effort here to say my true reasons rather than reasons which I endorse or which make me look good...)
In order of importance, starting from the most important:
1. It would take a lot of effort to turn the list of disagreements I wrote for myself into a proper post, and I decided the effort wasn’t worth it. I’m impressed by how quickly Paul wrote this response, and it wouldn’t surprise me if there are some people reading this who are now wondering if they should still post the rebuttals they’ve been drafting for the last week.
2. As someone without name recognition, I have a general fear (not unfounded, I think) of posting my opinions on alignment publicly, lest they be treated as the ramblings of a self-impressed newcomer with a shallow understanding of the field.[1] Some important context is that I’m a math grad student in the process of transitioning into a career in alignment, so I’m especially sensitive right now about safeguarding my reputation.
3. I expected (rightly) that someone more established than me would end up posting a rebuttal better than mine.
4. General anxiety around posting my thoughts (what if my ideas are dumb? what if no one takes them seriously? etc.)
5. My inside view was that List of Lethalities was somewhere between unhelpful and anti-helpful, and I was pretty mad about it.[2] I worried that if I tried to draft a reply, it would come across as angrier than I reflectively endorse. (And this would have also carried reputational costs.)
And finally, one reason which wasn’t really a big deal, maybe like 1% of my hesitance, but which I’ll include just because I think it makes a funny story:
6. This coming spring I’ll be teaching a Harvard math dept course on MIRI-style decision theory[3]. I had in mind that I might ask you (Eliezer) if you wanted to give a guest lecture. But I figured you probably wouldn’t be interested in doing so if you knew me as “the unpleasant-seeming guy who wrote an angry list of all the reasons List of Lethalities was dumb,” so.
[1] Some miscellaneous related thoughts:
- LessWrong does have a fair number of posts these days which I’d categorize as “ramblings by someone with a shallow understanding of alignment,” so I don’t begrudge anyone for starting out with a prior that mine is one such.
- Highly public discussions like the one launched by List of Lethalities seem more likely to attract such posts, relative to narrower discussions on more niche topics. This makes me especially reticent to publicly opine on discussions like this one.
[2] On the morning after List of Lethalities was published, a friend casually asked how I was doing. I replied, “I wish I had a mood ring with a color for ‘mad at Eliezer Yudkowsky’ because then you wouldn’t have to ask me how I’m doing.”
[3] Given the context, I should clarify that my inside-view doesn’t actually expect MIRI-style decision theory to be useful towards alignment; my motivation for teaching a course on the topic is just that it seems fun and was easy to plan.