Here are the Latest Posts I see on my front page and how I feel about them (if I read them: what I remember, liked or disliked; if I didn’t read them: my expectations and prejudices):
Shallow review of live agendas in alignment & safety: I think this is a pretty good overview, I’ve heard that people in the field find these useful. I haven’t gotten much out of it yet, but I will probably refer to it or point others to it in the future. (I made a few very small contributions to the post)
Social Dark Matter: I read this a week or so ago. I think I remember the following idea: “By behaving in ways that seem innocuous to me but make some people not feel safe around me, I may be filtering information, and therefore underestimating the prevalence of a lot of phenomena in society”. This seems true and important, but I haven’t actually spent time thinking about how to apply it to my life, e.g. thinking about what information I may be filtering.
The LessWrong 2022 Review: I haven’t read this post. Thinking about it now does make me want to review some posts if I find the time :-)
Deep Forgetting & Unlearning for Safely-Scoped LLMs: I skimmed this, and I agree that this is a promising direction for research, both because of the direct applications and because I want a better scientific understanding of the “deep” in the title. I’ve talked about unlearning something like once every 10 days for the past month and a half, so I expect to talk about it in the future. When I do I’ll likely link to this.
Speaking to Congressional staffers about AI risk: I read this dialogue earlier today and enjoyed it. Things I think I remember (not checking): staffers are more open-minded than you might expect + would love to speak to technical people, people overestimate how much “inside game” is happening, it would be better if DC AI-X-risk related people just blurted out what they think but also it’s complicated, Akash thought Master of the Senate was useful to understand Congress (even though it took place decades ago!).
How do you feel about LessWrong these days?: I’m here! Good to ask for feedback.
We’re all in this together: Haven’t read and don’t expect to read. I don’t feel excited about Orthogonal’s work and don’t share [EDIT: agree with] my understanding of their beliefs. This being said I haven’t put work into understanding their worldview, I couldn’t pass Tamsin’s ITT, seems there would be a lot of distance to bridge. So I’m mostly going off vibes and priors here, which is a bit sad.
On ‘Responsible Scaling Policies’ (RSPs): Haven’t read yet but will probably do so, as I want to have read almost everything there is to read about RSPs. While I’ve generally enjoyed Zvi’s AI posts, I’m not sure they have been useful to me.
EA Infrastructure Fund’s Plan to Focus on Principles-First EA: I read this quickly, like an hour ago, and felt vaguely good about it, as we say around here.
Studying The Alien Mind: Haven’t read and probably will not read. I expect the post to contain a couple of interesting bits of insight, but to be long and not clearly written. Here too I’m mostly going off vibes and priors.
A Socratic dialogue with my student: Haven’t read and probably won’t read. I think I wasn’t a fan of some past lsusr posts, so I don’t feel excited about reading a Socratic dialogue between them and their student.
**In defence of Helen Toner, Adam D’Angelo, and Tasha McCauley**: I read this earlier today, and thought it made some interesting points. I don’t know enough about the situation to know if I buy the claims (eg is it now clear that sama was planning a coup of his own? do I agree with his analysis of sama’s character?)
Neural uncertainty estimation review article (for alignment): Haven’t read it, just now skimmed to see what the post is about. I’m familiar with most of the content already so don’t expect to read it. Seems like a good review I might point others to, along with eg CAIS’s course.
[Valence series] 1. Introduction: Haven’t read it, but it seems interesting. I would like to better understand Steve Byrnes’ views since I’ve generally found his comments thoughtful.
I think a pattern is that there is a lot of content on LessWrong that:
I enjoy reading,
Is relevant to things that I care about,
Doesn’t legibly provide more than temporary value: I forget it quickly, I can’t remember it affecting my decisions, don’t recall helping a friend by pointing to it.
The devil may be in “legibly” here, eg maybe I’m getting a lot out of reading LW in diffuse ways that I can’t pin down concretely, but I doubt it. I think I should spend less time consuming LessWrong, and maybe more time commenting, posting, or dialoguing here.
I think dialogues are a great feature, because:
I generally want people who disagree to talk to each other more, in places that are not Twitter. I expect some dialogues to durably change my mind on important topics.
I think I could learn things from participating in dialogues, and the bar to doing so feels lower to me than the bar to writing a post.
ETA: I’ve been surprised recently by how many dialogues have specifically been about questions I had thought about and been a bit confused about, such as originality vs correctness, or grokking complex systems.
ETA: I like the new emojis.