A tricky thing about feedback on LW (or maybe just human nature or webforum nature):
Post: Maybe there’s a target out there let’s all go look (50 points)
Comments: so inspiring! We should all go look!
Post: What “target” really means (100 points)
Comments: I feel much less confused, thank you
Post: I shot an arrow at the target (5 points)
Comments: bro you missed
Post: Target probably in the NW cavern in the SE canyon (1 point)
Comments: doubt it
Post: Targets and arrows—a fictional allegory (500 points)
Comments: I am totally Edd in this story
Post: I hit the target. Target is dead. I have the head. (40 points)
Comments: thanks. cool.
Basically, if you try to actually do a thing or be particularly specific/concrete then you are held to a much higher standard.
There are some counterexamples. And LW is better than lots of sites.
Nonetheless, I feel like I get a warm welcome here to talk bullshit around the water cooler but angry stares when I try to mortar a few bricks.
I feel like this is almost a good site for getting your hands dirty and getting feedback and such. A more positive culture towards actual shots on target would be sufficient, I think. Not sure how that could be achieved.
Maybe this is like publication culture vs workshop culture or something.
Unpolished first thoughts:
Selection effect: people who come to a blog to read do so because they like reading, not doing.
Concrete things are hard reads (math-heavy posts), and it doesn’t feel OK to vote when you don’t actually understand.
In general, easier things have a wider audience.
Making someone change their mind is more valuable to them than telling them you did something?
There are many small targets and few big ideas/frames, so votes are distributed proportionally.
It’s not perfect, but one approach I saw on here and liked a lot was @turntrout’s MATS team’s approach for some of the initial shard theory work: they made an initial post outlining the problem and soliciting predictions on a set of concrete questions (which gave a nice affordance for engagement, namely “make predictions and maybe comment on your predictions”), and then they made a follow-up post with their actual results. It seemed to get quite good engagement.
A confounding factor, though, was that it was also an unusually impressive bit of research.
At least as far as safety research goes, concrete empirical safety research is often well received.
I think you’re directionally correct, and I would like to see LessWrong reward concrete work more. But I think your analysis suffers from survivorship bias: lots of “look at the target” posts die on the vine, so you never see their low karma, and decent arrow-shot posts tend to get more like 50 karma even when the comments section is empty.
I think a large cause might be that posts talking about the target are more accessible to a larger number of people. Posts like List of Lethalities are understandable to people who aren’t alignment researchers, while something like the original Latent Adversarial Training post (which used to be my candidate for the lowest votes-to-promise ratio) is mostly relevant to, and understandable by, people who think about inner alignment or adversarial robustness. This is to say nothing of posts with more technical content.
This seems like an issue with the territory: there are far more people who want to read about alignment than people who work on alignment. The LW admins already try to counter similar effects by maintaining high walls for the garden and with the karma-weighted voting system. On the other hand, it’s not clear that pushing further along those dimensions would make this problem better; plausibly you need slightly different mechanisms to account for it. The Alignment Forum sort of seems like an attempt to address this: because of its selection effect, votes are more balanced between posts about targets and posts about attempts to reach them.
This doesn’t fully address the problem, and I think you were trying to point out the effects of not accounting for the fact that some topics have epistemic standards that are easier to meet than others, even when the harder topics are arguably more valuable. I think it’s plausible that that’s more important, but there are other ways to improve things as well[1].
When I finished writing, I realized that what you were pointing out is also somewhat applicable to this comment. You point out a problem, and focus on one cause that’s particularly large and hard to solve. I write a comment about another cause that’s plausibly smaller but easier to solve, because that meets an easier epistemic standard than failing at solving the harder problem.
I certainly see where you are coming from.
One thing that might be confounding it slightly is that (depending on the target) the reward for actually taking home the target might not be LW karma but something real. So the “I have the head” post only gives 40 karma, but the “head” might well also be worth something in the real world: if it’s some AI code toy example that does something cool, it might lead to a new job. Or if it’s something more esoteric like “meditation technique improves performance at work,” then you get the performance boost.
Data point: my journeyman posts on inconclusive lit reviews get 40-70 karma (unless I make a big claim and then retract it; those both got great numbers). But I am frequently approached to do lit reviews, and I have to assume the boring posts no one comments on contribute to the reputation that attracts those requests.
Strong agree. I think this is because, in the rest of the world, framing is a higher-status activity than filling, so independent thinkers gravitate towards the higher-status activity of framing.
Or independent thinkers try to find new frames because the ones on offer are insufficient? I think this is roughly what people mean when they say that AI is “pre-paradigmatic,” i.e., we don’t have the frames for filling to be very productive yet. Given that, I’m more sympathetic to framing posts on the margin than I am to filling ones, although I hope (and expect) that filling-type work will become more useful as we gain a better understanding of AI.
This response is specific to AI/AI alignment, right? I wasn’t “sub-tweeting” the state of AI alignment, and was more thinking of other endeavours (quantified self, paradise engineering, forecasting research).
In general, the bias towards framing can be swamped by other considerations.
I see this occurring with various pieces of Infrabayesianism, like Diffractor’s UDT posts. They’re mathematically dense (hitting the target), which makes them challenging to read… and then also challenging to discuss. There are fewer comments even from people who read the entire post, because they don’t feel competent enough to make useful commentary (with some truth behind that feeling); the silence then makes commenting even harder. At least that’s what I’ve noticed in myself, even though I enjoy & upvote those posts.
Less attention seems natural because of specialization into cognitive niches: not everyone has read all the details of SAEs or knows all the mathematics referenced in certain agent foundations posts. But it still creates a problem for socially incentivizing good research.
I don’t know if there are any great solutions. More up-weighting for research-level posts? I view the distillation idea from about a year ago as helping to draw attention towards strong (but dense) posts, but it appears to have died down. Try to revive that more?
What was the distillation idea from a year ago?
https://www.lesswrong.com/posts/zo9zKcz47JxDErFzQ/call-for-distillers
It’s a lot easier to signal the kind of intelligence that LessWrong values-in-practice by writing a philosophical treatise than by actually accomplishing something.