I think a generic answer is “read the sequences”? Here’s a fun one:
https://www.lesswrong.com/posts/qRWfvgJG75ESLRNu9/the-crackpot-offer
I agree that that is an extremely relevant post to my current situation and general demeanor in life.
I guess I’m not willing to declare the alignment problem unsolvable just because it’s difficult, and I’m not willing to let anyone else claim to have solved it before I get to claim that I’ve solved it? And that inherently makes me a crackpot until such time as consensus reality catches up with me or I change my mind about my most deeply held values and priorities.
Are there any other posts from the sequences that you think I should read?
I’m not aware of anyone who has declared the alignment problem to be unsolvable. I have read a few people speculate that it MIGHT be unsolvable, but no real concrete attempts to show it is (though I haven’t kept up with the literature as much as some others, so perhaps I missed something).
This just seems weird. If someone else solves alignment, why would you “not let them claim to have solved it”? And how would you do that? By just refusing to recognize it even if it goes against all the available evidence and causes people to take you less seriously?
No, I would do it by rushing to publish my work before it’s been cleaned up enough to be presentable. Scientists throughout history have rushed to publish to avoid getting scooped, and to scoop others. I do not wish to be the Rosalind Franklin of the alignment problem.
Why do you care so much about being first out the door, so much so that you’re willing to look like a clown/crackpot along the way?
The existing writings, from what I can see, don’t exactly portray the writer as a bona fide genius, so at best folks will perceive you as a moderately above-average person with some odd tendencies/preferences, who got unusually lucky.
And then promptly forget about it when the genuine geniuses publish their highly credible results.
And that’s assuming it is even solvable, which seems to be increasingly not the case.
Well, I’ll just have to continue being first out the door, then, won’t I?
No, it’s not going to get you credit. That’s not how credit works in science or anywhere. It goes not to the first who had the idea, but the first who successfully popularized it. That’s not fair, but that’s how it works.
You can give yourself credit or try to argue for it based on evidence of early publication, but would delaying another day to polish your writing a little matter for being first out the door?
I’m sympathetic to your position here; I’ve struggled with similar questions, including wondering why I’m getting downvoted even after trying to get my tone right and making what seem to me like important, well-explained contributions.
Recognizing that the system isn’t going to be completely fair or efficient and working with it instead of against it is unfortunate, but it’s the smart thing to do in most situations. Attempts to work outside of the existing system only work when they’re either carefully thought out and based on a thorough understanding of why the system works as it does, or they’re extremely lucky.
Historically, I have been extremely, extremely good at delaying publication of what I felt were capabilities-relevant advances, for essentially Yudkowskyan doomer reasons. The only reward I have earned for this diligence is to be treated like a crank when I publish alignment-related research because I don’t have an extensive history of public contribution to the AI field.
Here is my speculation of what Q* is, along with a GitHub repository that implements a shitty version of it, postdated several months.
https://bittertruths.substack.com/p/what-is-q
Same here.
Ask yourself: do you want personal credit, or do you want to help save the world?
Anyway, don’t get discouraged, just learn from those answers and keep writing about those ideas. And keep learning about related ideas so you can reference them and thereby show what’s new in yours. You only got severely downvoted on one, so don’t let it get to you any more than you can help.
If the ideas are strong, they’ll win through if you keep at it.
Oh come on, I was on board with your other satire, but no rationalist actually says this sort of thing.
I wouldn’t say I really do satire? My normal metier is more “the truth, with jokes”. If I’m acting too crazy to be considered a proper rationalist, it’s usually because I am angry or at least deeply annoyed.
I think “read the sequences” is an incredibly unhelpful suggestion. It’s an unrealistically high bar for entry. The sequences are absolutely massive. It’s like saying “read the whole Bible before talking to anyone at church”, but even longer. And many newcomers already understand the vast bulk of that content. Even the more helpful selected sequences are two thousand pages.
We need a better introduction to alignment work, LessWrong community standards, and rationality. Until we have it, we need to personally be more helpful to aspiring community members.
See “The 101 Space You Will Always Have With You” for a thorough and well-argued version of this point.
If someone is too wrong, and explicitly refuses to update on feedback, it may be impossible to give them a short condensed argument.
(If someone said that Jesus was a space lizard from another galaxy who came to China 10,000 years ago, and then he publicly declared that he doesn’t actually care whether God exists or not… which specific chapter of the Bible would you recommend he read to make him understand that he is not a good fit for a Christian web forum? Merely using the “Jesus” keyword is not enough, if everything substantial is different.)
I agree. But telling them to read the sequences is still pointless.
Well, yes. I guess it’s more of an… expression of frustration. Like telling the space-lizard-Jesus guy: “Dude, have you ever read the Bible?” You don’t expect he did, and yes that is the reason why he says what he says… but you also do not really expect him to read it now.
(Then he asks you for help publishing his own space Bible.)
Well what if he bets a significant amount of money at 2000:1 odds that the Pope will officially add his space Bible to the real Bible as a third Testament after the New Testament within the span of a year?
What if he records a video of himself doing Bible study? What if he offers to pay people their current hourly rate to watch him do Bible study?
I guess the thrust of my questions here is, at what point do you feel that you become the dick for NOT helping him publish his own space Bible? At what point are you actively impeding new religious discoveries by failing to engage?
For real, literal Christianity, I think there’s no amount of cajoling or argumentation that could lead a Christian to accept the new space Bible. For one thing, until the Pope signs off on it, they would no longer be Christian if they did.
Does rationalism aspire to be more than just another provably-false religion? What would ET Jaynes say about people who fail to update on new evidence?
I agree that my suggestion was not especially helpful.
And now I have my own Sequence! I predict that it will be as unpopular as the rest of my work.
Since you explicitly asked for feedback regarding your downvotes, the “oh, woe is me, my views are so unpopular and my posts keep getting downvoted” lamentations you’ve included in a few of your posts get grating, and might end up self-fulfilling. If you’re saying unpopular things, my advice is to own it, and adopt the “haters gonna hate” attitude: ignore the downvotes completely.
Oh, I do.
(To be clear, we do have automatic regimes that restrict posting and commenting privileges for downvoted users, since we can’t really keep up with the moderation load otherwise, so there are some limits to your ability to ignore them.)
I think your automatic restriction is currently too tight. I would suggest making it decay faster.
Agreed. I haven’t suffered from this but the limits seem pretty extreme right now.
Counterpoint: I think the restriction is too loose. There are enough people out there making posts that the real issue is lack of quality, not lack of quantity.
The problem is that a long-time contributor can be heavily downvoted once and become heavily rate limited, and then it relies on them earning back their points to be able to post again. I wouldn’t say such a thing is necessarily terrible, but it seems to me to have driven away a number of people I was optimistic about who were occasionally saying something many people disagree with and getting heavily downvoted.
I’m not sure I understand this concern. For someone who posts a burst of unpopular posts (whether because of the topic, the style, or something else), rate limiting seems ideal. It prevents them from digging deeper, while still allowing them to return to positive contribution, and to focus on quality rather than quantity.
I understand it’s annoying to the poster (and I’ve been caught and annoyed myself), but I haven’t seen any rate limits that seem like a complete error. I kind of expect the mods would intervene if it were a clear problem, but I also expect the base intervention is advice to slow down.
the rate limiting doesn’t decay until they’ve been upvoted for quite a number of additional comments afterwards.
https://www.lesswrong.com/posts/hHyYph9CcYfdnoC5j/automatic-rate-limiting-on-lesswrong claims it’s net karma on last 20 posts (and last 20 within a month). And total karma, but that’s not an issue for a long-term poster who’s just gotten sidetracked to an unpopular few posts.
So yes, “quite a few”, especially if upvotes are scarcer than downvotes for the poster. But remember, during this time, they ARE posting, just not at the quantity that wasn’t working.
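For concreteness, here is a minimal sketch of the rule as that post is summarized above: net karma over the last 20 contributions, net karma over the last 20 within a month, and total karma. The names `Contribution`, `recent_net_karma` and `is_rate_limited`, the 30-day window, and both thresholds of zero are my own placeholders, not LessWrong’s actual code or cutoffs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Contribution:
    posted_at: datetime
    karma: int  # net votes on this single post or comment


def recent_net_karma(history, now, window=20, max_age=None):
    """Sum karma over the newest `window` contributions.

    If `max_age` is given, only contributions newer than that are counted.
    """
    items = sorted(history, key=lambda c: c.posted_at, reverse=True)
    if max_age is not None:
        items = [c for c in items if now - c.posted_at <= max_age]
    return sum(c.karma for c in items[:window])


def is_rate_limited(history, total_karma, now,
                    recent_threshold=0, total_threshold=0):
    """Return True if any of the three karma checks falls below its threshold.

    Both thresholds are hypothetical placeholders, not the site's real values.
    """
    last_20 = recent_net_karma(history, now)
    last_20_in_month = recent_net_karma(history, now, max_age=timedelta(days=30))
    return (last_20 < recent_threshold
            or last_20_in_month < recent_threshold
            or total_karma < total_threshold)
```

Under these assumed thresholds, a long-time poster with a few heavily downvoted recent comments stays limited until enough newer upvoted comments pull the one-month sum back up to zero, which is the “quite a few” effect described above.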
The real question is whether the poster actually changes behavior based on the downvotes and throttling. I do think it’s unfortunate that some topics could theoretically be good for LW, but end up not working. I don’t think it’s problematic that many topics and presentation styles are not possible on LW.
My understanding of my current situation is that I am not in fact rate-limited purely by automatic processes, but rather by some sort of policy decision on the part of LessWrong’s moderators.
Which is fine, I’ll just continue to post my alignment research on my substack, and occasionally dump linkposts to them in my shortform, which the mods have allowed me continued access to.
Perhaps it is about right, then.
The votes on this comment imply long vol on LW rate limiting.
I am painfully aware of this. I get awfully bored when I’m rate-limited.
And now I am officially rate-limited to one post per week. Be sure to go to my substack if you are curious about what I am up to.
I have read the sequences. Not all of them, because who has time?
Here is a video of me reading the sequences (both Eliezer’s and my own):
https://bittertruths.substack.com/p/semi-adequate-equilibria