Independent alignment researcher
I have signed no contracts or agreements whose existence I cannot mention.
After that I was writing shorter posts, but without the long context the things I write are very counterintuitive. So they got ruined.
This sounds like a rationalization. It seems much more likely that the ideas just aren’t that high quality if you need a whole hour for a single argument that couldn’t possibly be broken up into smaller pieces that don’t suck.
Edit: Since if the long post is disliked, you can say “well, they just didn’t read it”, and if the short post is disliked you can say “well, it just sucks because it’s small”. Meanwhile, it should in fact be pretty surprising if your whole 40-minute post contains no interesting, novel, or useful insight that could be explained in a blog post of reasonable length.
usually if you randomly get a downvote early instead of an upvote, so your post now has −1 karma, then no one else will open or read it
I will say that I often do read posts downvoted to −1. I will also say that much of the time the downvote is deserved, however noisy a signal it may be.
Some of my articles take 40 minutes to read, so a downvote could mean anything; downvotes give me zero information and just demotivate me more and more.
I think you should try writing shorter posts. Both for your sake (so you get more targeted information), and for the readers’ sake.
https://www.science.org/content/blog-post/alkanes-mars there are alkanes—big organic molecules—on Mars. these can be produced by abiotic processes, but usually that makes shorter chains than these. so... life? We Shall See.
Very exciting! I think the biggest “loophole” here is probably that they used a novel detection technique; maybe if we used that technique more widely, we would have to update the view that such big molecules are unlikely to be produced non-biologically.
I’m a bit skeptical; there’s a reasonable amount of passed-down wisdom I’ve heard claiming (I think justifiably) that:
1. If you write messy code and say “I’ll clean it up later”, you probably won’t. So insofar as you eventually want to discover something others build upon, you should write it cleanly from the start.
2. Clean code leads to easier extensibility, which seems pretty important, e.g. if you want to try a bunch of different small variations on the same experiment.
3. Clean code decreases the number of bugs and the time spent debugging. This seems especially useful insofar as you are trying to rule out hypotheses, or prove hypotheses, with high confidence.
4. Generally (this may be double-counting 2 and 3), paradoxically, writing clean code is faster than writing dirty code.
You say you came from a more SWE based paradigm though, so you probably know all this already.
OK, first: when naming things, I think you should do everything you can to avoid double negatives. So you should say “gym average” or “no gym average”. It’s shorter, and much less confusing.
Second, I’m still confused. Translating what you said, we’d have “no gym removed average” → “gym average” (since you remove everyone who doesn’t go to the gym, meaning the only people remaining go to the gym), and “gym removed average” → “no gym average” (since we’re removing everyone who goes to the gym, meaning the only remaining people don’t go to the gym).
Therefore we have,
gym average = no gym removed average < gym removed average = no gym average
So it looks like the gym doesn’t help, since those who don’t go to the gym have a higher average number of pushups they can do than those who go to the gym.
Note: You can verify this is the case by filtering for male respondents with male partners and female respondents with female partners in the survey data.
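For anyone who wants to run the check themselves, here is a minimal sketch, assuming hypothetical file and column names (the actual survey export will differ):

```python
import pandas as pd

# Hypothetical file and column names; adjust to the actual survey export.
df = pd.read_csv("survey.csv")

# The same-gender filter from the note above would be an extra condition
# on hypothetical respondent/partner gender columns, e.g.:
# df = df[df["gender"] == df["partner_gender"]]

gym = df[df["goes_to_gym"]]      # "no gym removed": only gym-goers remain
no_gym = df[~df["goes_to_gym"]]  # "gym removed": only non-gym-goers remain

print("gym average:   ", gym["pushups"].mean())
print("no gym average:", no_gym["pushups"].mean())
```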
I think the math works out so that the variation in profits is much more extreme at more extreme probabilities. Going from 4% to 8% is 2x profits, but going from 50% to 58% is only 1.16x profits.
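A quick sanity check of that arithmetic (my framing of it, not necessarily the original one): if you buy a contract at market price $p$ and the true probability is $q$, your expected return multiple is $q/p$:

\[
\frac{0.08}{0.04} = 2\times, \qquad \frac{0.58}{0.50} = 1.16\times
\]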
This seems likely to depend on your preferred style of research, so what is your preferred style of research?
And then if we say the bottleneck to meritocracy is mostly c rather than a or b, then in fact it seems like our society is absolutely obsessed with making our institutions highly accessible to as broad a pool of talent as possible. There are people who make a whole career out of just advocating for equality.
I work at GDM so obviously take that into account here, but in my internal conversations about external benchmarks we take cheating very seriously—we don’t want eval data to leak into training data, and have multiple lines of defense to keep that from happening.
What do you mean by “we”? Do you work on the pretraining team, talk directly with the pretraining team, are just aware of the methods the pretraining team uses, or some other thing?
More to the point, I haven’t seen people try to scale those things either. The closest might be something like TripleByte? Or headhunting companies? Certainly when I think of a typical (or 95th-99th percentile) “person who says they care a lot about meritocracy” I’m not imagining a recruiter, or someone in charge of such a firm. Are you?
I think much of venture capital is trying to scale this thing, and as you said, they don’t use the framework you use. The philosophy there is much more oriented towards making sure nobody falls through the cracks: provide the opportunity, then let the market allocate the credit.
That is, the way to scale meritocracy turns out to be maximizing c rather than the other considerations you listed, on current margins.
Also, this conclusion depends heavily on you, who have thought about this topic for all of 10 minutes, out-thinking the hypothetical people who are actually serious about meritocracy. For example, perhaps they do more one-on-one talent scouting or funding, which is indeed very, very common and seems to be much more in demand than psychometric evaluations.
Given that ~ no one really does this, I conclude that very few people are serious about moving towards a meritocracy.
The field you should look at, I think, is Industrial and Organizational Psychology, along with classic Item Response Theory.
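For a flavor of the machinery involved, the standard two-parameter logistic (2PL) model from Item Response Theory (a textbook formula, not anything specific to the discussion above) gives the probability that a test-taker with latent ability $\theta$ answers item $i$ correctly as

\[
P(x_i = 1 \mid \theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}},
\]

where $b_i$ is the item’s difficulty and $a_i$ its discrimination.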
I suspect the vast majority of that sort of name-calling is much more politically motivated than based on not seeing the right slogans. For example if you go to Pause AI’s website the first thing you see is a big, bold
and AI pause advocates are constantly arguing “no, we don’t actually believe that” to the people who call them “luddites”, but I have never actually seen anyone change their mind based on such an argument.
I don’t think Pause AI’s current bottleneck is people who are pro-AI in general not wanting to join (but of course I could be wrong). Most people are just against AI, and Pause AI’s current strategy is to make them care enough about the issue to use their feet, while also telling them “it’s much, much worse than you would’ve imagined, bro”.
That’s a whole seven words, most of which are a whole three syllables! There is no way a motto like that catches on.
An effect I noticed: going through Aella’s correlation matrix (with poorly labeled columns, sadly), a feature which strongly correlates with the length of a relationship is codependency. Plotting question 20, “The long-term routines and structure of my life are intertwined with my partner’s” (li0toxk), assuming that’s what “codependency” refers to:
The shaded region is a 95% posterior credible interval for the mean of the distribution, conditioned on the time range (binned every 2 years) and cis-male respondents, with prior .
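For anyone trying to reproduce a band like this, here is a minimal sketch under stated assumptions: a Normal likelihood with plug-in variance and a conjugate Normal prior on the mean. The actual prior didn’t survive the export above, so the hyperparameters below are placeholders:

```python
import numpy as np

def posterior_mean_interval(x, prior_mean=0.0, prior_var=10.0**2):
    """95% credible interval for the mean of x, assuming a Normal
    likelihood with plug-in variance and a conjugate
    Normal(prior_mean, prior_var) prior on the mean. These
    hyperparameters are placeholders, not the original post's prior."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    s2 = x.var(ddof=1)  # plug-in estimate of the data variance
    post_var = 1.0 / (1.0 / prior_var + n / s2)
    post_mean = post_var * (prior_mean / prior_var + x.sum() / s2)
    half = 1.96 * np.sqrt(post_var)
    return post_mean - half, post_mean + half

# Applied to each 2-year bin of relationship length, restricted to
# cis-male respondents, this yields a shaded band like the one described.
```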
Note also that codependency and sex satisfaction are basically uncorrelated.
This shouldn’t be that surprising. Of course the longer two people are together, the more their long-term routines will be caught up with each other. But this also seems like a very reasonable candidate for why people stick together even without a good sex life.
This seems right as a criticism, but it seems better placed on the EA Forum; I can’t remember the last time I heard anyone talking about ITN on LessWrong. There are many considerations ITN leaves out, which should be unsurprising given how simplified it is.
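For reference, the ITN framework (in the standard 80,000 Hours framing, which I believe is what’s being discussed) factors the cost-effectiveness of a cause as

\[
\frac{\text{good done}}{\text{extra dollar}}
= \underbrace{\frac{\text{good done}}{\%\ \text{solved}}}_{\text{importance}}
\times \underbrace{\frac{\%\ \text{solved}}{\%\ \text{more resources}}}_{\text{tractability}}
\times \underbrace{\frac{\%\ \text{more resources}}{\text{extra dollar}}}_{\text{neglectedness}},
\]

which is one way to see why considerations that don’t factor through those three ratios (timing, information value, comparative advantage) get dropped.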
Note the error bars in the original.