I’m a rising sophomore at the University of Chicago, where I co-run the EA group, founded the rationality group, and am studying philosophy and economics/cog sci. I’m largely interested in formal epistemology, metaethics, formal ethics, and decision theory, and I have minor interests in a few other areas. I think LessWrong ideas are heavily underrated in philosophy academia, though I have some contentions. I also have a blog where I post about philosophy (and other stuff sometimes) here: https://substack.com/@irrationalitycommunity?utm_source=user-menu.
Noah Birnbaum
A Talmudic Rationalist Cautionary Tale
Makes sense. Good clarification!
I think people should know that this exists (Sam Harris arguing that misaligned AI is an x-risk concern, on the Big Think YouTube channel):
Chicago – ACX Meetups Everywhere Spring 2025
Only somewhat related, but you may enjoy my post here about meta-normative principles (and various issues that arise from each).
Warren Smith has to my knowledge never managed to publish a paper in a peer-reviewed journal
He did in 2023 (no hate because this article was published in 2018) -- https://link.springer.com/article/10.1007/s10602-023-09425-w.
Caveat: A very relevant point to consider is how long you can take a leave of absence, since some universities allow you to do this indefinitely. Being able to pursue what you want/need while maintaining optionality seems Pareto better.
US AISI will be ‘gutted,’ Axios reports: https://t.co/blQY9fGL1v. This should have been expected, I think, but it still seems worth sharing.
Downvoted because 1) I don’t think people are too hesitant to downvote, and 2) I think explaining one’s reasoning is good epistemic hygiene (downvoting without explaining is like booing when you hear an idea you don’t like).
My post might be of interest.
This is very interesting research!
One potential critique I have of this (recognizing that I’m not nearly an expert on the subject, and that this may be a very stupid critique) is that being able to simulate the environment as if it were the real one, in order to test whether a model is faking alignment, seems like a pretty robust way to see whether the model is actually aligned or not. Alignment faking seems like much less of an issue if we can see how much the model is faking by (I think).
There is something to be said here, though, about how that may be quite bad once the model is “loose.”
This is a really good debate on AI doom. I thought the optimistic side offered a model that I (and maybe others) should spend more time thinking about (mostly regarding mechanistic explanation vs. extrapolation of trends, and induction vs. empiricist framings), even though I think I disagreed with a lot of it on the object level:
Saying that we should donate there as opposed to AMF, for example, is, I would argue, trolleying: you’re making tradeoffs and implicitly saying this is worth as much as that. Perhaps you’re using lower tradeoff ratios than the pain/pleasure stuff suggests, but you didn’t really mention these, and they seem important to the end claim “and for these reasons, you should donate to shrimp welfare.”
I really don’t like when people downvote so heavily without giving reasons; I think this is nicely argued!
One issue I do have is that Bob Fischer, who led the Rethink study, warned against exactly what you’re sort of doing here: taking the estimates and saying that x number of shrimp can be weighed against a human in a trolley problem. This is just one contention, but I think the point is important, and people willing to take weird/controversial ideas seriously (especially here!) should take it more seriously!
A good short post by Tyler Cowen on anti-AI Doomerism.
I recommend taking a minute to steelman the position before you decide to upvote or downvote this. Even if you disagree with the position at the object level, there is still value in knowing the models where you may be most mistaken.
I don’t know why you’re getting so many downvotes (okay, fine, I do: it’s because of your tone). Nevertheless, this is an awesome post.
New UChicago Rationality Group
Tyler Cowen often has really good takes (even some good stuff against AI as an x-risk!), but this was not one of them: https://marginalrevolution.com/marginalrevolution/2024/10/a-funny-feature-of-the-ai-doomster-argument.html
Title: A funny feature of the AI doomster argument
If you ask them whether they are short the market, many will say there is no way to short the apocalypse. But of course you can benefit from pending signs of deterioration in advance. At the very least, you can short some markets, or go long volatility, and then send those profits to Somalia to mitigate suffering for a few years before the whole world ends.
Still, in a recent informal debate at the wonderful Roots of Progress conference in Berkeley, many of the doomsters insisted to me that “the end” will come as a complete surprise, given the (supposed) deceptive abilities of AGI.
But note what they are saying. If markets will not fall at least partially in advance, they are saying the passage of time, and the events along the way, will not persuade anyone. They are saying that further contemplation of their arguments will not persuade any marginal investors, whether directly or indirectly. They are predicting that their own ideas will not spread any further.
I take those as signs of a pretty weak argument. “It will never get more persuasive than it is right now!” “There’s only so much evidence for my argument, and never any more!” Of course, by now most intelligent North Americans with an interest in these issues have heard these arguments and they are most decidedly not persuaded.
There is also a funny epistemic angle here. If the next say twenty years of evidence and argumentation are not going to persuade anyone else at the margin, why should you be holding this view right now? What is it that you know, that is so resistant to spread and persuasion over the course of the next twenty years?
I would say that to ask such questions is to answer them.
Thanks! Honestly, I think this kind of project needs to get much more appreciation and should be done more by those who are very confident in their positions and would like to steelman the other side. I also often hear from people who are very confident about their beliefs but truly have no idea what the best counterarguments are. Maybe this is uncommon, but I went to an in-person rationalist meetup like last week, and the people were really confident but hadn’t heard of a bunch of these counterarguments, which I thought was not at all in the LessWrong spirit. That interaction was one of my inspirations for the post.
I think I agree, but I’m having a bit of trouble understanding how you would evaluate arguments so differently from how I do now. I would say my method is pretty different from that of Twitter debates (in many ways, I am very sympathetic to and influenced by the LessWrong approach). I could have made a list of cruxes for each argument, but I didn’t want the post to be too long, since far fewer people would read it, which is why I recommended that people first get a grasp on the general arguments for AI being an existential risk right at the beginning (adding a credence or range, I think, is pretty silly given that people should be able to assign their own, and I’m just some random undergrad on the internet).
Yep, I totally agree. I don’t personally take the argument super seriously (though I attempted to steelman it, as I think other people take it very seriously). I was initially going to respond to every argument, but I didn’t want to make a 40+ minute post. I also did qualify that claim a bunch (as I did with others, like the intractability argument).
Fair point. I do think the LeCun argument misunderstands a bunch about different aspects of the debate, but he’s probably smarter than me.
I think I’m gonna have to just disagree here. While I definitely think finding cruxes is extremely important (and this sometimes requires much back and forth), there is a certain way arguments can go back and forth that I tend to think has little (and should have little) influence on beliefs. I’m open to being wrong, though!
Different but related point:
Generally, I think I largely agree with many of the things you’ve said and just appreciate the outside view more: a modest epistemology of sorts. Even if I don’t find an argument super compelling, if a bunch of people I think are pretty smart do (Yann LeCun has done some groundbreaking work in AI, so that seems like a reason to take him seriously), I’m still gonna write about it. This is another reason why I didn’t put credences on these arguments: let the people decide!
Here’s an argument against this view: yes, there is some cost associated with helping the citizens of a country, and the benefit becomes less great as you become a rentier state. However, while the benefits go down and economic prosperity concentrates more and more among the very few due to AGI, the cost of providing a decent quality of life to others in the society also becomes significantly cheaper. It is not clear that the rate at which the benefits diminish actually outpaces the reduction in the cost of helping people.
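To make that last claim a bit more concrete, here’s a toy formalization (the symbols and functional forms are just illustrative assumptions of mine, not anything from the post): let B(t) be the benefit the powerful get from helping ordinary citizens at time t, and C(t) the cost of providing a decent quality of life to one citizen. Helping stays worthwhile whenever B(t) > C(t), and if both decline over time, what matters is their relative rates:

\[
\frac{d}{dt}\ln\frac{B(t)}{C(t)} \;=\; \frac{B'(t)}{B(t)} - \frac{C'(t)}{C(t)} \;\ge\; 0
\quad\Longrightarrow\quad
\frac{B(t)}{C(t)} \text{ never falls below its starting value.}
\]

So as long as costs fall proportionally at least as fast as benefits do, helping that starts out worthwhile stays worthwhile; the pessimistic conclusion needs the benefits to decay faster than the costs, and that’s exactly the step I don’t think has been established.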
In response to this, one might say something like: regular people become totally obsolete with respect to efficiency, and the costs, while reduced, stay positive. However, this really depends on how you think human psychology works. While some people would turn on humans the second they can, there are likely some people who will just keep being empathetic (perhaps this is merely a vestigial trait from the past, but that’s irrelevant: the value exists now, and some people might be willing to pay some cost to avoid abandoning this value, even beyond their own lives). We have a similar situation in our world, namely animals: while people aren’t motivated to care about animals for power reasons (they could do all the factory farming they want, and it would be better for them), some still do (I take it that this is a vestigial trait of generalizing empathy to the abstract, but as stated, the explanation for why this comes to be seems largely irrelevant).
Because of how cheap it is to actually help someone in this world, you may just need one or a few people to care just a little bit about helping others, and that could make everyone better off. Given that we have a bunch of vegans now (the equivalent of empathetic but powerful people post-AGI), and depending on how low the costs are to make lives happy (presumably there is a negative correlation between the cost of making lives better and the inequality of power, money, etc.), it might be the case that regular citizens end up pretty alright on the other side.
Curious what people think about this!
Also, many of the links at the beginning (YouTube, World Bank, Rentier states, etc.) don’t work.