Oh okay. I agree it’s possible there’s no Great Filter.
Dangit I can’t cease to exist, I have stuff to do this weekend.
But more seriously, I don’t see the point you’re making? I don’t have a particular objection to your discussion of anthropic arguments, but also I don’t understand how it relates to the “what part of evolution/planetary science/sociology/etc. is the Great Filter” scientific question.
I think if you frame it as:
if most individuals exist inside the part of the light cone of an alien civilization, why aren’t we one of them?
Then yes, 1.0 influence and 4.0 influence both count as “part of the light cone”, and so for the related anthropic arguments you could choose to group them together.
But re: anthropic arguments,
Not only am I unable to explain why I’m an observer who doesn’t see aliens
This is where I think I have a different perspective. Granting that anthropic arguments (here, about which observer you are and the odds of that) cause frustration and we don’t want to get into them, I think there is an actual reason why we don’t see aliens—maybe they aren’t there, maybe they’re hiding, maybe it’s all a simulation, whatever—and there’s no strong reason to assume we can’t discover that reason. So, in that non-anthropic sense, in a more scientific inquiry sense, it is possible to explain why I’m an observer who doesn’t see aliens. We just don’t know how to do that yet. The Great Filter is one possible explanation behind the “they aren’t there” answer, and this new information adjusts what we think the filters that would make up the Great Filter might be.
Another way to think about this: suppose we discover that actually science proves life should only arise on 1 in a googol planets. That introduces interesting anthropic considerations about how we ended up as observers on that 1 planet (can’t observe if you don’t exist, yadda yadda). But what I care about here is instead, what scientific discovery proved the odds should be so low? What exactly is the Great Filter that made us so rare?
I agree it’s likely the Great Filter is behind us. And I think you’re technically right, most filters are behind us, and many are far in the past, so the “average expected date of the Great Filter” shifts backward. But, quoting my other comment:
Every other possible filter would gain equally, unless you think this implies that maybe we should discount other evolutionary steps more as well. But either way, that’s still bad on net because we lose probability mass on steps behind us.
So even though the “expected date” shifts backward, the odds for “behind us or ahead of us” shifts toward “ahead of us”.
Let me put it this way: let’s say we have 10 possible filters behind us, and 2 ahead of us. We’ve “lost” one filter behind us due to new information. So, 9 filters behind us gain a little probability mass, 1 filter behind us loses most probability mass, and 2 ahead of us gain a little probability mass. This does increase the odds that the filter is far behind us, since “animal with tool-use intelligence” is a relatively recent filter. But, because “animal with tool-use intelligence” was already behind us and a small amount of that “behind us” probability mass has now shifted to filters ahead of us, the ratio between all past filters and all future filters has adjusted slightly toward future filters.
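To make the redistribution concrete, here is a minimal sketch with made-up numbers (the equal priors and the 10x downweight are my own illustration, not anything from the original discussion):

```python
# Toy illustration: 10 candidate filters behind us, 2 ahead, with equal priors.
# New evidence downweights one past filter ("tool-using intelligence") by 10x;
# renormalizing shifts a little probability mass onto every remaining filter,
# including the future ones. All numbers are made up for illustration.

weights = {"past": [1.0] * 10, "future": [1.0] * 2}

def shares(w):
    total = sum(w["past"]) + sum(w["future"])
    return sum(w["past"]) / total, sum(w["future"]) / total

print("before:", shares(weights))   # past ~0.833, future ~0.167

weights["past"][9] *= 0.1           # evidence against one past filter
print("after: ", shares(weights))   # past ~0.820, future ~0.180: future share rises
```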
Interesting thought. I think you have a point about coevolution, but I don’t think it explains away everything in the birds vs. mammals case. How much are birds really competing with mammals vs. other birds/other animals? Mammals compete with lots of animals, why did only birds get smarter? I tend to think intra-niche/genus competition would generate most of the pressure for higher intelligence, and for whatever reason that competition doesn’t seem to lead to huge intelligence gains in most species.
(Re: octopus, cephalopods do have interactions with marine mammals. But also, their intelligence is seemingly different from mammals/birds—strong motor intelligence, but they’re not really very social or cooperative. Hard to compare but I’d put them in a lower tier than the top birds/mammals for the parts of intelligence relevant to the Fermi Paradox.)
In terms of the K-T event, I think it could plausibly qualify as a filter, but asteroid impacts of that size are common enough it can’t be the Great Filter on its own—it doesn’t seem the specific details of the impact (location/timing) are rare enough for that.
Two objections:
Granting that the decision theory that would result from reasoning based on the Fermi Paradox alone is irrational, we’d still want an answer to the question[1] of why we don’t see aliens. If we live in a universe with causes, there ought to be some reason, and I’d like to know the answer.
“why aren’t we born in a civilization which ‘sees’ an old alien civilization” is not indistinguishable from “why aren’t we born in an old [alien] civilization ourselves?” Especially assuming FTL travel limitations hold, as we generally expect, it would be pretty reasonable to expect to see evidence of interstellar civilizations expanding as we looked at galaxies hundreds of millions or billions of lightyears away—some kind of obviously unnatural behavior, such as infrared radiation from Dyson swarms replacing normal starlight in some sector of a galaxy.[2] There should be many more civilizations we can see than civilizations we can contact.
[1] I’ve seen it argued that the “Fermi Paradox” ought to be called simply the “Fermi Question” instead for reasons like this, and also that Fermi himself seems to have meant it as an honest question, not a paradox. However, it’s better known as the Paradox, and Fermi Question unfortunately collides with Fermi estimation.
[2] It is technically possible that all interstellar civilizations don’t do anything visible to us—the Dark Forest theory is one variant of this—but that would contradict the “old civilization would contact and absorb ours” part of your reasoning.
Yes. Every other possible filter would gain equally, unless you think this implies that maybe we should discount other evolutionary steps more as well. But either way, that’s still bad on net because we lose probability mass on steps behind us.
Couple takeaways here. First, quoting the article:
By comparing the bird pallium to lizard and mouse palliums, they also found that the neocortex and DVR were built with similar circuitry — however, the neurons that composed those neural circuits were distinct.
“How we end up with similar circuitry was more flexible than I would have expected,” Zaremba said. “You can build the same circuits from different cell types.”
This is a pretty surprising level of convergence for two separate evolutionary pathways to intelligence. Apparently the neural circuits are so similar that when the original seminal paper on bird brains was written in 1969, it just assumed there had to be a common ancestor, and that thinking felt so logical it held for decades afterward.
Obviously, this implies strong convergent pressures for animal intelligence. It’s not obvious to me that artificial intelligence should converge in the same way, not being subject to the same pressures all animals face, but we should maybe expect biological aliens to have intelligence more like ours than we’d previously expected.
Speaking of aliens, that’s my second takeaway: if decent-ish (birds like crows/ravens/parrots + mammals) intelligence has evolved twice on Earth, that drops the odds that the “evolve a tool-using animal with intelligence” filter is a strong Fermi Paradox filter. Thus, to explain the Fermi Paradox, we should posit increased odds that the Great Filter is in front of us. (However, my prior for the Great Filter being ahead of humanity is pretty low; we’re too close to AI and the stars—keep in mind that even a paperclipper has not been Filtered, since a Great Filter prevents any intelligence from escaping Earth.)
Both the slowdown and race models predict that the future of Humanity is mostly in the hands of the United States—the baked-in disadvantage in chips from existing sanctions on China is crippling within short timelines, and no one else is contending.
So, if the CCP takes this model seriously, they should probably blockade Taiwan tomorrow? It’s the only fast way to equalize chip access over the next few years. They’d have to weigh the risks against the chance that timelines are long enough for their homegrown chip production to catch up, but there seems to be a compelling argument for a blockade now, especially considering the US has unusually tense relations with its allies at the moment.
China doesn’t need to perform a full invasion; just a blockade would be sufficient, if you could somehow avoid escalation… though I’m not sure that you could, since the US is already taking AI more seriously than China is. (It’s noteworthy that Daniel Kokotajlo’s 2021 prediction had US chip sanctions happening in 2024, when they really happened in 2022.)
Perhaps more AI Safety effort should be going into figuring out a practical method for international cooperation; I worry we’ll face war before we get AIs that can negotiate us out of it as described in the scenarios here.
I’m generally pretty receptive to “adjust the Overton window” arguments, which is why I think it’s good PauseAI exists, but I do think there’s a cost in political capital to saying “I want a Pause, but I am willing to negotiate”. It’s easy for your opponents to cite your public Pause support and then say, “look, they want to destroy America’s main technological advantage over its rivals” or “look, they want to bomb datacenters, they’re unserious”. (Yes, Pause as typically imagined requires international treaties, but the attack lines would probably still work; there was tons of lying in the California SB 1047 fight, and we lost in the end.)
The political position AI safety has mostly taken instead on US regulation is “we just want some basic reporting and transparency” which is much harder to argue against, achievable, and still pretty valuable.
I can’t say I know for sure this is the right approach to public policy. There’s a reason politics is a dark art, there’s a lot of triangulating between “real” and “public” stances, and it’s not costless to compromise your dedication to the truth like that. But I think it’s part of why there isn’t as much support for PauseAI as you might expect. (the other main part being what 1a3orn says, that PauseAI is on the radical end of opinions in AI safety and it’s natural there’d be a gap between moderates and them)
So I realized Amad’s comment obsession was probably a defense against this dynamic—“I have to say something to my juniors when I see them”.
I think there’s a bit of a trap here where, because Amad is known for always making a comment whenever he ends up next to an employee, if he then doesn’t make a comment next to someone, it feels like a deliberate insult.
That said, I see the same behavior from US tech leadership pretty broadly, so I think the incentive to say something friendly in the elevator is pretty strong to start (norms of equality, first name basis, etc. in tech), and then once you start doing that you have to always do it to avoid insult.
I think the concept of Pausing AI just feels unrealistic at this point:
- Previous AI safety pause efforts (GPT-2 release delay, 2023 Open Letter calling for a 6 month pause) have come to be seen as false alarms and overreactions
- Both industry and government are now strongly committed to an AI arms race
- A lot of the non-AI-Safety opponents of AI want a permanent stop/ban in the fields they care about, not a pause, so it lacks for allies
- It’s not clear that meaningful technical AI safety work on today’s frontier AI models could have been done before they were invented; therefore a lot of technical AI safety researchers believe we still need to push capabilities further before a pause would truly be useful
PauseAI could gain substantial support if there’s a major AI-caused disaster, so it’s good that some people are keeping the torch lit for that possibility, but supporting it now means burning political capital for little reason. We’d get enough credit for “being right all along” just by having pointed out the risks ahead of time, and since we want to influence regulation/industry now, we shouldn’t make Pause demands that get us thrown out of the room. In an ideal world we’d spend more time understanding current models, though.
Copying over a comment (along with its parent) from Chris Olah of Anthropic on Hacker News that I thought was good:
fpgaminer:
> This is powerful evidence that even though models are trained to output one word at a time
I find this oversimplification of LLMs to be frequently poisonous to discussions surrounding them. No user facing LLM today is trained on next token prediction.
olah3:
Hi! I lead interpretability research at Anthropic. I also used to do a lot of basic ML pedagogy (https://colah.github.io/). I think this post and its children have some important questions about modern deep learning and how it relates to our present research, and wanted to take the opportunity to try and clarify a few things.
When people talk about models “just predicting the next word”, this is a popularization of the fact that modern LLMs are “autoregressive” models. This actually has two components: an architectural component (the model generates words one at a time), and a loss component (it maximizes probability).

As the parent says, modern LLMs are finetuned with a different loss function after pretraining. This means that in some strict sense they’re no longer autoregressive models – but they do still generate text one word at a time. I think this really is the heart of the “just predicting the next word” critique.
This brings us to a debate which goes back many, many years: what does it mean to predict the next word? Many researchers, including myself, have believed that if you want to predict the next word really well, you need to do a lot more. (And with this paper, we’re able to see this mechanistically!)
Here’s an example, which we didn’t put in the paper: How does Claude answer “What do you call someone who studies the stars?” with “An astronomer”? In order to predict “An” instead of “A”, you need to know that you’re going to say something that starts with a vowel next. So you’re incentivized to figure out one word ahead, and indeed, Claude realizes it’s going to say astronomer and works backwards. This is a kind of very, very small scale planning – but you can see how even just a pure autoregressive model is incentivized to do it.
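(For readers less familiar with the jargon: here is a toy sketch of the architectural sense of “autoregressive” that Olah distinguishes above, i.e. text emitted one token at a time, each choice conditioned on the prefix so far. The hard-coded bigram table is purely illustrative, and, being bigram-only, it can’t do the one-word-ahead planning he describes.)

```python
# Toy sketch of autoregressive generation: produce one token at a time,
# each conditioned on the tokens generated so far. The "model" is just a
# hard-coded bigram table, purely for illustration.

import random

BIGRAMS = {
    "<s>": ["An"],
    "An": ["astronomer"],
    "astronomer": ["studies"],
    "studies": ["the"],
    "the": ["stars"],
    "stars": ["</s>"],
}

def generate(max_tokens: int = 10) -> str:
    tokens = ["<s>"]
    for _ in range(max_tokens):
        nxt = random.choice(BIGRAMS[tokens[-1]])  # next token given current prefix
        if nxt == "</s>":
            break
        tokens.append(nxt)
    return " ".join(tokens[1:])

print(generate())  # -> "An astronomer studies the stars"
```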
Good objection. I think gene editing would be different because it would feel more unfair and insurmountable. That’s probably not rational—the effect size would have to be huge for it to be bigger than existing differences in access to education and healthcare, which are not fair or really surmountable in most cases—but something about other people getting to make their kids “superior” off the bat, inherently, is more galling to our sensibilities. Or at least mine, but I think most people feel the same way.
Yeah, referring to international sentiments. We’d want to avoid a “chip export controls” scenario, which I think would be tempting.
Re: HCAST tasks, most are being kept private since it’s a benchmark. If you want to learn more, here’s METR’s paper on HCAST.
Thanks for the detailed response!
Re: my meaning, you got it correct here:
Spiritually, genomic liberty is individualistic / localistic; it says that if some individual or group or even state (at a policy level, as a large group of individuals) wants to use germline engineering technology, it is good for them to do so, regardless of whether others are using it. Thus, it justifies unequal access, saying that a world with unequal access is still a good world.
Re: genomic liberty makes narrow claims, yes I agree, but my point is that if implemented it will lead to a world with unequal access for some substantial period of time, and that I expect this to be socially corrosive.
Switching to quoting your post and responding to those quotes:
To be honest, mainly I’ve thought about inequality within single economic and jurisdictional regimes. (I think that objection is more common than the international version.)
Yeah that’s the common variant of the concern but I think it’s less compelling—rich countries will likely be able to afford subsidizing gene editing for their citizens, and will be strongly incentivized to do so even if it’s quite expensive. So my expectation is that the intra-country effects for rich countries won’t be as bad as science fiction has generally predicted, but that the international effects will be.
(and my fear is this would play into general nationalizing trends worldwide that increase competition and make nation-states bitter towards each other, when we want international cooperation on AI)
I am however curious to hear examples of technologies that {snip}
My worry is mostly that the tech won’t spread “soon enough” to avoid socially corrosive effects, less so that it will never spread. As for a tech that never fully spread but should have benefitted everyone, all that comes to mind is nuclear energy.
So maybe developing the tech here binds it up with “all people should have this”.
I think this would happen, but it would be expressed mostly resentfully, not positively.
The ideology should get a separate treatment—genomic liberty but as a positive right—what I’ve been calling genomic emancipation.
Sounds interesting!
This is a thoughtful post, and I appreciate it. I don’t think I disagree with it from a liberty perspective, and agree there are potential huge benefits for humanity here.
However, my honest first reaction is “this reasoning will be used to justify a world in which citizens of rich countries have substantially superior children to citizens of poor countries (as viewed by both groups)”. These days, I’m much more suspicious of policies likely to be socially corrosive: that corrosion leads to bad governance at a time when, because of AI risk, we need excellent governance.
I’m sure you’ve thought about this question, it’s the classic objection. Do you have any idea how to avoid or at least mitigate the inequality adopting genomic liberty would cause? Or do you think it wouldn’t happen at all? Or do you think that it’s simply worth it and natural that any new technology is first adopted by those who can afford it, and that adoption drives down prices and will spread the technology widely soon enough?
Here’s an interesting thread of tweets from one of the paper’s authors, Elizabeth Barnes.
Quoting the key sections:

Extrapolating this suggests that within about 5 years we will have generalist AI systems that can autonomously complete ~any software or research engineering task that a human professional could do in a few days, as well as a non-trivial fraction of multi-year projects, with no human assistance or task-specific adaptations required.
However, (...) It’s unclear how to interpret “time needed for humans”, given that this varies wildly between different people, and is highly sensitive to expertise, existing context and experience with similar tasks. For short tasks especially, it makes a big difference whether “time to get set up and familiarized with the problem” is counted as part of the task or not.
(...)
We’ve tried to operationalize the reference human as: a new hire, contractor or consultant; who has no prior knowledge or experience of this particular task/codebase/research question; but has all the relevant background knowledge, and is familiar with any core frameworks / tools / techniques needed.
This hopefully is predictive of agent performance (given that models have likely memorized most of the relevant background information, but won’t have training data on most individual tasks or projects), whilst maintaining an interpretable meaning (it’s hopefully intuitive what a new hire or contractor can do in 10 mins vs 4hrs vs 1 week).
(...)
Some reasons we might be *underestimating* model capabilities include a subtlety around how we calculate human time. In calculating human baseline time, we only use successful baselines. However, a substantial fraction of baseline attempts result in failure. If we use human success rates to estimate the time horizon of our average baseliner, using the same methodology as for models, this comes out to around 1hr—suggesting that current models will soon surpass human performance. (However, we think that baseliner failure rates are artificially high due to our incentive scheme, so this human horizon number is probably significantly too low)
Other reasons include: For tasks that both can complete, models are almost always much cheaper, and much faster in wall-clock time, than humans. This also means that there’s a lot of headroom to spend more compute at test time if we have ways to productively use it—e.g. BoK

That bit at the end about “time horizon of our average baseliner” is a little confusing to me, but I understand it to mean “if we used the 50% reliability metric on the humans we had do these tasks, our model would say humans can’t reliably perform tasks that take longer than an hour”. Which is a pretty interesting point.
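For what it’s worth, here is a rough sketch of how I understand the 50% time-horizon methodology they’re applying to models (and, in the passage above, hypothetically to human baseliners): fit success/failure against log task length, then solve for the length at which predicted success crosses 50%. The data points below are entirely made up.

```python
# Rough sketch of a "50% time horizon": fit success/failure against log task
# length, then solve for the length where predicted success probability is 50%.
# The (task_length_minutes, succeeded) pairs below are fabricated examples.

import numpy as np
from sklearn.linear_model import LogisticRegression

tasks = [(2, 1), (5, 1), (10, 1), (30, 1), (45, 0), (60, 1),
         (90, 0), (120, 0), (240, 0), (480, 0)]

X = np.log2([t for t, _ in tasks]).reshape(-1, 1)
y = np.array([s for _, s in tasks])

clf = LogisticRegression().fit(X, y)

# P(success) = 0.5 where coef * log2(t) + intercept = 0
horizon_minutes = 2 ** (-clf.intercept_[0] / clf.coef_[0][0])
print(f"50% time horizon ≈ {horizon_minutes:.0f} minutes")
```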
Huh, seems you are correct. They also apparently are heavily cannibalistic, which might be a good impetus for modeling the intentions of other members of your species…