Research coordinator of Stop/Pause area at AI Safety Camp.
See explainer on why AGI could not be controlled enough to stay safe:
lesswrong.com/posts/xp6n2MG5vQkPpFEBH/the-control-problem-unsolved-or-unsolvable
Update: back up to 60% chance.
I overreacted, IMO, with the earlier update down to 40% (and undercompensated when updating down to 80%, which I soon after thought should have been 70%).
OpenAI, the leader in terms of large model revenue, has basically failed to build something worth calling GPT-5, and Microsoft is now developing more models in-house to compete with it. If OpenAI fails in its effort to combine its existing models into something new and special (likely), that’s a blow to the perception of the industry.
A recession might also be coming this year, or at least in the next four years, which I made a prediction about before.
Update: back up to 50% chance.
Noting Microsoft’s cancellation of data center deals, and the fact that the ‘AGI’ labs are still losing cash and, with DeepSeek, are increasingly competing on a commodity product.
Update: 40% chance.
I very much underestimated/missed the speed at which tech leaders would influence the US government through the Trump election/presidency. I got caught flat-footed by this.
I still think it’s not unlikely for there to be an AI crash as described above within the next 4 years and 8 months, but it could be from levels of investment much higher than where we are now. A “large reduction in investment” at that level looks a lot different than a large reduction in investment from the level that markets were at 4 months ago.
Of the recent wave of AI companies, the earliest one, DeepMind, relied on the Rationalists for its early funding. The first investor, Peter Thiel, was a donor to Eliezer Yudkowsky’s Singularity Institute for Artificial Intelligence (SIAI, but now MIRI, the Machine Intelligence Research Institute) who met DeepMind’s founder at an SIAI event. Jaan Tallinn, the most important Rationalist donor, was also a critical early investor…
…In 2017, the Open Philanthropy Project directed $30 million to OpenAI…
Good overview of how AI Safety funders ended up supporting AGI labs.
Curious to read more people’s views of what this led to. See question here: https://www.lesswrong.com/posts/wWMxCs4LFzE4jXXqQ/what-did-ai-safety-s-specific-funding-of-agi-r-and-d-labs
It’s because you keep making incomprehensible arguments that don’t make any sense
Good to know that this is why you think AI Safety Camp is not worth funding.
Once a core part of the AGI non-safety argument is put into maths to be comprehensible for people in your circle, it’d be interesting to see how you respond then.
I kinda appreciate you being honest here.
Your response is also emblematic of what I find concerning here, which is that you are not offering a clear argument for why something does not make sense to you before writing ‘crank’.
Writing that you do not find something convincing is not an argument – it’s a statement of conviction, which could as easily reflect a poor understanding of the argument or a failure to question one’s own premises. Because it’s not transparent about one’s thinking, but still comes across as if there must be legit thinking underneath, this can be used as a deflection tactic (I don’t think you are using it that way, but others who did not engage much ended the discussion on that note). Frankly, I can’t convince someone if they’re not open to the possibility of being convinced.
I explained above why I consider your opinion flawed – the opinion that ASI would be so powerful that it could cancel out all of evolution across its constituent components (or at least anything that could, through some pathway, build up to lethality).
I similarly found Quintin’s counter-arguments (e.g. hinging on modelling AGI as trackable internal agents) to be premised on assumptions that, considered comprehensively, look very shaky.
I relate to why discussing this feels draining for you. But it does not justify you writing ‘crank’ when you have not had the time to examine the actual argumentation (note: you introduced the word ‘crank’ in this thread; Oliver wrote something else).
Overall, this is bad for community epistemics. It’s better if you can write what you thought was unsound about my thinking, and I can write what I found unsound about yours. Barring that exchange, some humility that you might be missing stuff is well-placed.
Besides this point, the respect is mutual.
Lucius, the text exchanges I remember us having during AISC6 were about the question of whether ‘ASI’ could comprehensively control for the evolutionary pressures it would be subjected to. You and I were commenting on a GDoc with Forrest. I was taking your counterarguments against his arguments seriously – continuing to investigate those counterarguments after you had bowed out.
You held the notion that ASI would be so powerful that it could control for any of its downstream effects that evolution could select for. This is a common opinion held in the community. But I’ve looked into this opinion and people’s justifications for it enough to consider it an unsound opinion.[1]
I respect you as a thinker, and generally think you’re a nice person. It’s disappointing that you wrote me off as a crank in one sentence. I expect more care, including that you also question your own assumptions.
A shortcut way of thinking about this:
The more you increase ‘intelligence’ (as a capacity for transforming patterns in data), the more you have to increase the number of underlying information-processing components. But the corresponding increase in the degrees of freedom that those components have, in their interactions with each other and with their larger surroundings, grows even faster.
This results in a strict inequality between:
1. the space of possible downstream effects that evolution can select across; and
2. the subspace of effects that the ‘ASI’ (or any control system connected with/in the ASI) could detect, model, simulate, evaluate, and correct for.
The hashiness model is a toy model for demonstrating this inequality (incl. how the mismatch between 1. and 2. grows over time). Anders Sandberg and two mathematicians are working on formalising that model at AISC.
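To give a crude sense of the shape of this inequality, here is a toy sketch in code. It is not the hashiness model, and the growth rates in it are made-up placeholders of mine; it only illustrates how a combinatorially growing interaction space outpaces a polynomially scaling detection capacity:

```python
import math

# Toy sketch only (not the hashiness model; growth rates are arbitrary choices):
# compare a combinatorially growing space of possible component interactions
# against a hypothetical detection/correction capacity that scales polynomially.

def log2_interaction_space(n: int) -> float:
    """log2 of the number of possible interaction patterns, assuming each
    unordered pair of the n components can either interact or not:
    2^(n choose 2) patterns in total."""
    return math.comb(n, 2)

def log2_detection_capacity(n: int) -> float:
    """log2 of the number of effect-states the control system can track,
    assuming (arbitrarily) that this capacity scales as n^3."""
    return 3 * math.log2(n)

for n in (10, 100, 1_000, 10_000):
    gap = log2_interaction_space(n) - log2_detection_capacity(n)
    print(f"n={n:>6}: interaction space / detection capacity ≈ 2^{gap:,.0f}")
```

Under these made-up rates, the gap only widens as the number of components grows. The substantive work is in defending which growth rates actually apply, which is what the formalisation at AISC is about.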
There’s more that can be discussed in terms of why and how this fully autonomous machinery is subjected to evolutionary pressures. But that’s a longer discussion, and often the researchers I talked with lacked the bandwidth.
I agree that Remmelt seems kind of like he has gone off the deep end
Could you be specific here?
You are sharing a negative impression (“gone off the deep end”), but not what it is based on. This puts me and others in a position of not knowing whether you are e.g. reacting with a quick broad strokes impression, and/or pointing to specific instances of dialogue that I handled poorly and could improve on, and/or revealing a fundamental disagreement between us.
For example, is it because on Twitter I spoke up against generative AI models that harm communities, and this seems somehow strategically bad? Do you not like the intensity of my messaging? Or do you intuitively disagree with my arguments about AGI being insufficiently controllable?
As is, this is dissatisfying. On this forum, I’d hope[1] there is a willingness to discuss differences in views first, before moving to broadcasting subjective judgements[2] about someone.
Even though that would be my hope, it’s no longer my expectation. There’s an unhealthy dynamic on this forum, where 3+ times I noticed people moving to sideline someone with unpopular ideas, without much care.
To give a clear example, someone else listed vaguely dismissive claims about research I support. Their comment lacked factual grounding but still got upvotes. When I replied to point out things they were missing, my reply got downvoted into the negative.
I guess this is a normal social response on most forums. It is naive of me to hope that on LessWrong it would be different.
This particularly needs to be done with care if the judgement is given by someone seen as having authority (because others will take it at face value), and if the judgement is guarding default notions held in the community (because that supports an ideological filter bubble).
For example, it might be the case that, for some reason, alignment would have been solved if and only if Abraham Lincoln hadn’t been assassinated in 1865. That means that humans in 2024 in our world (where Lincoln was assassinated in 1865) will not be able to solve alignment, despite it being solvable in principle.
With this example, you might still assert that “possible worlds” are world states reachable through physics from past states of the world. Ie. you could still assert that alignment possibility is path-dependent from historical world states.
But you seem to mean something broader with “possible worlds”. Something like “in theory, there is a physically possible arrangement of atoms/energy states that would result in an ‘aligned’ AGI, even if that arrangement of states might not be reachable from our current or even a past world”.
–> Am I interpreting you correctly?
Alignment is a broad word, and I don’t really have the authority to interpret stranger’s words in a specific way without accidentally misrepresenting them.
Your saying this shows the ambiguity here in trying to understand what different people mean. One researcher can make a technical claim about the possibility/tractability of “alignment” that is similarly worded to a technical claim others made. Yet their meaning of “alignment” could be quite different.
It’s hard then to have a well-argued discussion, because you don’t know whether people are equivocating (ie. switching between different meanings of the term).
one article managed to find six distinct interpretations of the word:
That’s a good summary list! I like the inclusion of “long-term outcomes” in P6. In contrast, P4 could just entail short-term problems that were specified by a designer or user who did not give much thought to long-term repercussions.
The way I deal with the wildly varying uses of the term “alignment” is to use a minimum definition that most of those six interpretations are consistent with, and where (almost) everyone would agree that an AGI not meeting that definition would be clearly unaligned.
Alignment is at the minimum the control of the AGI’s components (as modified over time) to not (with probability above some guaranteeable high floor) propagate effects that cause the extinction of humans.
Thanks!
By ‘possible worlds’, do you mean ‘possible to be reached from our current world state’?
And what do you mean by ‘alignment’? I know that can sound like an unnecessary question. But if it’s not specified, how can people soundly assess whether it is technically solvable?
Thanks, when you say “in the space of possible mathematical things”, do you mean “hypothetically possible in physics” or “possible in the physical world we live in”?
Here’s how I specify terms in the claim:
AGI is a set of artificial components, connected physically and/or by information signals over time, that in aggregate sense and act autonomously over many domains.
‘artificial’ as configured out of a (hard) substrate that can be standardised to process inputs into outputs consistently (vs. what our organic parts can do).
‘autonomously’ as continuing to operate without needing humans (or any other species that share a common ancestor with humans).
Alignment is at the minimum the control of the AGI’s components (as modified over time) to not (with probability above some guaranteeable high floor) propagate effects that cause the extinction of humans.
Control is the implementation of (a) feedback loop(s) through which the AGI’s effects are detected, modelled, simulated, compared to a reference, and corrected.
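As a minimal sketch of what this control definition amounts to operationally (the function names are hypothetical placeholders I’m introducing for illustration, not a claim about how any actual control system is or could be implemented):

```python
from typing import Any, Callable

def control_loop(
    detect: Callable[[], Any],             # sense the AGI's effects in the world
    model: Callable[[Any], Any],           # build an internal representation of those effects
    simulate: Callable[[Any], Any],        # project the represented effects forward
    compare: Callable[[Any, Any], float],  # measure deviation from a reference
    correct: Callable[[float], None],      # apply a corrective intervention
    reference: Any,
    tolerance: float,
) -> None:
    """A feedback loop in the sense of the definition above: detected effects
    are modelled, simulated, compared to a reference, and corrected whenever
    the deviation exceeds the tolerance."""
    while True:
        effects = detect()
        represented = model(effects)
        projected = simulate(represented)
        deviation = compare(projected, reference)
        if deviation > tolerance:
            correct(deviation)
```

The claim of insufficient controllability is then, roughly, that for fully autonomous machinery no such loop (or set of loops) can detect, model, simulate, evaluate, and correct for enough of the effects that matter.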
Good to know. I also quoted your more detailed remark on AI Standards Lab at the top of this post.
I have made so many connections that have been instrumental to my research.
I didn’t know this yet, and glad to hear! Thank you for the kind words, Nell.
Fair question. You can assume it is AoE.
Research leads are not going to be too picky in terms of what hour you send the application in, so there is no need to worry about the exact deadline. Even if you send in your application on the next day, that probably won’t significantly impact your chances of getting picked up by your desired project(s).
Sooner is better, since many research leads will begin composing their teams after the 17th, but there is no hard cut-off point.
Update: back up to 70% chance.
I just spent two hours compiling different contributing factors. Now that I have weighed those factors up more comprehensively, I don’t expect to change my prediction by more than ten percentage points over the coming months. Though I’ll write here if I do.
My prediction: 70% chance that by August 2029 there will be a large reduction in investment in AI and a corresponding market crash in AI company stocks, etc., and that both will continue for at least three months.
For:
Large model labs losing money
OpenAI made a loss of ~$5 billion last year.
It takes most of the consumer and enterprise revenue, but that still amounts to only $3.7 billion.
The GPT-4.5 model is the result of 18 months of R&D, but offers only a marginal improvement in output quality while being even more compute-intensive.
If OpenAI, as the supposed industry leader, publicly fails, this could undermine the investment narrative of AI as a rapidly improving and profitable technology, and trigger a market meltdown.
Commoditisation
Other models, by Meta etc., are around as useful for consumers.
DeepSeek undercuts US-designed models with compute-efficient open-weights alternative.
Data center overinvestment
Microsoft cut at least 14% of planned data center expansion.
Subdued commercial investment interest.
Some investment-firm analysts are skeptical, and the second-largest VC firm, Sequoia Capital, also made a case that returns are lacking for the scale of investment ($600+ billion).
SoftBank, the main other backer of the Stargate data center expansion project, needs to take on debt to raise ~$18 billion. OpenAI also needs to raise more investment funds next round to cover ~$18 billion, with questions about whether there is enough investor interest.
Uncertainty about US government funding
Mismatch between US Defense interest and what large model labs are currently developing.
Model ‘hallucinations’ get in the way of deployment of LLMs on the battlefield, given reliability requirements.
On the other hand, this hasn’t prevented partnerships and attempts to deploy models.
Interest in data analysis of integrated data streams (e.g. by Palantir) and in self-navigating drone systems (e.g. by Anduril).
The Russo-Ukrainian war and the Gaza invasion have been testbeds, but we are seeing relatively rudimentary and straightforward AI models being used there (Ukrainian drones are still mostly remotely operated by humans, and Israel used an LLM for shoddy target identification).
No clear sign that US administration is planning to subsidise large model development.
Stargate deal announced by Trump did not involve government chipping in money.
Likelihood of a (largish) US economic recession by 2029.
Debt/misinvestment overload after a long period of low interest rates.
Early signs, but nothing definitive:
Inflation
Reduced consumer demand
Business uncertainty amidst changing tariffs.
Generative AI subscriptions seem to be a luxury expense for most people rather than essential for completing work (particularly because ~free alternatives exist to switch to, and for most users those aren’t significantly different in use). Enterprises and consumers could cut back heavily on their subscriptions once facing a recession.
Early signs of a large progressive organising front, hindering tech-conservative allyships.
#TeslaTakedown.
Various conversations by organisers with a renewed motivation to be strategic.
Last few years’ resurgence of ‘organising for power’ union efforts, overturning top-down mobilising and advocacy approaches.
Increasing awareness of fuck-ups in the efficiency drives by the Trump-Musk administration coalition.
Against:
Current US administration’s strong public stance on maintaining America’s edge around AI.
Public announcements.
JD Vance’s speech at the renamed AI Action Summit.
Clearing out regulation
Scrapped Biden AI executive order.
Copyright
Talks, as in the UK and EU, about effectively scrapping copyright for AI training materials (with opt-out laws, or by scrapping opt-out too).
Stopping enforcement of regulation
Removal of Lina Khan as head of the FTC, which was investigating AI companies.
Musk’s internal dismantling of departments engaged in oversight.
Internal deployment of AI models for (questionable) uses.
US IRS announcement.
DOGE attempts at using AI to automate the evaluation and work of bureaucrats.
The accelerationist lobby’s influence has been increasing.
Musk, Zuckerberg, Andreessen, other network state folks, etc., have been very strategic in:
funding and advising politicians,
establishing coalitions with people on the right (incl. Christian conservatives, and channeling populist backlashes against globalism and militant wokeness),
establishing social media platforms for amplifying their views (X, network of popular independent podcasts like Joe Rogan show).
Simultaneous gutting of traditional media.
Faltering anti-AI lawsuits
Signs of corruption of plaintiff lawyers,
e.g. in the case against Meta, where crucial arguments were not made, and the judge considered not allowing class representation.
Defense contracts
The US military has a budget in the trillions of dollars, and could in principle keep the US AI corporations propped up.
Possibility that something changes geopolitically (a war threat?) resulting in a large injection of funds.
I guess the Pentagon is already treating AGI labs such as OpenAI and Anthropic as strategic assets (to control, and possibly prop up if their existence is threatened).
Currently seeing cross-company partnerships.
OpenAI with Anduril, Anthropic with Palantir.
National agenda pushes to compete in various countries.
Incl. China, UK, EU.
Recent increased promotion/justification in and around US political circles of the need to compete with China.
New capability development
Given the scale of AI research happening now, it is quite possible that some team will develop a new cross-domain-optimising model architecture that is data- and compute-efficient.
As researchers come to acknowledge the failure of the ‘scaling laws’-focussed approach using existing transformer architectures (given limited online-available data, and reduced marginal returns on compute), they will naturally look for alternative architecture designs to work on.