FWIW, that’s not a crux for me. I can totally see METR’s agency-horizon trend continuing, such that 21 months later, the SOTA model beats METR’s 8-hour tests. What I expect is that this won’t transfer to real-world performance: you wouldn’t be able to plop that model into a software engineer’s chair, prompt it with the information in the engineer’s workstation, and get one workday’s worth of output from it.
At least, not reliably and not in the general-coding setting. It’s possible this sort of performance would be achieved in some narrow domains, and that it would happen once in a while on any given task. (Indeed, I think that’s already the case?) And I do expect nonzero extension of general-purpose real-world agency horizons. But what I expect is slower growth, with real-world performance increasingly lagging behind performance on the agency-horizon benchmark.
Indeed, and maintaining this release schedule is a bit impressive. Though note that “a model called o4 is released” and “the pace of progress from o1 to o3 is maintained” are slightly different claims. Hopefully the release is combined with a proper report on o4 (not just o4-mini), so we get actual data on how well RL-on-CoTs scales.