I’m an independent researcher currently working on a sequence of posts about consciousness. You can send me anonymous feedback here: https://www.admonymous.co/rafaelharth. If it’s about a post, you can add [q] or [nq] at the end if you want me to quote or not quote it in the comment section.
Rafael Harth
Again, those are theories of consciousness, not definitions of consciousness.
I would agree that people who use consciousness to denote the computational process vs. the fundamental aspect generally have different theories of consciousness, but they’re also using the term to denote two different things.
(I think this is because consciousness is notably different from other phenomena—e.g., fiber decreasing risk of heart disease—where the phenomenon is relatively uncontroversial and only the theory about how the phenomenon is explained is up for debate. With consciousness, there are a bunch of “problems” about which people debate whether they’re even real problems at all (e.g., binding problem, hard problem). Those kinds of disagreements are likely causally upstream of inconsistent terminology.)
I think the ability to autonomously find novel problems to solve will emerge as reasoning models scale up. It will emerge because it is instrumental to solving difficult problems.
This of course is not a sufficient reason. (Demonstration: telepathy will emerge [as evolution improves organisms] because it is instrumental to navigating social situations.) It being instrumental means that there is an incentive—or to be more precise, a downward slope in the loss function toward areas of model space with that property—which is one required piece, but it also must be feasible. E.g., if the parameter space doesn’t have any elements that are good at this ability, then it doesn’t matter whether there’s a downward slope.
Fwiw I agree with this:
Current LLMs are capable of solving novel problems when the user does most of the work: when the user lays the groundwork and poses the right question for the LLM to answer.
… though like you I think posing the right question is the hard part, so imo this is not very informative.
Instead of “have LLMs generated novel insights”, how about “have LLMs demonstrated the ability to identify which views about a non-formal topic make more or less sense?” This question seems easier to operationalize and I suspect points at a highly related ability.
Fwiw this is the kind of question that has definitely been answered in the training data, so I would not count this as an example of reasoning.
I’m just not sure the central claim, that rationalists underestimate the role of luck in intelligence, is true. I’ve never gotten that impression. At least my assumption going into reading this was already that intelligence was probably 80-90% unearned.
Humans must have gotten this ability from somewhere and it’s unlikely the brain has tons of specialized architecture for it.
This is probably a crux; I think the brain does have tons of specialized architecture for it, and if I didn’t believe that, I probably wouldn’t think thought assessment was as difficult.
The thought generator seems more impressive/fancy/magic-like to me.
Notably people’s intuitions about what is impressive/difficult tend to be inversely correlated with reality. The stereotype is (or at least used to be) that AI will be good at rationality and reasoning but struggle with creativity, humor, and intuition. This stereotype contains information since inverting it makes better-than-chance predictions about what AI has been good at so far, especially LLMs.
I think this is not a coincidence but roughly because people use “degree of conscious access” an inverse proxy for intuitive difficulty. The more unconscious something is, the more it feels like we don’t know how it works, the more difficult it intuitively seems. But I suspect degree of conscious access positively correlates with difficulty.
If sequential reasoning is mostly a single trick, things should get pretty fast now. We’ll see soon? :S
Yes; I think the “single trick” view might be mostly confirmed or falsified in as little as 2-3 years. (If I introspect I’m pretty confident that I’m not wrong here, the scenario that frightens me is more that sequential reasoning improves non-exponentially but quickly, which I think could still mean doom, even if it takes 15 years. Those feel like short timelines to me.)
Whether every interpretation needs a way to connect measurements to conscious experiences, or whether some interpretations need extra machinery?
If we’re being extremely pedantic, then KC is about predicting conscious experience (or sensory input data, if you’re an illusionist; one can debate what the right data type is). But this only matters for discussing things like Boltzmann brains. As soon as you assume that there exists an external universe, you can forget about your personal experience and just try to estimate the length of the program that runs the universe.
So practically speaking, it’s the first one. I think what quantum physics does with observers falls out of the math and doesn’t require any explicit treatment. I don’t think Copenhagen gets penalized for this, either. The wave function collapse increases complexity because it’s an additional rule that changes how the universe operates, not because it has anything to do with observers. (As I mentioned, I think the ‘good’ version of Copenhagen doesn’t mention observers, anyway.)
If you insist on the point that interpretation relates to an observer, then I’d just say that “interpretation of quantum mechanics” is technically a misnomer. It should just be called “theory of quantum mechanics”. Interpretations don’t have KC; theories do. We’re comparing different source codes for the universe.
steelmanning
I think this argument is analogous to giving white credit for this rook check, which is in fact a good move that allows white to win the queen next move—when in actual fact white just didn’t see that the square was protected and blundered a rook. The existence of the queen-winning tactic increases the objective evaluation of the move, but once you know that white didn’t see it, it should not increase your skill estimate of white. You should judge the move as if the tactic didn’t exist.
Similarly, the existence of a way to salvage the argument might make the argument better in the abstract, but should not influence your assessment of DeepSeek’s intelligence, provided we agree that DeepSeek didn’t know it existed. In general, you should never give someone credit for areas of the chess/argument tree that they didn’t search.
The reason we can expect Copenhagen-y interpretations to be simpler than other interpretations is because every other interpretation also needs a function to connect measurements to conscious experiences, but usually requires some extra machinery in addition to that.
I don’t believe this is correct. But I separately think that it being correct would not make DeepSeek’s answer any better. Because that’s not what it said, at all. A bad argument does not improve because there exists a different argument that shares the same conclusion.
Here’s my take; not a physicist.
So in general, what DeepSeek says here might align better with intuitive complexity, but the point of asking about Kolmogorov Complexity rather than just Occam’s Razor is that we’re specifically trying to look at formal description length and not intuitive complexity.
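To make the “formal description length” framing slightly more tangible, here’s a toy sketch in Python. It uses compressed size as a crude, computable stand-in for description length (real Kolmogorov complexity is uncomputable, and zlib is my illustrative proxy, not anything from the discussion above):

```python
import random
import zlib

# Compressed size is an upper bound (up to a constant) on description
# length; actual Kolmogorov complexity is uncomputable.
regular = b"ab" * 500  # 1000 bytes with an obvious short description

rng = random.Random(0)  # fixed seed so the example is reproducible
irregular = bytes(rng.randrange(256) for _ in range(1000))  # 1000 incompressible-looking bytes

len_regular = len(zlib.compress(regular))
len_irregular = len(zlib.compress(irregular))

# Both strings are 1000 bytes "on the surface", but the regular one has
# a far shorter formal description -- the distinction between intuitive
# complexity and formal description length that the question is after.
```

The point of the toy: two objects can look equally complicated at the surface level while their formal descriptions differ by orders of magnitude, which is exactly the gap between intuitive complexity and KC.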
Many Worlds does not need extra complexity to explain the branching. The branching happens due to the part of the math that all theories agree on. (In fact, I think a more accurate statement is that the branching is a description of what the math does.)
Then there’s the wavefunction collapse. So first of all, wavefunction collapse is an additional postulate not contained in the remaining math, so it adds complexity. (… and, contra DeepSeek’s claim, the lack of the additional postulate does not add complexity.) And then there’s a separate issue with KC arguably being unable to model randomness at all. You could argue that this is a failure of the KC formalism and we need KC + randomness oracle to even answer the question. You could also be hardcore about it and argue that any nondeterministic theory is impossible to describe and therefore has infinite KC. In either case, the issue of randomness is something you should probably bring up in response to the question.
And finally there’s the observer role. Iiuc the less stupid versions of Copenhagen do not give a special role to an observer; there’s a special role for something being causally entangled with the experiment’s result, but it doesn’t have to be an agent. This also isn’t really a separate principle from the wave function collapse, I don’t think; it’s what triggers the collapse. And then it doesn’t make any sense to list it as a strength of Copenhagen, because if anything it increases description length.
There are variants of KC that penalize the amount of stuff that is created rather than just the description length, I believe, in which case MW would have very high KC. This is another thing DeepSeek could have brought up.
[...] I personally wouldn’t use the word ‘sequential’ for that—I prefer a more vertical metaphor like ‘things building upon other things’—but that’s a matter of taste I guess. Anyway, whatever we want to call it, humans can reliably do a great many steps, although that process unfolds over a long period of time.
…And not just smart humans. Just getting around in the world, using tools, etc., requires giant towers of concepts relying on other previously-learned concepts.
As a clarification for anyone wondering why I didn’t use a framing more like this in the post, it’s because I think these types of reasoning (horizontal and vertical/A and C) are related in an important way, even though I agree that C might be qualitatively harder than A (hence §3.1). Or to put it differently, if one extreme position is “we can look entirely at A to extrapolate LLM performance into the future” and the other is “A and C are so different that progress on A is basically uninteresting”, then my view is somewhere near the middle.
It’s not clear to me that a human, using their brain and a go board for reasoning, could beat AlphaZero even if you give them infinite time.
I agree but I dispute that this example is relevant. I don’t think there is any step in between “start walking on two legs” and “build a spaceship” that requires as much strictly-type-A reasoning as beating AlphaZero at go or chess. This particular kind of capability class doesn’t seem to me to be very relevant.
Also, to the extent that it is relevant, a smart human with infinite time could outperform AlphaZero by programming a better chess/go computer. Which may sound silly but I actually think it’s a perfectly reasonable reply—using narrow AI to assist in brute-force cognitive tasks is something humans are allowed to do. And it’s something that LLMs are also allowed to do; if they reach superhuman performance on general reasoning, and part of how they do this is by writing python scripts for modular subproblems, then we wouldn’t say that this doesn’t count.
I do think the human brain uses two very different algorithms/architectures for thought generation and assessment. But this falls within the “things I’m not trying to justify in this post” category. I think if you reject the conclusion based on this, that’s completely fair. (I acknowledged in the post that the central claim has a shaky foundation. I think the model should get some points because it does a good job retroactively predicting LLM performance—like, why LLMs aren’t already superhuman—but probably not enough points to convince anyone.)
I don’t think a doubling every 4 or 6 months is plausible. I don’t think a doubling on any fixed time is plausible because I don’t think overall progress will be exponential. I think you could have exponential progress on thought generation, but this won’t yield exponential progress on performance. That’s what I was trying to get at with this paragraph:
My hot take is that the graphics I opened the post with were basically correct in modeling thought generation. Perhaps you could argue that progress wasn’t quite as fast as the most extreme versions predicted, but LLMs did go from subhuman to superhuman thought generation in a few years, so that’s pretty fast. But intelligence isn’t a singular capability; it’s
a phenomenon better modeled as two capabilities, and increasing just one of them happens to have sub-linear returns on overall performance. So far (as measured by the 7-card puzzle, which I think is a fair data point) I think we went from ‘no sequential reasoning whatsoever’ to ‘attempted sequential reasoning but basically failed’ (Jun 13 update) to now being able to do genuine sequential reasoning for the first time. And if you look at how DeepSeek does it, to me this looks like the kind of thing where I expect difficulty to grow exponentially with argument length. (Based on stuff like it constantly having to go back and double-check even when it got something right.)
What I’d expect from this is not a doubling every N months, but perhaps an ability to reliably do one more step every N months. I think this translates into above-constant returns on the “horizon length” scale—because I think humans need more than 2x time for 2x steps—but not exponential returns.
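As a toy illustration of that claim (the specific super-linear exponent of 1.5 for human time as a function of step count is an assumption I’m making for illustration, not a measured value):

```python
# Toy model: the model gains one more reliable reasoning step every N
# months (linear growth in steps), while the human time needed for k
# steps grows super-linearly, here k**1.5 (an illustrative assumption).

def human_horizon(steps: int, exponent: float = 1.5) -> float:
    """Human-equivalent time (arbitrary units) to do `steps` reliable steps."""
    return steps ** exponent

horizons = [human_horizon(k) for k in range(1, 11)]

# Successive gains in horizon length grow over time: above-constant returns.
gains = [b - a for a, b in zip(horizons, horizons[1:])]

# But the ratio between successive horizons shrinks toward 1, so the
# growth is not exponential (there is no fixed doubling time).
ratios = [b / a for a, b in zip(horizons, horizons[1:])]
```

Under any super-linear (but polynomial) human-time curve, linear growth in step count gives growing absolute gains on the horizon scale without a fixed doubling time, which is the shape I’m gesturing at.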
This is true but I don’t think it really matters for eventual performance. If someone thinks about a problem for a month, the number of times they went wrong on reasoning steps during the process barely influences the eventual output. Maybe they take a little longer. But essentially performance is relatively insensitive to errors if the error-correcting mechanism is reliable.
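A back-of-the-envelope sketch of why reliable error correction makes performance insensitive to the per-step error rate (assuming every error is caught and the step is simply redone, which is an idealization):

```python
# Toy model: each reasoning step fails with probability p_error, but a
# reliable checker always catches the failure and the step is redone.
# The final answer is then always correct; errors only cost time.

def expected_time(steps: int, p_error: float) -> float:
    # Expected attempts per step follow a geometric distribution: 1/(1-p).
    return steps / (1.0 - p_error)

t_low = expected_time(1000, 0.10)   # 10% per-step error rate
t_high = expected_time(1000, 0.20)  # doubled per-step error rate

# Doubling the per-step error rate inflates total time by only ~12.5%
# and changes correctness not at all.
```

In this idealization, correctness is independent of the error rate and the time cost scales only as 1/(1-p), which is why the number of wrong steps along the way barely shows up in the eventual output.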
I think this is actually a reason why most benchmarks are misleading (humans make mistakes there, and they influence the rating).
If thought assessment is as hard as thought generation and you need a thought assessor to get AGI (two non-obvious conditionals), then how do you estimate the time to develop a thought assessor? From which point on do you start to measure the amount of time it took to come up with the transformer architecture?
The snappy answer would be “1956 because that’s when AI started; it took 61 years to invent the transformer architecture that led to thought generation, so the equivalent insight for thought assessment will take about 61 years”. I don’t think that’s the correct answer, but neither is “2019 because that’s when AI first kinda resembled AGI”.
I generally think that [autonomous actions due to misalignment] and [human misuse] are distinct categories with pretty different properties. The part you quoted addresses the former (as does most of the post). I agree that there are scenarios where the second is feasible and the first isn’t. I think you could sort of argue that this falls under AIs enhancing human intelligence.
So, I agree that there has been substantial progress in the past year, hence the post title. But I think if you naively extrapolate that rate of progress, you get around 15 years.
The problem with the three examples you’ve mentioned is again that they’re all comparing human cognitive work across a short amount of time with AI performance. I think the relevant scale doesn’t go from 5th grade performance over 8th grade performance to university-level performance or whatever, but from “what a smart human can do in 5 minutes” over “what a smart human can do in an hour” over “what a smart human can do in a day”, and so on.
I don’t know if there is an existing benchmark that measures anything like this. (I agree that more concrete examples would improve the post, fwiw.)
And then a separate problem is that math problems are in the easiest category from §3.1 (as are essentially all benchmarks).
≤10-year Timelines Remain Unlikely Despite DeepSeek and o3
I don’t think the experience of no-self contradicts any of the above.
In general, I think you could probably make some factual statements about the nature of consciousness that are true and that you learn from attaining no-self, if you phrased it very carefully, but I don’t think that’s the point.
The way I’d phrase what happens would be mostly in terms of attachment. You don’t feel as implicated by things that affect you anymore, you have less anxiety, that kind of thing. I think a really good analogy is just that regular consciousness starts to resemble consciousness during a flow state.
Can confirm that I’m one of these people (and yes, I worry a lot about this clouding my judgment).