Nick_Tarleton comments on Views on when AGI comes and on strategy to reduce existential risk

Nick_Tarleton Feb 2, 2025, 11:07 PM
LW: 6 AF: 3
It seems right to me that “fixed, partial concepts with fixed, partial understanding” that are “mostly ‘in the data’” likely block LLMs from being AGI in the sense of this post. (I’m somewhat confused / surprised that people don’t talk about this more — I don’t know whether to interpret that as not noticing it, or having a different ontology, or noticing it but disagreeing that it’s a blocker, or thinking that it’ll be easy to overcome, or what. I’m curious if you have a sense from talking to people.)
These also seem right:
- “LLMs have a weird, non-human shaped set of capabilities”
- “There is a broken inference”
- “we should also update that this behavior surprisingly turns out to not require as much general intelligence as we thought”
- “LLMs do not behave with respect to X like a person who understands X, for many X”
(though I feel confused about how to update on the conjunction of those, and the things LLMs are good at — all the ways they don’t behave like a person who doesn’t understand X, either, for many X.)
But: you seem to have a relatively strong prior^[1] on how hard it is to get from current techniques to AGI, and I’m not sure where you’re getting that prior from. I’m not saying I have a strong inside view in the other direction, but, like, just for instance — it’s really not apparent to me that there isn’t a clever continuous-training architecture, requiring relatively little new conceptual progress, that’s sufficient; if that’s less sample-efficient than what humans are doing, it’s not apparent to me that it can’t still accomplish the same things humans do, with a feasible amount of brute force. And it seems like that is apparent to you.
Or, looked at from a different angle: to my gut, it seems bizarre if whatever conceptual progress is required takes multiple decades, in the world I expect to see with no more conceptual progress, where probably:
- AI is transformative enough to motivate a whole lot of sustained attention on overcoming its remaining limitations
- AI that’s narrowly superhuman on some range of math & software tasks can accelerate research
1.
  It’s hard for me to tell how strong: “—though not super strongly” is hard for me to square with your butt-numbers, even taking into account that you disclaim them as butt-numbers.
- TsviBT Feb 3, 2025, 2:40 AM
  LW: 7 AF: 4
  > I’m curious if you have a sense from talking to people.
  
  More recently I’ve mostly disengaged (except for making kinda-shrill LW comments). Some people say that “concepts” aren’t a thing, or similar, e.g. by recentering on performable tasks, or by pointing to benchmarks going up and saying that the coarser category of “all benchmarks” or similar is good enough for predictions. (See e.g. Kokotajlo’s comment here https://www.lesswrong.com/posts/oC4wv4nTrs2yrP5hz/what-are-the-strongest-arguments-for-very-short-timelines?commentId=QxD5DbH6fab9dpSrg, though his actual position is of course more complex and nuanced.) Some people say that the training process is already concept-gain-complete. Some people say that future research, such as “curiosity” in RL, will solve it. Some people say that the “convex hull” of existing concepts is already enough to set off FURSI (fast unbounded recursive self-improvement).
  
  > (though I feel confused about how to update on the conjunction of those, and the things LLMs are good at — all the ways they don’t behave like a person who doesn’t understand X, either, for many X.)
  
  True; I think I’ve heard various people discussing how to think more precisely about the class of LLM capabilities, but maybe there should be more.
  
  > if that’s less sample-efficient than what humans are doing, it’s not apparent to me that it can’t still accomplish the same things humans do, with a feasible amount of brute force
  
  It’s often awkward discussing these things, because there’s sort of a “seeing double” that happens. In this case, the “double” is:
  
  “AI can’t FURSI because it has poor sample efficiency...
  1. ...and therefore it would take k orders of magnitude more data / compute than a human to do AI research.”
  2. ...and therefore more generally we’ve not actually gotten that much evidence that the AI has the algorithms which would have caused both good sample efficiency and also the ability to create novel insights / skills / etc.”
  The same goes mutatis mutandis for “can make novel concepts”.
  
  I’m more saying 2. rather than 1. (Of course, this would be a very silly thing for me to say if we observed the gippities creating lots of genuine novel useful insights, but with low sample complexity (whatever that should mean here). But I would legit be very surprised if we soon saw a thing that had been trained on 1000x less human data, and performs at modern levels on language tasks (allowing it to have breadth of knowledge that can be comfortably fit in the training set).)
  
  > can’t still accomplish the same things humans do
  
  Well, I would not be surprised if it can accomplish a lot of the things. It already can of course. I would be surprised if there weren’t some millions of jobs lost in the next 10 years from AI (broadly, including manufacturing, driving, etc.). In general, there’s a spectrum/space of contexts / tasks, where on the one hand you have tasks that are short, clear-feedback, and common / stereotyped, and not that hard; on the other hand you have tasks that are long, unclear-feedback, uncommon / heterogeneous, and hard. The way humans do things is that we practice the short ones in some pattern to build up for the variety of long ones. I expect there to be a frontier of AIs crawling from short to long ones. I think at any given time, pumping in a bunch of brute force can expand your frontier a little bit, but not much, and it doesn’t help that much with more permanently ratcheting out the frontier.
  
  > AI that’s narrowly superhuman on some range of math & software tasks can accelerate research
  
  As you’re familiar with, if you have a computer program that has 3 resource bottlenecks A (50%), B (25%), and C (25%), and you optimize the fuck out of A down to ~1%, you ~double your overall efficiency; but then if you optimize the fuck out of A again down to .1%, you’ve basically done nothing. The question to me isn’t “does AI help a significant amount with some aspects of AI research”, but rather “does AI help a significant and unboundedly growing amount with all aspects of AI research, including the long-type tasks such as coming up with really new ideas”.
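
  (A minimal sketch of the bottleneck arithmetic above, assuming the percentages are shares of the original total runtime and that the three stages run serially, Amdahl-style. The `speedup` helper and the dict layout are hypothetical, just for illustration.)

  ```python
  def speedup(before, after):
      """Overall speedup when per-bottleneck costs change from `before` to `after`.

      Costs are shares of the original total runtime (assumption: stages are
      serial, so total cost is just the sum of the shares).
      """
      return sum(before.values()) / sum(after.values())


  # Shares from the example: A = 50%, B = 25%, C = 25% of the original runtime.
  baseline = {"A": 50.0, "B": 25.0, "C": 25.0}
  first_pass = {**baseline, "A": 1.0}    # A cut from 50% down to ~1%
  second_pass = {**baseline, "A": 0.1}   # A cut again, down to 0.1%

  print(round(speedup(baseline, first_pass), 2))     # 100 / 51  -> 1.96
  print(round(speedup(first_pass, second_pass), 2))  # 51 / 50.1 -> 1.02
  ```

  Under these assumptions, even driving A all the way to zero caps the total gain at 100 / 50 = 2x, since B and C are untouched.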
  
  > AI is transformative enough to motivate a whole lot of sustained attention on overcoming its remaining limitations
  
  This certainly makes me worried in general, and it’s part of why my timelines aren’t even longer; I unfortunately don’t expect a large “naturally-occurring” AI winter.
  
  > seems bizarre if whatever conceptual progress is required takes multiple decades
  
  Unfortunately I haven’t addressed your main point well yet… Quick comments:
  - Strong minds are the most structurally rich things ever. That doesn’t mean they have high algorithmic complexity; obviously brains are less algorithmically complex than entire organisms, and the relevant aspects of brains are presumably considerably simpler than actual brains. But still, IDK, it just seems weird to me to expect to make such an object “by default” or something? Craig Venter made a quasi-synthetic lifeform—but how long would it take us to make a minimum viable unbounded invasive organic replicator actually from scratch, like without copying DNA sequences from existing lifeforms?
  - I think my timelines would have been considered normalish among X-risk people 15 years ago? And would have been considered shockingly short by most AI people.
  - I think most of the difference is in how we’re updating, rather than on priors? IDK.