Views on when AGI comes and on strategy to reduce existential risk
Summary: AGI isn’t super likely to come super soon. People should be working on stuff that saves humanity in worlds where AGI comes in 20 or 50 years, in addition to stuff that saves humanity in worlds where AGI comes in the next 10 years.
Thanks to Alexander Gietelink Oldenziel, Abram Demski, Daniel Kokotajlo, Cleo Nardo, Alex Zhu, and Sam Eisenstat for related conversations.
My views on when AGI comes
AGI
By “AGI” I mean the thing that has very large effects on the world (e.g., it kills everyone) via the same sort of route that humanity has large effects on the world. The route is where you figure out how to figure stuff out, and you figure a lot of stuff out using your figure-outers, and then the stuff you figured out says how to make powerful artifacts that move many atoms into very specific arrangements.
This isn’t the only thing to worry about. There could be transformative AI that isn’t AGI in this sense. E.g. a fairly-narrow AI that just searches configurations of atoms and finds ways to do atomically precise manufacturing would also be an existential threat and a possibility for an existential win.
Conceptual capabilities progress
The “conceptual AGI” view:
The first way humanity makes AGI is by combining some set of significant ideas about intelligence. Significant ideas are things like (the ideas of) gradient descent, recombination, probability distributions, universal computation, search, world-optimization. Significant ideas are to a significant extent bottlenecked on great natural philosophers doing great natural philosophy about intelligence, with sequential bottlenecks between many insights.
The conceptual AGI view doesn’t claim that humanity doesn’t already have enough ideas to make AGI. I claim that, though not super strongly.
Timelines
Giving probabilities here doesn’t feel great. For one thing, it seems to contribute to information cascades and to shallow coalition-forming. For another, it hides the useful models. For yet another thing: A probability bundles together a bunch of stuff I have models about, with a bunch of stuff I don’t have models about. For example, how many people will be doing original AGI-relevant research in 15 years? I have no idea, and it seems like largely a social question. The answer to that question does affect when AGI comes, though, so a probability about when AGI comes would have to depend on that answer.
But ok. Here’s some butt-numbers:
3%-10% probability of AGI in the next 10-15ish years. This would be lower, but I’m putting a bit of model uncertainty here.
40%-45% probability of AGI in the subsequent 45ish years. This is denser than the above because, eyeballing the current state of the art, it seems like we currently lack some ideas we’d need——but I don’t know how many insights would be needed, so the remainder could be only a couple decades around the corner. It also seems like people are distracted now.
Median 2075ish. IDK. This would be further out if an AI winter seemed more likely, but LLMs seem like they should already be able to make a lot of money.
A long tail. It’s long because of stuff like civilizational collapse, and because AGI might be really really hard to make. There’s also a sliver of a possibility of coordinating for a long time to not make AGI.
If I were trying to make a model with parts, I might try starting with a mixture of Erlang distributions of different shapes, and then stretching that according to some distribution about the number of people doing original AI research over time.
Again, this is all butt-numbers. I have almost no idea about how much more understanding is needed to make AGI, except that it doesn’t seem like we’re there yet.
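The mixture-of-Erlangs idea mentioned above can be sketched in a few lines. This is only an illustration under loud assumptions: each Erlang component stands for "k more sequential insights are needed, arriving at some rate per unit of research effort", and calendar time is stretched by a yearly effort multiplier. The component weights, shapes, rates, and the `effort_by_year` function are all made up here, and are not fit to the butt-numbers above.

```python
import math


def erlang_cdf(t, shape, rate):
    """P(the sum of `shape` independent Exp(rate) waiting times is <= t)."""
    if t <= 0:
        return 0.0
    x = rate * t
    term, total = 1.0, 1.0  # the n = 0 term of sum_{n < shape} x^n / n!
    for n in range(1, shape):
        term *= x / n
        total += term
    return 1.0 - math.exp(-x) * total


def mixture_cdf(t, components):
    """components: list of (weight, shape, rate); weights should sum to 1."""
    return sum(w * erlang_cdf(t, k, r) for w, k, r in components)


def effective_time(years, effort_by_year):
    """Stretch calendar time: add up yearly research-effort multipliers."""
    whole = int(years)
    total = sum(effort_by_year(y) for y in range(whole))
    total += (years - whole) * effort_by_year(whole)
    return total


def p_agi_within(years, components, effort_by_year):
    """Probability that all needed insights arrive within `years`."""
    return mixture_cdf(effective_time(years, effort_by_year), components)


# Hypothetical mixture: few, several, or many insights still needed.
components = [(0.2, 2, 0.1), (0.5, 5, 0.1), (0.3, 12, 0.1)]


def flat_effort(year):
    return 1.0  # assumption: research effort stays at today's level


for horizon in (10, 25, 50, 100):
    print(horizon, round(p_agi_within(horizon, components, flat_effort), 3))
```

Swapping `flat_effort` for a function that grows or shrinks over time is the "stretching" step: more people doing original research compresses the same distribution into fewer calendar years, and an AI winter spreads it out.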
Responses to some arguments for AGI soon
The “inputs” argument
At about 1:15 in this interview, Carl Shulman argues (quoting from the transcript):
We’ve been scaling [compute expended on ML] up four times as fast as was the case for most of the history of AI. We’re running through the orders of magnitude of possible resource inputs you could need for AI much much more quickly than we were for most of the history of AI. That’s why this is a period with a very elevated chance of AI per year because we’re moving through so much of the space of inputs per year [...].
This isn’t the complete argument Shulman gives, but on its own it’s interesting. On its own, it’s valid, but only if we’re actually scaling up all the needed inputs.
On the conceptual AGI view, this isn’t the case, because we aren’t very greatly increasing the number of great natural philosophers doing great natural philosophy about intelligence. That’s a necessary input, and it’s only being somewhat scaled up. For one thing, many new AI researchers are correlated with each other, and many are focused on scaling up, applying, and varying existing ideas. For another thing, sequential progress can barely be sped up with more bodies.
The “big evolution” argument
Carl goes on to argue that eventually, when we have enough compute, we’ll be able to run a really big evolutionary process that finds AGIs (if we haven’t already made AGI). This idea also appears in Ajeya Cotra’s report on the compute needed to create AGI.
I broadly agree with this. But I have two reasons that this argument doesn’t make AGI seem very likely very soon.
The first reason is that running a big evolution actually seems kind of hard; it seems to take significant conceptual progress and massive engineering effort to make the big evolution work. What I’d expect to see when this is tried, is basically nothing; life doesn’t get started, nothing interesting happens, the entities don’t get far (beyond whatever primitives were built in). You can get around this by invoking more compute, e.g. by simulating physics more accurately at a more detailed level, or by doing hyperparameter search to find worlds that lead to cool stuff. But then you’re invoking more compute. (I’d also expect a lot of the hacks that supposedly make our version of evolution much more efficient than real evolution, to actually result in our version being circumscribed, i.e. it peters out because the shortcut that saved compute also cut off some important dimensions of search.)
The second reason is that evolution seems to take a lot of serial time. There’s probably lots of clever things one can do to shortcut this, but these would be significant conceptual progress.
“I see how to do it”
My (limited / filtered) experience with these ideas leads me to think that [ideas knowably sufficient to make an AGI in practice] aren’t widespread or obvious. (Obviously it is somehow feasible to make an AGI, because evolution did it.)
The “no blockers” intuition
An intuition that I often encounter is something like this:
Previously, there were blockers to current systems being developed into AGI. But now those blockers have been solved, so AGI could happen any time now.
This sounds to my ears like: “I saw how to make AGI, but my design required X. Then someone made X, so now I have a design for an AGI that will work.”. But I don’t think that’s what they think. I think they don’t think they have to have a design for an AGI in order to make an AGI.
I kind of agree with some version of this——there’s a lot of stuff you don’t have to understand, in order to make something that can do some task. We observe this in modern ML. But current systems, though they impressively saturate some lower-dimensional submanifold of capability-space, don’t permeate a full-dimensional submanifold. Intelligence is a positive thing. Most computer code doesn’t put itself on an unbounded trajectory of gaining capabilities. To make it work you have to do engineering and science, at some level. Bridges don’t hold weight just because there’s nothing blocking them from holding weight.
Daniel Kokotajlo points out that for things that grow, it’s kind of true that they’ll succeed as long as there aren’t blockers——and for example animal husbandry kind of just works, without the breeders understanding much of anything about the internals of why their selection pressures are met with adequate options to select. This is true, but it doesn’t seem very relevant to AGI because we’re not selecting from an existing pool of highly optimized “genomic” (that is, mental) content. If instead of tinkering with de novo gradient-searched circuits, we were tinkering with remixing and mutating whole-brain emulations, then I would think AGI comes substantially sooner.
Another regime where “things just work” is many mental contexts where a task is familiar enough in some way that you can expect to succeed at the task by default. For example, if you’re designing a wadget, and you’ve previously designed similar wadgets to similar specifications, then it makes sense to treat a design idea as though it’s going to work out——as though it can be fully fleshed out into a satisfactory, functioning design——unless you see something clearly wrong with it, a clear blocker like a demand for a metal with unphysical properties. Again, like the case of animal husbandry, the “things just work” comes from the (perhaps out of sight) preexisting store of optimized content that’s competent to succeed at the task given a bit of selection and arrangement. In the case of AGI, no one’s ever built anything like that, so the store of knowledge that would automatically flesh out blockerless AGI ideas is just not there.
Yet another such regime is markets, where the crowd of many agents can be expected to figure out how to do something as long as it’s feasible. So, a version of this intuition goes:
There are a lot of people trying to make AGI. So either there’s some strong blocker that makes it so that no one can make AGI, or else someone will make AGI.
This is kind of true, but it just goes back to the question of how much conceptual progress people will make towards AGI. It’s not an argument that we already have the understanding needed to make AGI. If it’s used as an argument that we already have the understanding, then it’s an accounting mistake: it says “We already have the understanding. The reason we don’t need more understanding, is that if there were more understanding needed, someone else would figure it out, and then we’d have it. Therefore no one needs to figure anything else out.”.
Finally: I also see a fair number of specific “blockers”, as well as some indications that existing things don’t have properties that would scare me.
“We just need X” intuitions
Another intuition that I often encounter is something like this:
We just need X to get AGI. Once we have X, in combination with Y it will go all the way.
Some examples of Xs: memory, self-play, continual learning, curricula, AIs doing AI research, learning to learn, neural nets modifying their own weights, sparsity, learning with long time horizons.
For example: “Today’s algorithms can learn anything given enough data. So far, data is limited, and we’re using up what’s available. But self-play generates infinite data, so our systems will be able to learn unboundedly. So we’ll get AGI soon.”.
This intuition is similar to the “no blockers” intuition, and my main response is the same: the reason bridges stand isn’t that you don’t see a blocker to them standing. See above.
A “we just need X” intuition can become a “no blockers” intuition if someone puts out an AI research paper that works out some version of X. That leads to another response: just because an idea is, at a high level, some kind of X, doesn’t mean the idea is anything like the fully-fledged, generally applicable version of X that one imagines when describing X.
For example, suppose that X is “self-play”. One important thing about self-play is that it’s an infinite source of data, provided in a sort of curriculum of increasing difficulty and complexity. Since we have the idea of self-play, and we have some examples of self-play that are successful (e.g. AlphaZero), aren’t we most of the way to having the full power of self-play? And isn’t the full power of self-play quite powerful, since it’s how evolution made AGI? I would say “doubtful”. The self-play that evolution uses (and the self-play that human children use) is much richer, containing more structural ideas, than the idea of having an agent play a game against a copy of itself.
Most instances of a category are not the most powerful, most general instances of that category. So just because we have, or will soon have, some useful instances of a category, doesn’t strongly imply that we can or will soon be able to harness most of the power of stuff in that category. I’m reminded of the politician’s syllogism: “We must do something. This is something. Therefore, we must do this.”.
The bitter lesson and the success of scaling
Sutton’s bitter lesson, paraphrased:
AI researchers used to focus on coming up with complicated ideas for AI algorithms. They weren’t very successful. Then we learned that what’s successful is to leverage computation via general methods, as in deep learning and massive tree search.
Some add on:
And therefore what matters in AI is computing power, not clever algorithms.
This conclusion doesn’t follow. Sutton’s bitter lesson is that figuring out how to leverage computation using general methods that scale with more computation beats trying to perform a task by encoding human-learned specific knowledge about the task domain. You still have to come up with the general methods. It’s a different sort of problem——trying to aim computing power at a task, rather than trying to work with limited computing power or trying to “do the task yourself”——but it’s still a problem. To modify a famous quote: “In some ways we feel we are as bottlenecked on algorithmic ideas as ever, but we believe we are bottlenecked on a higher level and about more important things.”
Large language models
Some say:
LLMs are already near-human and in many ways super-human general intelligences. There’s very little left that they can’t do, and they’ll keep getting better. So AGI is near.
This is a hairy topic, and my conversations about it have often seemed not very productive. I’ll just try to sketch my view:
The existence of today’s LLMs is scary and should somewhat shorten people’s expectations about when AGI comes.
LLMs have fixed, partial concepts with fixed, partial understanding. An LLM’s concepts are like human concepts in that they can be combined in new ways and used to make new deductions, in some scope. They are unlike human concepts in that they won’t grow or be reforged to fit new contexts. So for example there will be some boundary beyond which a trained LLM will not recognize or be able to use a new analogy; and this boundary is well within what humans can do.
An LLM’s concepts are mostly “in the data”. This is pretty vague, but I still think it. A number of people who think that LLMs are basically already AGI have seemed to agree with some version of this, in that when I describe something LLMs can’t do, they say “well, it wasn’t in the data”. Though maybe I misunderstand them.
When an LLM is trained more, it gains more partial concepts.
However, it gains more partial concepts with poor sample efficiency; it mostly only gains what’s in the data.
In particular, even if the LLM were being continually trained (in a way that’s similar to how LLMs are already trained, with similar architecture), it still wouldn’t do the thing humans do with quickly picking up new analogies, quickly creating new concepts, and generally reforging concepts.
LLMs don’t have generators that are nearly as powerful as the generators of human understanding. The stuff in LLMs that seems like it comes in a way that’s similar to how stuff in humans comes, actually comes from a lot more data. So LLMs aren’t that much of an indication that we’ve figured out how to make things that are on an unbounded trajectory of improvement.
LLMs have a weird, non-human shaped set of capabilities. They go much further than humans on some submanifold, and they barely touch some of the full manifold of capabilities. (They’re “unbalanced” in Cotra’s terminology.)
There is a broken inference. When talking to a human, if the human emits certain sentences about (say) category theory, that strongly implies that they have “intuitive physics” about the underlying mathematical objects. They can recognize the presence of the mathematical structure in new contexts, they can modify the idea of the object by adding or subtracting properties and have some sense of what facts hold of the new object, and so on. This inference——emitting certain sentences implies intuitive physics——doesn’t work for LLMs.
The broken inference is broken because these systems are optimized for being able to perform all the tasks that don’t take a long time, are clearly scorable, and have lots of data showing performance. There’s a bunch of stuff that’s really important——and is a key indicator of having underlying generators of understanding——but takes a long time, isn’t clearly scorable, and doesn’t have a lot of demonstration data. But that stuff is harder to talk about and isn’t as intuitively salient as the short, clear, demonstrated stuff.
Vaguely speaking, I think stable diffusion image generation is comparably impressive to LLMs, but LLMs seem even more impressive to some people because LLMs break the performance → generator inference more. We’re used to the world (and computers) creating intricate images, but not creating intricate texts.
There is a missing update. We see impressive behavior by LLMs. We rightly update that we’ve invented a surprisingly generally intelligent thing. But we should also update that this behavior surprisingly turns out to not require as much general intelligence as we thought.
Other comments on AGI soon
There’s a seemingly wide variety of reasons that people I talk to think AGI comes soon. This seems like evidence for each of these hypotheses: that AGI comes soon is overdetermined; that there’s one underlying crux (e.g.: algorithmic progress isn’t needed to make AGI) that I haven’t understood yet; that I talked to a heavily selected group of people (true); that people have some other reason for saying that AGI comes soon, and then rationalize that proposition.
I’m somewhat concerned that people are being somewhat taken in by hype (experiments systematically misinterpreted by some; the truth takes too long to put on its pants, and the shared narrative is already altered).
I’m kind of baffled that people are so willing to say that LLMs understand X, for various X. LLMs do not behave with respect to X like a person who understands X, for many X.
I’m pretty concerned that many people are fairly strongly deferring to others, in a general sense that includes updating off of other people’s actions and vibes. Widespread deference has many dangers, which I list in “Dangers of deference”.
I’m worried that there’s a bucket error where “I think AGI comes soon.” isn’t separated from “We’re going to be motivated to work together to prevent existential risk from AGI.”.
My views on strategy
- Alignment is really hard. No one has good reason to think any current ideas would work to make an aligned / corrigible AGI. If AGI comes, everyone dies.
- If AGI comes in five years, everyone dies. We won’t solve alignment well enough by then. This of course doesn’t imply that AGI coming soon is less likely. However, it does mean that some people should focus on somewhat different things. Most people trying to make the world safe by solving AGI alignment should be open to trains of thought that likely will only be helpful in twenty years. There will be a lot of people who can’t help the world if AGI comes in five years; if those people are going to stress out about how they can’t help, instead they should work on stuff that helps in twenty or fifty years.
- A consensus belief is often inaccurate, e.g. because of deference and information cascades. In that case, the consensus portfolio of strategies will be incorrect.
- Not only that, but furthermore: Suppose there is a consensus belief, and suppose that it’s totally correct. If funders, and more generally anyone who can make stuff happen (e.g. builders and thinkers), use this totally correct consensus belief to make local decisions about where to allocate resources, and they don’t check the global margin, then they will in aggregate follow a portfolio of strategies that is incorrect. The make-stuff-happeners will each make happen the top few things on their list, and leave the rest undone. The top few things will be what the consensus says is most important: in our case, projects that help if AGI comes within 10 years. If a project helps in 30 years, but not 10 years, then it doesn’t get any funding at all. This is not the right global portfolio; it oversaturates fast interventions and leaves slow interventions undone.
- Because the shared narrative says AGI comes soon, there’s less shared will for projects that take a long time to help. People don’t come up with such projects, because they don’t expect to get funding; and funders go on not funding such projects, because they don’t see good ones, and they don’t particularly mind because they think AGI comes soon.
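The point above about local decisions versus the global margin can be made concrete with a toy model. Everything in it is an assumption for illustration: two buckets of projects ("fast" helps if AGI comes soon, "slow" helps if it comes later), a shared belief of 0.6 versus 0.4 that is taken to be exactly correct, and a square-root curve standing in for diminishing returns within a bucket.

```python
import math


def portfolio_value(allocation, beliefs):
    """Expected value with (assumed) square-root diminishing returns per bucket."""
    return sum(beliefs[k] * math.sqrt(n) for k, n in allocation.items())


def local_allocation(n_funders, beliefs):
    """Each funder backs the consensus top priority, ignoring what is already funded."""
    top = max(beliefs, key=beliefs.get)
    allocation = {k: 0 for k in beliefs}
    allocation[top] = n_funders
    return allocation


def marginal_allocation(n_funders, beliefs):
    """Each funder checks the global margin and funds the best marginal unit."""
    allocation = {k: 0 for k in beliefs}
    for _ in range(n_funders):
        best = max(
            beliefs,
            key=lambda k: beliefs[k]
            * (math.sqrt(allocation[k] + 1) - math.sqrt(allocation[k])),
        )
        allocation[best] += 1
    return allocation


# Hypothetical, fully correct consensus belief about which world we are in.
beliefs = {"fast": 0.6, "slow": 0.4}

local = local_allocation(10, beliefs)
checked = marginal_allocation(10, beliefs)
print(local, portfolio_value(local, beliefs))
print(checked, portfolio_value(checked, beliefs))
```

Under these assumptions, everyone acting on the same correct ranking leaves the slow bucket empty, while checking the margin funds some slow projects and gets a higher expected value. The specific split depends entirely on the made-up returns curve.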
Things that might actually work
Besides the standard stuff (AGI alignment research, moratoria on capabilities research, explaining why AGI is an existential risk), here are two key interventions:
Human intelligence enhancement. Important, tractable, and neglected. Note that if alignment is hard enough that we can’t solve it in time, but enhanced humans could solve it, then making enhanced humans one year sooner is almost as valuable as making AGI come one year later.
Confrontation-worthy empathy. Important, probably tractable, and neglected.
I suspect there’s a type of deep, thorough, precise understanding that one person (the intervener) can have of another person (the intervened), which makes it so that the intervener can confront the intervened with something like “If you and people you know succeed at what you’re trying to do, everyone will die.”, and the intervened can hear this.
This is an extremely high bar. It may go beyond what’s normally called empathy, understanding, gentleness, wisdom, trustworthiness, neutrality, justness, relatedness, and so on. It may have to incorporate a lot of different, almost contradictory properties; for example, the intervener might have to at the same time be present and active in the most oppositional way (e.g., saying: I’m here, and when all is said and done you’re threatening the lives of everyone I love, and they have a right to exist) while also being almost totally diaphanous (e.g., in fact not interfering with the intervened’s own reflective processes). It may involve irreversible changes, e.g. risking inoculation effects and unilateralist commons-burning. It may require incorporating very distinct skills; e.g. being able to make clear, correct, compelling technical arguments, and also being able to hold emotional space in difficult reflections, and also being interesting and socially competent enough to get the appropriate audiences in the first place. It probably requires seeing the intervened’s animal, and the intervened’s animal’s situation, so that the intervener can avoid being a threat to the intervened’s animal, and can help the intervened reflect on other threats to their animal. Developing this ability probably requires recursing on developing difficult subskills. It probably requires to some extent thinking like a cultural-rationalist and to some extent thinking very much not like a cultural-rationalist. It is likely to have discontinuous difficulty: easy for some sorts of people, and then very difficult in new ways for other sorts of people.
Some people are working on related abilities. E.g. Circlers, authentic relaters, therapists. As far as I know (at least having some substantial experience with Circlers), these groups aren’t challenging themselves enough. Mathematicians constantly challenge themselves: when they answer one sort of question, that sort of question becomes less interesting, and they move on to thinking about more difficult questions. In that way, they encounter each fundamental difficulty eventually, and thus have likely already grappled with the mathematical aspect of a fundamental difficulty that another science encounters.
Critch talks about empathy here, though maybe with a different emphasis.
I still basically think all of this, and still think this space doesn’t understand it, and thus has an out-of-whack X-derisking portfolio.
If I were writing it today, I’d add the example about search engines from this comment (https://www.lesswrong.com/posts/oC4wv4nTrs2yrP5hz/what-are-the-strongest-arguments-for-very-short-timelines?commentId=2XHxebauMi9C4QfG4), about induction on vague categories like “has capabilities”.
I might also try to explain more how training procedures with poor sample complexity tend to not be on an unbounded trajectory.
I think if you want to convince people with short timelines (e.g., 7 year medians) of your perspective, probably the most productive thing would be to better operationalize things you expect that AIs won’t be able to do soon (but that AGI could do). As in, flesh out a response to this comment such that it is possible for someone to judge.
But ok:
Come up, on its own, with many math concepts that mathematicians consider interesting + mathematically relevant on a similar level to concepts that human mathematicians come up with.
Do insightful science on its own.
Perform at the level of current LLMs, but with 300x less training data.
But like, I wouldn’t be surprised if, say, someone trained something that performed comparably to LLMs on a wide variety of benchmarks, using much less “data”… and then when you look into it, you find that what they were doing was taking activations of the LLMs and training the smaller guy on the activations. And I’ll be like, come on, that’s not the point; you could just as well have “trained” the smaller guy by copy-pasting the weights from the LLM and claimed “trained with 0 data!!”. And you’ll be like “but we met your criterion!” and I’ll just be like “well whatever, it’s obviously not relevant to the point I was making, and if you can’t see that then why are we even having this conversation”. (Or maybe you wouldn’t do that, IDK, but this sort of thing—followed by being accused of “moving the goal posts”—is why this question feels frustrating to answer.)
¿ thoughts on the following:
solving >95% of IMO problems while never seeing any human proofs, problems, or math libraries (before being given IMO problems in base ZFC at test time). like alphaproof except not starting from a pretrained language model and without having a curriculum of human problems and in base ZFC with no given libraries (instead of being in lean), and getting to IMO combos
(I’m not sure whether I’m supposed to nitpick. If I were nitpicking I’d ask things like: Wait are you allowing it to see preexisting computer-generated proofs? What counts as computer generated? Are you allowing it to see the parts of papers where humans state and discuss propositions and just cutting out the proofs? Is this system somehow trained on a giant human text corpus, but just without the math proofs?)
But if you mean basically “the AI has no access to human math content except a minimal game environment of formal logic, plus whatever abstract priors seep in via the training algorithm+prior, plus whatever general thinking patterns in [human text that’s definitely not mathy, e.g. blog post about apricots]”, then yeah, this would be really crazy to see. My points are trying to be, not minimally hard, but at least easier-ish in some sense. Your thing seems significantly harder (though nicely much more operationalized); I think it’d probably imply my “come up with interesting math concepts”? (Note that I would not necessarily say the same thing if it was >25% of IMO problems; there I’d be significantly more unsure, and would defer to you / Sam, or someone who has a sense for the complexity of the full proofs there and the canonicalness of the necessary lemmas and so on.)
I didn’t express this clearly, but yea I meant no pretraining on human text at all, and also nothing computer-generated which “uses human mathematical ideas” (beyond what is in base ZFC), but I’d probably allow something like the synthetic data generation used for AlphaGeometry (Fig. 3) except in base ZFC and giving away very little human math inside the deduction engine. I agree this would be very crazy to see. The version with pretraining on non-mathy text is also interesting and would still be totally crazy to see. I agree it would probably imply your “come up with interesting math concepts”. But I wouldn’t be surprised if like >20% of the people on LW who think A[G/S]I happens in like 2-3 years thought that my thing could totally happen in 2025 if the labs were aiming for it (though they might not expect the labs to aim for it), with your things plausibly happening later. E.g. maybe such a person would think “AlphaProof is already mostly RL/search and one could replicate its performance soon without human data, and anyway, AlphaGeometry already pretty much did this for geometry (and AlphaZero did it for chess)” and “some RL+search+self-play thing could get to solving major open problems in math in 2 years, and plausibly at that point human data isn’t so critical, and IMO problems are easier than major open problems, so plausibly some such thing gets to IMO problems in 1 year”. But also idk maybe this doesn’t hang together enough for such people to exist. I wonder if one can use this kind of idea to get a different operationalization with parties interested in taking each side though. Like, maybe whether such a system would prove Cantor’s theorem (stated in base ZFC) (imo this would still be pretty crazy to see)? Or whether such a system would get to IMO combos relying moderately less on human data?
IIUC yeah, that definitely seems fair; I’d probably also allow various other substantial “quasi-mathematical meta-ideas” to seep in, e.g. other tricks for self-generating a curriculum of training data.
Mhm, that seems quite plausible, yeah, and that does make me want to use your thing as a go-to example.
This one I feel a lot less confident of, though I could plausibly get more confident if I thought about the proof in more detail.
Part of the spirit here, for me, is something like: Yes, AIs will do very impressive things on “highly algebraic” problems / parts of problems. (See “Algebraicness”.) One of the harder things for AIs is, poetically speaking, “self-constructing its life-world”, or in other words “coming up with lots of concepts to understand the material it’s dealing with, and then transitioning so that the material it’s dealing with is those new concepts, and so on”. For any given math problem, I could be mistaken about how algebraic it is (or, how much of its difficulty for humans is due to the algebraic parts), and how much conceptual progress you have to do to get to a point where the remaining work is just algebraic. I assume that human math is a big mix of algebraic and non-algebraic stuff. So I get really surprised when an AlphaMath can reinvent most of the definitions that we use, but I’m a lot less sure about a smaller subset because I’m less sure if it just has a surprisingly small non-algebraic part. (I think that someone with a lot more sense of the math in general, and formal proofs in particular, could plausibly call this stuff in advance significantly better than just my pretty weak “it’s hard to do all of a wide variety of problems”.)
I did give a response in that comment thread. Separately, I think that’s not a great standard, e.g. as described in the post and in this comment (https://www.lesswrong.com/posts/i7JSL5awGFcSRhyGF/shortform-2?commentId=zATQE3Lhq66XbzaWm):
In fact, all the time in real life we make judgements about things that we couldn’t describe in terms that would be considered well-operationalized by betting standards, and we rely on these judgements, and we largely endorse relying on these judgements. E.g. inferring intent in criminal cases, deciding whether something is interesting or worth doing, etc. I should be able to just say “but you can tell that these AIs don’t understand stuff”, and then we can have a conversation about that, without me having to predict a minimal example of something which is operationalized enough for you to be forced to recognize it as judgeable and also won’t happen to be surprisingly well-represented in the data, or surprisingly easy to do without creativity, etc.
(Yeah, you responded, but it felt not that operationalized, and it seemed doable to flesh out as you did.)