Long-time lurker (c. 2013), recent poster. I also write on the EA Forum.
Mo Putera
Thanks, good example.
Thanks! Added to the list.
(To be honest, to first approximation my guess mirrors yours.)
Scott Alexander (Mistakes), Dan Luu (Major errors on this blog (and their corrections)), Gwern (My Mistakes, last updated 11 years ago), and Nintil (Mistakes, h/t @Rasool) are the only online writers I know of who maintain a dedicated, centralized page solely for cataloging their errors, which I admire. Probably not coincidentally, they’re also among the thinkers I respect most for repeatedly grounding their reasoning empirically. Some orgs do this too, like 80K’s Our mistakes, CEA’s Mistakes we’ve made, and GiveWell’s Our mistakes.
While I prefer dedicated, centralized pages like those over one-off writeups, for long-content reasons, one-off definitely beats none (a category that, alas, includes me). In that regard I appreciate essays like Holden Karnofsky’s Some Key Ways in Which I’ve Changed My Mind Over the Last Several Years (2016), Denise Melchin’s My mistakes on the path to impact (2020), Zach Groff’s Things I’ve Changed My Mind on This Year (2017), and this 2013 LW repository for “major, life-altering mistakes that you or others have made”, as well as org writeups like HLI’s Learning from our mistakes.
In this vein I’m also sad to see mistakes pages get removed, e.g. ACE used to have a Mistakes page (archived link) but no longer does.
Can you say more about what you mean? Your comment reminded me of Thomas Griffiths’ paper Understanding Human Intelligence through Human Limitations, but you may have meant something else entirely.
Griffiths argued that the aspects we associate with human intelligence – rapid learning from small data, the ability to break down problems into parts, and the capacity for cumulative cultural evolution – arose from the 3 fundamental limitations all humans share: limited time, limited computation, and limited communication. (The constraints imposed by these characteristics cascade: limited time magnifies the effect of limited computation, and limited communication makes it harder to draw upon more computation.) In particular, limited computation leads to problem decomposition, hence modular solutions; relieving the computation constraint enables solutions that can be objectively better along some axis while also being incomprehensible to humans.
I’m mainly wondering how Open Phil, and really anyone who uses fraction of economically-valuable cognitive labor automated / automatable (e.g. the respondents to that 2018 survey; some folks on the forum) as a useful proxy for thinking about takeoff, tracks this proxy as a way to empirically ground their takeoff-related reasoning. If you’re one of them, I’m curious if you’d answer your own question in the affirmative?
Thanks for the pointer to that paper, the abstract makes me think there’s a sort of slow-acting self-reinforcing feedback loop between predictive error minimisation via improving modelling and via improving the economy itself.
re: weather, I’m thinking of the chart below showing how little gain we get in MAE vs compute, plus my guess that compute can’t keep growing far enough to get MAE < 3 °F a year out (say). I don’t know anything about advancements in weather modelling methods though; maybe effective compute (incorporating modelling advancements) can keep growing indefinitely in terms of the chart.
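To make the “compute can’t keep growing far enough” guess a bit more concrete, here’s the kind of back-of-the-envelope I have in mind; the numbers below are made-up placeholders rather than values read off the chart, so it only illustrates the shape of the argument:

```python
# Toy diminishing-returns model: assume MAE improves by a fixed amount per 10x compute.
# Both numbers are hypothetical placeholders, NOT taken from the actual chart.
mae_now = 5.0        # assumed current MAE (°F) for a year-out forecast
gain_per_10x = 0.5   # assumed MAE improvement (°F) per 10x compute

target = 3.0
tenfolds_needed = (mae_now - target) / gain_per_10x
print(f"compute multiplier needed: 10^{tenfolds_needed:.0f} = {10**tenfolds_needed:,.0f}x")
# With these placeholders you'd need ~10,000x more compute, which is the kind of gap
# that makes me doubt raw compute alone gets us there.
```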
Visual representation of what you mean (imagine the red border doesn’t strictly dominate blue) from an AI Impacts blog post by Katja Grace:
I used to consider it a mystery that math was so unreasonably effective in the natural sciences, but changed my mind after reading this essay by Eric S. Raymond (who’s here on the forum, hi and thanks Eric), in particular this part, which is as good a question dissolution as any I’ve seen:
The relationship between mathematical models and phenomenal prediction is complicated, not just in practice but in principle. Much more complicated because, as we now know, there are mutually exclusive ways to axiomatize mathematics! It can be diagrammed as follows (thanks to Jesse Perry for supplying the original of this chart):
(it’s a shame this chart isn’t rendering properly for some reason, since without it the rest of Eric’s quote is ~incomprehensible)
The key transactions for our purposes are C and D—the translations between a predictive model and a mathematical formalism. What mystified Einstein is how often D leads to new insights.
We begin to get some handle on the problem if we phrase it more precisely; that is, “Why does a good choice of C so often yield new knowledge via D?”
The simplest answer is to invert the question and treat it as a definition. A “good choice of C” is one which leads to new predictions. The choice of C is not one that can be made a-priori; one has to choose, empirically, a mapping between real and mathematical objects, then evaluate that mapping by seeing if it predicts well.
One can argue that it only makes sense to marvel at the utility of mathematics if one assumes that C for any phenomenal system is an a-priori given. But we’ve seen that it is not. A physicist who marvels at the applicability of mathematics has forgotten or ignored the complexity of C; he is really being puzzled at the human ability to choose appropriate mathematical models empirically.
By reformulating the question this way, we’ve slain half the dragon. Human beings are clever, persistent apes who like to play with ideas. If a mathematical formalism can be found to fit a phenomenal system, some human will eventually find it. And the discovery will come to look “inevitable” because those who tried and failed will generally be forgotten.
But there is a deeper question behind this: why do good choices of mathematical model exist at all? That is, why is there any mathematical formalism for, say, quantum mechanics which is so productive that it actually predicts the discovery of observable new particles?
The way to “answer” this question is by observing that it, too, properly serves as a kind of definition. There are many phenomenal systems for which no such exact predictive formalism has been found, nor for which one seems likely. Poets like to mumble about the human heart, but more mundane examples are available. The weather, or the behavior of any economy larger than village size, for example—systems so chaotically interdependent that exact prediction is effectively impossible (not just in fact but in principle).
There are many things for which mathematical modeling leads at best to fuzzy, contingent, statistical results and never successfully predicts ‘new entities’ at all. In fact, such systems are the rule, not the exception. So the proper answer to the question “Why is mathematics so marvelously applicable to my science?” is simply “Because that’s the kind of science you’ve chosen to study!”
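A toy rendering of what “choosing C empirically” looks like in practice (my sketch, not Eric’s): propose several candidate mappings from the phenomenon to mathematical objects, then keep whichever one predicts held-out observations best.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend "phenomenal system": noisy observations of a process whose form we don't know.
t = np.linspace(0, 5, 60)
y = 2.0 * t**2 + rng.normal(0, 3, t.size)   # secretly quadratic

train, test = slice(0, 40), slice(40, 60)

# Candidate "choices of C": different mathematical formalisms for the same phenomenon.
candidates = {"linear": 1, "quadratic": 2, "cubic": 3}

for name, degree in candidates.items():
    coeffs = np.polyfit(t[train], y[train], degree)   # fit on past observations
    mse = np.mean((np.polyval(coeffs, t[test]) - y[test]) ** 2)
    print(f"{name:9s} out-of-sample MSE = {mse:8.2f}")
# The mapping is judged by how well it predicts, not by any a priori correctness.
```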
I also think I was intuition-pumped to buy Eric’s argument by Julie Moronuki’s beautiful meandering essay The Unreasonable Effectiveness of Metaphor.
Ben West’s remark in the METR blog post seems to suggest you’re right that the doubling period is shortening:
… there are reasons to think that recent trends in AI are more predictive of future performance than pre-2024 trends. As shown above, when we fit a similar trend to just the 2024 and 2025 data, this shortens the estimate of when AI can complete month-long tasks with 50% reliability by about 2.5 years.
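For concreteness, here’s the kind of extrapolation being described; the horizon and doubling times below are illustrative placeholders rather than METR’s fitted values, so the point is only how sensitive the “month-long tasks” date is to the assumed doubling period:

```python
from math import log2

# Hypothetical inputs, not METR's fitted parameters.
current_horizon_hours = 1.0    # assumed 50%-reliability task length today
month_long_hours = 167.0       # roughly one working month
doublings_needed = log2(month_long_hours / current_horizon_hours)

for label, doubling_months in [("longer doubling period (assumed 7 mo)", 7),
                               ("shorter doubling period (assumed 4 mo)", 4)]:
    years = doublings_needed * doubling_months / 12
    print(f"{label}: ~{years:.1f} years until month-long tasks")
# Shortening the doubling period from 7 to 4 months pulls the date in by ~2 years here.
```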
Not if some critical paths are irreducibly serial.
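(Amdahl’s-law arithmetic is the cleanest way I know to see this: if a fraction s of the pipeline is irreducibly serial, no amount of acceleration on the rest gets you past 1/s overall. A minimal sketch:)

```python
def overall_speedup(serial_fraction: float, parallel_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the non-serial part is accelerated."""
    return 1 / (serial_fraction + (1 - serial_fraction) / parallel_speedup)

for s in (0.01, 0.1, 0.5):
    print(f"serial fraction {s:.0%}: 1000x on the rest gives "
          f"{overall_speedup(s, 1000):.1f}x overall (ceiling {1 / s:.0f}x)")
```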
What fraction of economically-valuable cognitive labor is already being automated today? How has that changed over time, especially recently?
I notice I’m confused about these ostensibly extremely basic questions, which arose in reading Open Phil’s old CCF-takeoff report, whose main metric is “time from AI that could readily[2] automate 20% of cognitive tasks to AI that could readily automate 100% of cognitive tasks”. A cursory search of Epoch’s data, Metaculus, and this forum didn’t turn up anything, but I didn’t spend much time at all doing so.
I was originally motivated by wanting to empirically understand recursive AI self-improvement better, which led to me stumbling upon the CAIS paper Examples of AI Improving AI, but I don’t have any sense whatsoever of how the paper’s 39 examples as of Oct-2023 translate to OP’s main metric even after constraining “cognitive tasks” in its operational definition to just AI R&D.
I did find this 2018 survey of expert opinion
A survey was administered to attendees of three AI conferences during the summer of 2018 (ICML, IJCAI and the HLAI conference). The survey included questions for estimating AI capabilities over the next decade, questions for forecasting five scenarios of transformative AI and questions concerning the impact of computational resources in AI research. Respondents indicated a median of 21.5% of human tasks (i.e., all tasks that humans are currently paid to do) can be feasibly automated now, and that this figure would rise to 40% in 5 years and 60% in 10 years
which would suggest that OP’s clock should’ve started ticking in 2018. Combined with CCF-takeoff author Tom Davidson’s “~50% to a <3 year takeoff and ~80% to <10 year, i.e. time from 20%-AI to 100%-AI, for cognitive tasks in the global economy”, that implies takeoff should already have occurred… so I’m dismissing this survey’s relevance to my question (sorry).
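(For what it’s worth, the toy arithmetic behind that dismissal, using the survey’s median figures and Davidson’s thresholds; this is my own back-of-the-envelope, not anything from the report:)

```python
# Survey medians (2018): 21.5% of human tasks automatable now, 40% in 5y, 60% in 10y.
# OP's takeoff clock starts at 20%-AI, so on the survey's own numbers it started
# before 2018; Davidson's ~50%-to-a-<3-year-takeoff would then give even odds of
# 100%-AI by ~2021, which clearly didn't happen.
medians = {2018: 21.5, 2023: 40.0, 2028: 60.0}   # year -> % of tasks automatable

slope = (medians[2028] - medians[2018]) / (2028 - 2018)   # ~3.85 points/year
year_100 = 2028 + (100 - medians[2028]) / slope
print(f"survey-implied 100%-AI year (naive linear extrapolation): ~{year_100:.0f}")
print(f"Davidson-style 50% takeoff from a 2018 start: 100%-AI by ~{2018 + 3}")
```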
In pure math, mathematicians seek “morality”, which sounds similar to Ron’s string theory conversion stories above. Eugenia Cheng’s Mathematics, morally argues:
I claim that although proof is what supposedly establishes the undeniable truth of a piece of mathematics, proof doesn’t actually convince mathematicians of that truth. And something else does.
… formal mathematical proofs may be wonderfully watertight, but they are impossible to understand. Which is why we don’t write whole formal mathematical proofs. … Actually, when we write proofs what we have to do is convince the community that it could be turned into a formal proof. It is a highly sociological process, like appearing before a jury of twelve good men-and-true. The court, ultimately, cannot actually know if the accused actually ‘did it’ but that’s not the point; the point is to convince the jury. Like verdicts in court, our ‘sociological proofs’ can turn out to be wrong—errors are regularly found in published proofs that have been generally accepted as true. So much for mathematical proof being the source of our certainty. Mathematical proof in practice is certainly fallible.
But this isn’t the only reason that proof is unconvincing. We can read even a correct proof, and be completely convinced of the logical steps of the proof, but still not have any understanding of the whole. Like being led, step by step, through a dark forest, but having no idea of the overall route. We’ve all had the experience of reading a proof and thinking “Well, I see how each step follows from the previous one, but I don’t have a clue what’s going on!”
And yet… The mathematical community is very good at agreeing what’s true. And even if something is accepted as true and then turns out to be untrue, people agree about that as well. Why? …Mathematical theories rarely compete at the level of truth. We don’t sit around arguing about which theory is right and which is wrong. Theories compete at some other level, with questions about what the theory “ought” to look like, what the “right” way of doing it is. It’s this other level of ‘ought’ that we call morality. … Mathematical morality is about how mathematics should behave, not just that this is right, this is wrong. Here are some examples of the sorts of sentences that involve the word “morally”, not actual examples of moral things.

“So, what’s actually going on here, morally?”
“Well, morally, this proof says...”
“Morally, this is true because...”
“Morally, there’s no reason for this axiom.”
“Morally, this question doesn’t make any sense.”
“What ought to happen here, morally?”
“This notation does work, but morally, it’s absurd!”
“Morally, this limit shouldn’t exist at all”
“Morally, there’s something higher-dimensional going on here.”

Beauty/elegance is often the opposite of morality. An elegant proof is often a clever trick, a piece of magic as in Example 6 above, the sort of proof that drives you mad when you’re trying to understand something precisely because it’s so clever that it doesn’t explain anything at all.
Constructiveness is often the opposite of morality as well. If you’re proving the existence of something and you just construct it, you haven’t necessarily explained why the thing exists.
Morality doesn’t mean ‘explanatory’ either. There are so many levels of explaining something. Explanatory to whom? To someone who’s interested in moral reasons. So we haven’t really got anywhere. The same goes for intuitive, obvious, useful, natural and clear, and as Thurston says: “one person’s clear mental image is another person’s intimidation”.
Minimality/efficiency is sometimes the opposite of morality too. Sometimes the most efficient way of proving something is actually the moral way backwards, e.g. quadratics. And the most minimal way of presenting a theory is not necessarily the morally right way. For example, it is possible to show that a group is a set X equipped with one binary operation / satisfying the single axiom: for all x, y, z ∈ X, x/((((x/x)/y)/z)/(((x/x)/x)/z)) = y. The fact that something works is not good enough to be a moral reason.
Polya’s notion of ‘plausible reasoning’ at first sight might seem to fit the bill because it appears to be about how mathematicians decide that something is ‘plausible’ before sitting down to try and prove it. But in fact it’s somewhat probabilistic. This is not the same as a moral reason. It’s more like gathering a lot of evidence and deciding that all the evidence points to one conclusion, without there actually being a reason necessarily. Like in court, having evidence but no motive.
Abstraction perhaps gets closer to morality, along with ‘general’, ‘deep’, ‘conceptual’. But I would say that it’s the search for morality that motivates abstraction, the search for the moral reason motivates the search for greater generalities, depth and conceptual understanding. …
Proof has a sociological role; morality has a personal role. Proof is what convinces society; morality is what convinces us. Brouwer believed that a construction can never be perfectly communicated by verbal or symbolic language; rather it’s a process within the mind of an individual mathematician. What we write down is merely a language for communicating something to other mathematicians, in the hope that they will be able to reconstruct the process within their own mind. When I’m doing maths I often feel like I have to do it twice—once, morally in my head. And then once to translate it into communicable form. The translation is not a trivial process; I am going to encapsulate it as the process of moving from one form of truth to another.
Transmitting beliefs directly is unfeasible, but the question that does leap out of this is: what about the reason? Why don’t I just send the reason directly to X, thus eliminating the two probably hardest parts of this process? The answer is that a moral reason is harder to communicate than a proof. The key characteristic about proof is not its infallibility, not its ability to convince but its transferability. Proof is the best medium for communicating my argument to X in a way which will not be in danger of ambiguity, misunderstanding, or defeat. Proof is the pivot for getting from one person to another, but some translation is needed on both sides. So when I read an article, I always hope that the author will have included a reason and not just a proof, in case I can convince myself of the result without having to go to all the trouble of reading the fiddly proof.
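(An aside on the single division-only axiom Cheng mentions: the fact that it works is easy to machine-check, even though, as she says, “works” is no moral reason. A quick sketch, verifying it in small cyclic groups with x/y read as x·y⁻¹, i.e. subtraction mod n:)

```python
def satisfies_single_axiom(n: int) -> bool:
    """Check x/((((x/x)/y)/z)/(((x/x)/x)/z)) = y in Z_n, where x/y := (x - y) mod n."""
    div = lambda a, b: (a - b) % n
    return all(
        div(x, div(div(div(div(x, x), y), z),
                   div(div(div(x, x), x), z))) == y
        for x in range(n) for y in range(n) for z in range(n)
    )

print(all(satisfies_single_axiom(n) for n in range(1, 10)))  # expect True
```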
That last part of Cheng’s essay is quite reminiscent of what the late Bill Thurston argued in his classic On proof and progress in mathematics:
Mathematicians have developed habits of communication that are often dysfunctional. Organizers of colloquium talks everywhere exhort speakers to explain things in elementary terms. Nonetheless, most of the audience at an average colloquium talk gets little of value from it. Perhaps they are lost within the first 5 minutes, yet sit silently through the remaining 55 minutes. Or perhaps they quickly lose interest because the speaker plunges into technical details without presenting any reason to investigate them. At the end of the talk, the few mathematicians who are close to the field of the speaker ask a question or two to avoid embarrassment.
This pattern is similar to what often holds in classrooms, where we go through the motions of saying for the record what we think the students “ought” to learn, while the students are trying to grapple with the more fundamental issues of learning our language and guessing at our mental models. Books compensate by giving samples of how to solve every type of homework problem. Professors compensate by giving homework and tests that are much easier than the material “covered” in the course, and then grading the homework and tests on a scale that requires little understanding. We assume that the problem is with the students rather than with communication: that the students either just don’t have what it takes, or else just don’t care.
Outsiders are amazed at this phenomenon, but within the mathematical community, we dismiss it with shrugs.
Much of the difficulty has to do with the language and culture of mathematics, which is divided into subfields. Basic concepts used every day within one subfield are often foreign to another subfield. Mathematicians give up on trying to understand the basic concepts even from neighboring subfields, unless they were clued in as graduate students.
In contrast, communication works very well within the subfields of mathematics. Within a subfield, people develop a body of common knowledge and known techniques. By informal contact, people learn to understand and copy each other’s ways of thinking, so that ideas can be explained clearly and easily.
Mathematical knowledge can be transmitted amazingly fast within a subfield. When a significant theorem is proved, it often (but not always) happens that the solution can be communicated in a matter of minutes from one person to another within the subfield. The same proof would be communicated and generally understood in an hour talk to members of the subfield. It would be the subject of a 15- or 20-page paper, which could be read and understood in a few hours or perhaps days by members of the subfield.
Why is there such a big expansion from the informal discussion to the talk to the paper? One-on-one, people use wide channels of communication that go far beyond formal mathematical language. They use gestures, they draw pictures and diagrams, they make sound effects and use body language. Communication is more likely to be two-way, so that people can concentrate on what needs the most attention. With these channels of communication, they are in a much better position to convey what’s going on, not just in their logical and linguistic facilities, but in their other mental facilities as well.
In talks, people are more inhibited and more formal. Mathematical audiences are often not very good at asking the questions that are on most people’s minds, and speakers often have an unrealistic preset outline that inhibits them from addressing questions even when they are asked.
In papers, people are still more formal. Writers translate their ideas into symbols and logic, and readers try to translate back.
Why is there such a discrepancy between communication within a subfield and communication outside of subfields, not to mention communication outside mathematics? Mathematics in some sense has a common language: a language of symbols, technical definitions, computations, and logic. This language efficiently conveys some, but not all, modes of mathematical thinking. Mathematicians learn to translate certain things almost unconsciously from one mental mode to the other, so that some statements quickly become clear. Different mathematicians study papers in different ways, but when I read a mathematical paper in a field in which I’m conversant, I concentrate on the thoughts that are between the lines. I might look over several paragraphs or strings of equations and think to myself “Oh yeah, they’re putting in enough rigamarole to carry such-and-such idea.” When the idea is clear, the formal setup is usually unnecessary and redundant—I often feel that I could write it out myself more easily than figuring out what the authors actually wrote. It’s like a new toaster that comes with a 16-page manual. If you already understand toasters and if the toaster looks like previous toasters you’ve encountered, you might just plug it in and see if it works, rather than first reading all the details in the manual.
People familiar with ways of doing things in a subfield recognize various patterns of statements or formulas as idioms or circumlocution for certain concepts or mental images. But to people not already familiar with what’s going on the same patterns are not very illuminating; they are often even misleading. The language is not alive except to those who use it.
Thurston’s personal reflections below on the sociology of proof exemplify the search for mathematical morality instead of fully formally rigorous correctness. I remember being disquieted upon first reading “There were published theorems that were generally known to be false” a long time ago:
When I started as a graduate student at Berkeley, I had trouble imagining how I could “prove” a new and interesting mathematical theorem. I didn’t really understand what a “proof” was.
By going to seminars, reading papers, and talking to other graduate students, I gradually began to catch on. Within any field, there are certain theorems and certain techniques that are generally known and generally accepted. When you write a paper, you refer to these without proof. You look at other papers in the field, and you see what facts they quote without proof, and what they cite in their bibliography. You learn from other people some idea of the proofs. Then you’re free to quote the same theorem and cite the same citations. You don’t necessarily have to read the full papers or books that are in your bibliography. Many of the things that are generally known are things for which there may be no known written source. As long as people in the field are comfortable that the idea works, it doesn’t need to have a formal written source.
At first I was highly suspicious of this process. I would doubt whether a certain idea was really established. But I found that I could ask people, and they could produce explanations and proofs, or else refer me to other people or to written sources that would give explanations and proofs. There were published theorems that were generally known to be false, or where the proofs were generally known to be incomplete. Mathematical knowledge and understanding were embedded in the minds and in the social fabric of the community of people thinking about a particular topic. This knowledge was supported by written documents, but the written documents were not really primary.
I think this pattern varies quite a bit from field to field. I was interested in geometric areas of mathematics, where it is often pretty hard to have a document that reflects well the way people actually think. In more algebraic or symbolic fields, this is not necessarily so, and I have the impression that in some areas documents are much closer to carrying the life of the field. But in any field, there is a strong social standard of validity and truth. Andrew Wiles’s proof of Fermat’s Last Theorem is a good illustration of this, in a field which is very algebraic. The experts quickly came to believe that his proof was basically correct on the basis of high-level ideas, long before details could be checked. This proof will receive a great deal of scrutiny and checking compared to most mathematical proofs; but no matter how the process of verification plays out, it helps illustrate how mathematics evolves by rather organic psychological and social processes.
I chose to study physics in undergrad because I wanted to “understand the universe” and naively thought string theory was the logically correct endpoint of this pursuit, and was only saved from that fate by not being smart enough to get into a good grad school. Since then I’ve come to conclude that string theory is probably a dead end, albeit an astonishingly alluring one for a particular type of person. In that regard I find anecdotes like the following by Ron Maimon on Physics SE interesting — the reason string theorists believe isn’t the same as what they tell people, so it’s better to ask for their conversion stories:
I think that it is better to ask for a compelling argument that the physics of gravity requires a string theory completion, rather than a mathematical proof, which would be full of implicit assumptions anyway. The arguments people give in the literature are not the same as the personal reasons that they believe the theory, they are usually just stories made up to sound persuasive to students or to the general public. They fall apart under scrutiny. The real reasons take the form of a conversion story, and are much more subjective, and much less persuasive to everyone except the story teller. Still, I think that a conversion story is the only honest way to explain why you believe something that is not conclusively experimentally established.
Some famous conversion stories are:
Scherk and Schwarz (1974): They believed that the S-matrix bootstrap was a fundamental law of physics, and were persuaded that the bootstrap had a solution when they constructed proto-superstrings. An S-matrix theory doesn’t really leave room for adding new interactions, as became clear in the early seventies with the stringent string consistency conditions, so if it were a fundamental theory of strong interactions only, how would you couple it to electromagnetism or to gravity? The only way is if gravitons and photons show up as certain string modes. Scherk understood how string theory reproduces field theory, so they understood that open strings easily give gauge fields. When they and Yoneya understood that the theory requires a perturbative graviton, they realized that it couldn’t possibly be a theory of hadrons, but must include all interactions, and gravitational compactification gives meaning to the extra dimensions. Thankfully they realized this in 1974, just before S-matrix theory was banished from physics.
Ed Witten (1984): At Princeton in 1984, and everywhere along the East Coast, the Chew bootstrap was as taboo as cold fusion. The bootstrap was tautological new-agey content-free Berkeley physics, and it was justifiably dead. But once Ed Witten understood that string theory cancels gravitational anomalies, this was sufficient to convince him that it was viable. He was aware that supergravity couldn’t get chiral matter on a smooth compactification, and had a hard time fitting good grand-unification groups. Anomaly cancellation is a nontrivial constraint, it means that the theory works consistently in gravitational instantons, and it is hard to imagine a reason it should do that unless it is nonperturbatively consistent.
Everyone else (1985): once they saw Ed Witten was on board, they decided it must be right.
I am exaggerating of course. The discovery of heterotic strings and Calabi Yau compactifications was important in convincing other people that string theory was phenomenologically viable, which was important. In the Soviet Union, I am pretty sure that Knizhnik believed string theory was the theory of everything, for some deep unknown reasons, although his collaborators weren’t so sure. Polyakov liked strings because of the link between the duality condition and the associativity of the OPE, which he and Kadanoff had shown should be enough to determine critical exponents in phase transitions, but I don’t think he ever fully got on board with the “theory of everything” bandwagon.
The rest of Ron’s answer elaborates on his own conversion story. The interesting part to me is that Ron began by trying to “kill string theory” (and was in fact very happy that he was going to do so), but was then annoyed by an argument of a colleague’s that mathematically worked; in the year or two he spent puzzling over why it worked, he had an epiphany that convinced him string theory was correct, which sounds like nonsense to the uninitiated. (This phenomenon, where people who gain understanding of the thing become incomprehensible to others, sounds a lot like the discussions on LW on enlightenment, by the way.)
Your second paragraph is a great point, and makes me wonder how much to adjust downward the post’s main “why care?” argument (that 1 additional point in VO2max ~ 10% lower annual all-cause mortality). It’s less clear to me how to convert marginal improvements in my sport of choice to marginal reduction in all-cause mortality though.
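(For what it’s worth, the crude arithmetic I had in mind for the headline claim, treating the ~10% per point as a constant hazard ratio, which is surely too strong an assumption across the whole fitness range:)

```python
# Illustration only: read "1 VO2max point ~ 10% lower annual all-cause mortality"
# as a hazard ratio of 0.9 per point and compound it; real dose-response is
# unlikely to be log-linear everywhere.
hr_per_point = 0.9

for points in (1, 3, 5):
    rel_risk = hr_per_point ** points
    print(f"+{points} VO2max points: ~{1 - rel_risk:.0%} lower annual all-cause mortality")
```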
Some ongoing efforts to mechanize mathematical taste, described by Adam Marblestone in Automating Math:
Yoshua Bengio, one of the “fathers” of deep learning, thinks we might be able to use information theory to capture something about what makes a mathematical conjecture “interesting.” Part of the idea is that such conjectures compress large amounts of information about the body of mathematical knowledge into a small number of short, compact statements. If AI could optimize for some notion of “explanatory power” (roughly, how vast a range of disparate knowledge can be compressed into a short and simple set of axioms), this could extend the possibilities of AI for creating truly new math and would probably have wide implications beyond that of thinking about human reasoning and what creativity really is.
Others, like Gabriel Poesia at Stanford, are working to create a theorem proving system that doesn’t need to rely on bootstrapping by imitating human proofs. Instead, Poesia’s system, called Peano, has a finite set of possible actions it can take. Peano can recombine these limited available actions to generate and test a variety of theorem proving algorithms and, it is hoped, self-discover math from scratch by learning to identify patterns in its successful solutions. Finally, it can leverage its previous work by turning solutions into reusable higher-level actions called “tactics.” In Poesia’s initial paper, he shows that Peano can learn abstract rules for algebra without being explicitly taught. But there is a trade-off: Because the model does not rely on human proofs, it has to invent more from scratch and may get stuck along the way. While Poesia’s approach might lead to faster learning compared with systems like AlphaProof, it may be handicapped by starting from a more limited baseline. But the verdict is still out as to what is the best balance of these factors.
Meanwhile, the Fields Medalist Timothy Gowers is trying to develop AIs that more closely mimic the ways that human mathematicians go about proving theorems. He’s arguably in a much better position to do that than the average AI researcher given his first-hand familiarity with the process. In other words, Gowers is betting against the current paradigm of throwing huge amounts of compute at a deep learning approach and is instead aiming to use his (and his students’) ability to introspect to hard code certain algorithms into an automatic theorem proving system. In this way, it’s more similar to the previous paradigm of AI development that sought to explicitly mimic human reasoning. Here again success is far from certain, but it is another shot at the goal.
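A toy gesture (mine, not Bengio’s) at the compression framing from earlier in that excerpt: score a short statement by how much it lets you shrink the description of a body of “known facts”. Here the candidate conjecture is commutativity of addition, the corpus is a small addition table, and zlib stands in, very crudely, for description length.

```python
import zlib

# Corpus of "known facts": a small addition table.
facts = [f"{a}+{b}={a + b}" for a in range(60) for b in range(60)]
corpus = "\n".join(facts).encode()

# A short "conjecture" that lets us drop half the table.
rule = b"for all a, b: a + b = b + a"
reduced = "\n".join(f"{a}+{b}={a + b}" for a in range(60) for b in range(60) if a <= b).encode()

cost_without_rule = len(zlib.compress(corpus))
cost_with_rule = len(rule) + len(zlib.compress(reduced))
print(f"description length without the rule: {cost_without_rule} bytes")
print(f"rule + reduced table:                {cost_with_rule} bytes")
# If the second number is smaller, the "conjecture" has (very crude) explanatory power.
```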
I wondered whether Gowers was simply unaware of Sutton’s bitter lesson that
… general methods that leverage computation are ultimately the most effective, and by a large margin. The ultimate reason for this is Moore’s law, or rather its generalization of continued exponentially falling cost per unit of computation. … And the human-knowledge approach tends to complicate methods in ways that make them less suited to taking advantage of general methods leveraging computation.
which seemed unlikely given how polymathic Gowers is — and of course he’s aware:
I have written a 54-page document that explains in considerable detail what the aims and approach of the project will be. … In brief, the approach taken will be what is often referred to as a GOFAI approach… As the phrase “old-fashioned” suggests, GOFAI has fallen out of favour in recent years, and some of the reasons for that are good ones. One reason is that after initial optimism, progress with that approach stalled in many domains of AI. Another is that with the rise of machine learning it has become clear that for many tasks, especially pattern-recognition tasks, it is possible to program a computer to do them very well without having a good understanding of how humans do them. …
However, while machine learning has made huge strides in many domains, it still has several areas of weakness that are very important when one is doing mathematics. Here are a few of them.
In general, tasks that involve reasoning in an essential way.
Learning to do one task and then using that ability to do another.
Learning based on just a small number of examples.
Common sense reasoning.
Anything that involves genuine understanding (even if it may be hard to give a precise definition of what understanding is) as opposed to sophisticated mimicry.
Obviously, researchers in machine learning are working in all these areas, and there may well be progress over the next few years [in fact, there has been progress on some of these difficulties already of which I was unaware — see some of the comments below], but for the time being there are still significant limitations to what machine learning can do. (Two people who have written very interestingly on these limitations are Melanie Mitchell and François Chollet.)
That post was from April 2022, an eternity ago in AI land, and I haven’t seen any updates by him since.
There’s also Eliezer’s Arbital writeup on corporations vs superintelligences.
The short story The Epiphany of Gliese 581 by Fernando Borretti has something of the same vibe as Rajaniemi’s QT trilogy; Borretti describes it as inspired by Orion’s Arm and the works of David Zindell. Here’s a passage describing a flourishing star system already transformed by weakly posthuman tech:
The world outside Susa was a lenticular cloud of millions of lights, a galaxy in miniature, each a world unto itself. There were clusters of green lights that were comets overgrown with vacuum trees, and plant and animal and human life no Linnaeus would recognize. There were points of dull red light, the reversible computers where bodyless people lived. And there were arcs of blue that were ring habitats: ribbons tied end-to-end, holding concave ocean, and the oceans held continents, islands, mountain ranges, rivers, forests and buried ruins, endless forms of life, cities made of glass, paradise regained. All this had been inanimate dust and cratered wasteland, which human hands had made into an oasis in the sky, where quadrillions live who will never die.
The posthumans who live there called it Ctesiphon. And at times they call it paradise, after the Persian word for garden.
And at the center of the oasis there was a star that travelled backwards across the H-R diagram: already one one-hundredth of it had been whittled away; made into a necklace of artificial gas giants in preparation for the end of time; or sent through reactors where disembodied chemists made protons into carbon, oxygen, lithium and sodium, the vital construction material. And in time nothing would be left but a dim red ember encircled by cryojovian fuel depots. And the habitats would be illuminated by electric diodes.
Another star system, this time still being transformed:
Wepwawet was a dull red star, ringed by water droplets the size of mountains, where some two hundred billion people lived who breathed water. There was a planet made of stone shrouded in steam, and a train of comets, aimed by human hands from beyond the frostline, delivered constant injections of water. When the vapour condensed there would be ocean, and the shapers would get to work on the continents. Other Earths like this had been cast, like seeds, across the entire breadth of the cosmos.
The system was underpopulated: resources were abundant and people were few, and they could bask in the sun and, for a time, ignore the prophecies of Malthus, whose successors know in time there won’t be suns.
This was the first any of them had seen of nature. Not the landscaped, continent-sized gardens of Ctesiphon, where every stone had been set purposefully and after an aesthetic standard, but nature before human hands had redeemed it: an endless, sterile wasteland. The sight of scalding, airless rocks disturbed them.
Thanks, I especially appreciate that NNs-playing-Hex paper; Figure 8 in particular amazes me by illustrating how much more quickly performance sigmoids with test-time compute than I anticipated, even after reading your comment. I’m guessing https://www.gwern.net/ has papers with the analogue of Fig 8 for smarter models, in which case it’s time to go rummaging around…
Matt Leifer, who works in quantum foundations, espouses a view that’s probably more extreme than Eric Raymond’s above: that the effectiveness of math in the natural sciences isn’t just reasonable but expected-by-construction. He argues for this in his 2015 FQXi essay Mathematics is Physics.
(Matt notes as an aside that he’s arguing for precisely the opposite of Tegmark’s MUH.)
Why “scale-free network”?
As an aside, Matt’s theory of theory-building explains (so he claims) what mathematical intuition is about: “intuition for efficient knowledge structure, rather than intuition about an abstract mathematical world”.
So what? How does this view pay rent?
In Against Fundamentalism, another FQXi essay published in 2018, Matt further develops the argument that because the structure of human knowledge is networked rather than hierarchical, the idea that there is a most fundamental discipline, or level of reality, is mistaken.