Self modelling in NN (https://arxiv.org/pdf/2407.10188). Is this good news for mech interpretability? If the model makes itself easily predictable, then that really seems to limit the possibilities for deceptive alignment.
It makes it easier, but consider this: The human brain also does this—when we conform to expectations, we make ourselves more predictable and model ourselves. But this also doesn’t prevent deception. People still lie and some of the deception is pushed into the subconscious.
Sure, it doesn’t prevent a deceptive model from being made, but if AI engineers built NNs with such self-awareness at all levels from the ground up, that wouldn’t happen in their models. The encouraging thing, if it holds up, is that there is little to no “alignment tax” for making the models understandable—they are also better.
Indeed, engineering readability at multiple levels may solve this.
Putting down a prediction I have had for quite some time. The current LLM/Transformer architecture will stagnate before AGI/TAI (that is, the ability to do any cognitive task as effectively as, and more cheaply than, a human).
From what I have seen, Tesla Autopilot learns >10,000× more slowly than a human, data-wise.
We will get AGI by copying nature, at the scale of a simple mammal brain, then scaling up, like this kind of project:
https://x.com/Andrew_C_Payne/status/1863957226010144791
https://e11.bio/news/roadmap
I expect AGI to arrive 0-2 years after a mammal brain is mapped. I consider such a connectome project far more cost-effective per $ than large training runs or building a 1GW data center etc., if your goal is to achieve AGI.
That is TAI by about 2032, assuming 5 years to scan a mammal brain. In this case there could be a few years when Moore’s law has effectively stopped, larger data centers are not being built and it is not clear where progress will come from.
I do think there are going to be significant AI capability advances from improved understanding of how mammal and bird brains work.
I disagree that more complete scanning of mammalian brains is the bottleneck. I think we actually know enough about mammalian brains and the features which are invariant across members of a species. I think the bottlenecks are:
1. Understanding the information we do have (scattered across tens of thousands of research papers).
2. Building compute-efficient emulations which accurately reproduce the critical details while abstracting away the unimportant details. Since our limited understanding can’t give certain answers about which details are key, this probably involves quite a bit of parallelizable, brute-forceable empirical research.
I think current LLMs can absolutely scale fast enough to be very helpful with these two tasks. So if something still seems to be missing from LLMs after the next scale-up in 2025, I expect hunting for further inspiration from the brain will seem tempting and tractable.
Thus, I think we are well on track for AGI by 2026-2028 even if LLMs don’t continue scaling.
Perhaps LLMs will help with that. The reasons I think that is less likely are:
1. DeepMind etc. is already heavily across biology, from what I gather from interviews with Demis. If the knowledge was already there, there’s a good chance they would have found it.
2. It’s something specific we are after, not many small improvements, i.e. the neural code. Specifically, back propagation is not how neurons learn. I’m pretty sure how they actually do learn is not in the literature. Attempts have been made, such as the forward-forward algorithm by Hinton, but that didn’t come to anything as far as I can tell. I haven’t seen any suggestion that, even with a lot of detail on biology, we know what it is, i.e. can a very detailed neural sim with extreme processing power learn as data-efficiently as biology?
If progress must come from a large jump rather than small steps, then LLMs have quite a long way to go, i.e. LLMs need to speed up the generation of ideas as novel as the forward-forward algorithm to help much. If they are still below that threshold in 2026, then those possible insights are still almost entirely done by people.
Even the smartest minds in the past have been beaten by copying biology in AI. The idea for neural nets came from copying biology. (Though the transformer arch and back prop didn’t)
DeepMind etc. is already heavily across biology, from what I gather from interviews with Demis. If the knowledge was already there, there’s a good chance they would have found it.
I’ve heard this viewpoint expressed before, and find it extremely confusing. I’ve been studying neuroscience and its implications for AI for twenty years now. I’ve read thousands of papers, including most of what DeepMind has produced. There are still so many untested ideas, because biology and the brain are so complex. Also because people tend to flock to popular paradigms, rehashing old ideas rather than testing new ones.
I’m not saying I know where the good ideas are, just that I perceive the explored portions of the Pareto frontier of plausible experiments to be extremely ragged. There are tons of places covered by “Fog of War” where good ideas could be hiding.
DeepMind employs a tiny fraction of the scientists in the world who have been working on understanding and emulating the brain. Not all the scientists in the world have managed to test all the reasonable ideas, much less DeepMind alone.
Saying DeepMind has explored the implications of biology for AI is like saying that the Opportunity Rover has explored Mars. Yes, this is absolutely true, but the unexplored area vastly outweighs the explored area. If you think the statement implies “explored ALL of Mars” then you have a very inaccurate picture in mind.
OK fair point. If we are going to use analogies, then my point #2 about a specific neural code shows our different positions I think.
Let’s say we are trying to get a simple aircraft off the ground and we have detailed instructions for a large passenger jet. Our problem is that the metal is too weak and cannot be used to make wings, engines etc. In that case detailed plans for aircraft are no use; a single-minded focus on getting better metal is what it’s all about. To me the neural code is like the metal and all the neuroscience is like the plane schematics. Note that I am wary of analogies—you obviously don’t see things like that or you wouldn’t have the position you do. Analogies can explain, but rarely persuade.
A more single-minded focus on the neural code would be trying to watch neural connections form in real time while learning is happening. Fixed connectome scans of, say, mice can somewhat help with that; more direct control of DishBrain or watching the zebrafish brain would also count, though the details of neural biology that are specific to higher mammals would be ignored.
It’s also possible that there is a hybrid process, that is, the AI looks at all the ideas in the literature, then suggests bio experiments to get things over the line.
Can you explain more about why you think [AGI requires] a shared feature of mammals and not, say, humans or other particular species?
I think it is clear that if, say, you had a complete connectome scan and knew everything about how a chimp brain worked, you could scale it easily to get human+ intelligence. There are no major differences. A small mammal is my best guess; mammals/birds seem to be able to learn better than, say, lizards. Specifically the https://en.wikipedia.org/wiki/Cortical_column is important to understand; once you fully understand one, stacking them will scale at least somewhat well.
Going to smaller scales/numbers of neurons, it may not need to be as much as a mammal (https://cosmosmagazine.com/technology/dishbrain-pong-brain-on-chip-startup/); perhaps we can learn enough of the secrets here? I expect not, but am only weakly confident.
Going even simpler, we have the connectome scan of a fly now (https://flyconnecto.me/), and that hasn’t led to major AI advances. So it’s somewhere between fly and chimp, I’d guess mouse, that gives us the missing insight to get TAI.
Brilliant Pebbles?
See here and here
This idea has come back up, and it could be feasible this time around because of the high launch capability and total reusability of SpaceX’s Starship. The idea is a large constellation (~30,000?) of low Earth orbit satellites that intercept nuclear launches in their boost phase, where the missiles are much slower and more vulnerable to interception. The challenge of course is that you need enough satellites overhead at all times to intercept the entire arsenal of a major power if they launch all at once.
There are obvious positives and risks with this
The main positive is it removes the chance of a catastrophic nuclear war.
Negatives are potentially destabilizing the MAD status quo in the short term, and new risks such as orbital war etc.
Trying to decide if it makes nuclear war more or less likely
This firstly depends on your nuclear war yearly base rate, and projected rate into the foreseeable future.
If you think nuclear war is very unlikely then it is probably not rational to disturb the status quo, and you would reject anything potentially destabilizing like this.
However if you think that we are simply lucky and there was >50% chance of nuclear war in the last 50 years (we are on a “surviving world” from MWI etc), and that while the chance may currently be low, it will go up a lot again soon, then the “Pebbles” idea is worth considering even if you think it is dangerous and destabilizing in the short term. Say it directly causes a 5% chance of war while it is set up, but there is a >20% chance of war in the next 20 years without it, which it stops. In this case you could decide it is worth it, as paying 5% to remove 20%.
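A minimal sketch of that trade-off arithmetic; the 5% and 20% figures are the illustrative numbers from the paragraph above, not real estimates.

```python
# Illustrative comparison of the two choices, using the hypothetical
# numbers from the paragraph above (not real estimates).
p_war_without_system = 0.20   # assumed chance of nuclear war over the next 20 years, status quo
p_war_caused_by_setup = 0.05  # assumed chance that deploying the system itself triggers a war

# If the deployed system removes the status-quo risk entirely, the only
# remaining risk is the one created during setup.
p_war_with_system = p_war_caused_by_setup

print(f"P(war) without the system: {p_war_without_system:.0%}")
print(f"P(war) with the system:    {p_war_with_system:.0%}")
print(f"Net risk change:           {p_war_with_system - p_war_without_system:+.0%}")
```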
Practical considerations
How do you set up the system to reduce the risk of it causing a nuclear war as it is set up?
Ideally you would set up the whole system in stealth before anyone was even aware of it. Disguising 30K “Pebbles” satellites as Starlink ones seems at the bounds of credibility.
You would also need to do this without properly testing a single interceptor as such a test would likely be seen and tip other nations off.
New risks
A country with such a system would have a power never seen before over others.
For example the same interceptors could shoot down civilian aircraft, shipping etc., totally crippling every other country without loss of life to the attacker.
The most desirable outcome:
1. A country develops the system
2. It uses whatever means to eliminate other countries’ nuclear arsenals
3. It disestablishes its own nuke arsenal
4. The Pebbles system is then greatly thinned out, as there is now no need to intercept ~1,000 simultaneous launches.
5. There is no catastrophic nuclear threat anymore, and no other major downside.
My guess as to how other countries would respond if it was actually deployed by the USA
Russia
I don’t think it would try to compete, nor be at all likely to launch a first strike in response to the USA building a system, especially if the Ukraine war has calmed down. I don’t think it regards itself as a serious worldwide power anymore, in spite of talk. Russia also isn’t worried about a first strike from the USA, as it knows this is incredibly unlikely.
China
China has 200-350 nukes vs the USA’s ~5,000.
I think China would be very upset, both because it would take fewer “Pebbles” to intercept its arsenal and because it does see itself as a major and rising world power. I also don’t think it would launch a first strike, nor could it credibly threaten to do so, and this would make it furious.
After this however it is hard to tell, e.g. China could try to destroy the satellite interceptors, say by creating a debris cloud that would take many out in the lower orbits, or build thousands of much smaller decoy missiles with no nuclear warhead that would be hard to tell apart. Such decoys could be incapable of actually making it to orbit but still be effective. To actually reduce the nuclear threat, the USA would have to commit to actually destroying China’s nukes on the ground. This is a pretty extreme measure, and it’s easy to imagine the USA not actually doing this, leaving an unstable result.
Summary
It’s unclear to me, all things considered, whether attempting to deploy such a system would make things safer or riskier in total over the long term with regard to nuclear war.
I think the part where other nations just roll with this is underexplained.
Yes for sure. I don’t know how it would play out, and am skeptical anyone could. We can guess scenarios.
1. The most easily imagined one is the Pebbles owner staying in their comfort zone and not enforcing step 2 (eliminating other arsenals) at all. Something similar already happened—the USA got nukes first and let others catch up. In this case threatened nations try all sorts of things (political, commercial/trade, space war, arms race) but don’t actually start a hot conflict. The Pebbles owner is left not knowing whether their system is still effective, and neither do the threatened countries—an unstable situation.
2. The threatened nation tries to destroy the Pebbles with non-nuke means. If this was Russia, the USA maybe could regenerate the system faster than Russia could destroy satellites. If it’s China, then let’s say it’s not. The USA then needs to decide whether to strike the anti-satellite ground infrastructure to keep its system...
3. A threatened nation such as NK just refuses to give up nukes—in this case I can see the USA destroying it.
4. Say India or Israel refuses to give up its arsenal—I have no idea what would happen then.
It would certainly be nice if we could agree to all put up a ton of satellites that intercept anyone’s nuclear missiles (perhaps under the control of an international body), gradually lowering the risk across the board without massively advantaging any country. But I think it would be impossible to coordinate on this.
Types of takeoff
When I first heard and thought about AI takeoff, I found the argument convincing that as soon as an AI passed IQ 100, takeoff would become hyper-exponentially fast. Progress would speed up, which would then compound on itself etc. However there are other possibilities.
AGI is a barrier that requires >200 IQ to pass unless we copy biology?
Progress could be discontinuous; there could be IQ thresholds required to unlock better methods or architectures. Say we fixed our current compute capability; with fixed human intelligence we may not be able to figure out the formula for AGI, in a similar way that the combined human intelligence hasn’t cracked many hard problems even with decades and the world’s smartest minds working on them (maths problems, quantum gravity...). This may seem unlikely for AI, but to illustrate the principle, say we only allowed IQ<90 people to work on AI. Progress would stall. So IQ<90 software developers couldn’t unlock IQ>90 AI. Can IQ 160 developers with our current compute hardware unlock >160 AI?
To me the reason we don’t have AI now is that the architecture is very data inefficient and worse at generalization than say the mammalian brain, for example a cortical column. I expect that if we knew the neural code and could copy it, then we would get at least to very high human intelligence quickly as we have the compute.
From watching AI over my career, it seems that even the highest-IQ people and groups can’t make progress by themselves without data, compute and biology to copy for guidance, in contrast to other fields. For example Einstein predicted gravitational waves long before they were discovered, but Turing or Von Neumann didn’t publish the Transformer architecture or suggest backpropagation. If we did not have access to neural tissue, would we still not have artificial NNs? On a related note, I think there is an XKCD cartoon that says something like the brain has to be so complex that it cannot understand itself.
(I believe now that progress in theoretical physics and pure maths is slowing to a stall as further progress requires intellectual capacity beyond the combined ability of humanity. Without AI there will be no major advances in physics anymore even with ~100 years spent on it.)
After AGI is there another threshold?
Let’s say we do copy biology/solve AGI, and with our current hardware can get >10,000 AGI agents with IQ >= that of the smartest humans. They then optimize the code so there are 100K agents with the same resources, but then optimization stalls. The AI wouldn’t know if it was because it had optimized as much as possible, or because it lacked the ability to find a better optimization.
Does our current system scale to AGI with 1GW/1 million GPU?
Let’s say we don’t copy biology, but scaling our current systems to 1GW/1 million GPUs and optimizing for a few years gets us to IQ 160 at all tasks. We would have an inferior architecture compensated for by a massive increase in energy/FLOPS compared to the human brain. Progress could theoretically stall at upper-level human IQ for a time rather than take off. (I think this isn’t very likely however.) There would of course be a significant overhang where capabilities would increase suddenly when the better architecture was found and applied to the data center hosting the AI.
Related note—why 1GW data centers won’t be a consistent requirement for AI leadership.
Based on this, a 1GW or similar data center isn’t useful or necessary for long. If it doesn’t give a significant increase in capabilities, then it won’t be cost effective. If it does, then the system would optimize itself so that such power isn’t needed anymore. Only across a small range of capability increase does it actually stay relevant.
To me the merits of the Pause movement and training compute caps are not clear. Someone here made the case that compute caps could actually speed up AGI, as people would then pay more attention to finding better architectures rather than throwing resources into scaling existing inferior ones. However, all things considered, I can see a lot of downsides from large data centers and little upside. I see a specific possibility where they are built, don’t deliver the economic justification, decrease in value a lot, then are sold to owners that are not into cutting-edge AI. Then when the more efficient architecture is discovered, they are suddenly very powerful without preparation. Worldwide caps on total GPU production would also help reduce similar overhang possibilities.
Grothendieck and von Neumann were built using evolution, not deep basic science or even engineering. So in principle all that’s necessary is compute, tinkering, and evals; everything else is about shortening timelines and reducing requisite compute.
Any form of fully autonomous industry lets compute grow very quickly, in a way not constrained by human population, and only requires AI with ordinary engineering capabilities. Fusion and macroscopic biotech[1] (or nanotech) potentially get compute to grow much faster than that. To the extent human civilization would hypothetically get there in 100-1000 years without general AI, serial speedup alone might be able to get such tech via general AIs within years, even without superintelligence.
Drosophila biomass doubles every 3 days. Small things can quickly assemble into large things, transforming through metamorphosis. This is proven technology, doesn’t depend on untested ideas about what is possible like nanotech does. Industry and compute that double every 3 days can quickly eat the Solar System.
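As a rough sense of scale, a back-of-the-envelope sketch: the 3-day doubling time comes from the comment above, while the 1 kg seed mass and the ~2.7e27 kg combined planetary mass are my own assumptions.

```python
import math

doubling_time_days = 3        # Drosophila-like biomass doubling, from the comment above
seed_mass_kg = 1.0            # assumed starting mass of the self-replicating industry
planetary_mass_kg = 2.7e27    # rough combined mass of the Solar System's planets (assumption)

doublings = math.log2(planetary_mass_kg / seed_mass_kg)
print(f"{doublings:.0f} doublings, about {doublings * doubling_time_days:.0f} days")
# ~91 doublings, i.e. on the order of nine months at an uninterrupted 3-day doubling time.
```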
Yes, the human brain was built using evolution; I have no disagreement that, given 100-1000 years with just tinkering etc., we would likely get AGI. It’s just that in our specific case we have bio to copy, and it will get us there much faster.
Evolution is an argument that there is no barrier, even with very incompetent tinkerers that fail to figure things out (and don’t consider copying biology). So it doesn’t take an arbitrarily long time, and takes less with enough compute[1]. The 100-1000 years figure was about the fusion and macroscopic biotech milestone in the hypothetical of no general AI, which with general AI running at a higher speed becomes 0.1-10 years.
Temporarily adopting this sort of model of “AI capabilities are useful compared to human IQs”:
With IQ 100 AGI (i.e. could do about the same fraction of tasks as well as a sample of IQ 100 humans), progress may well be hyper exponentially fast: but the lead-in to a hyper-exponentially fast function could be very, very slow. The majority of even relatively incompetent humans in technical fields like AI development have greater than IQ 100. Eventually quantity may have a quality of its own, e.g. after there were very large numbers of these sub-par researcher equivalents running at faster than human and coordinated better than I would expect average humans to be.
Absent enormous numerical or speed advantages, I wouldn’t expect substantial changes in research speed until something vaguely equivalent to IQ 160 or so.
Though in practice, I’m not sure that human measures of IQ are usefully applicable to estimating rates of AI-assisted research. They are not human, and only hindsight could tell what capabilities turn out to be the most useful to advancing research. A narrow tool along the lines of AlphaFold could turn out to be radically important to research rate without having anything that you could characterize as IQ. On the other hand, it may turn out that exceeding human research capabilities isn’t practically possible from any system pretrained on material steeped in existing human paradigms and ontology.
Perhaps thinking about IQ conflates two things: correctness and speed. For individual humans, these seem correlated: people with higher IQ are usually able to get more correct results, more quickly.
But it becomes relevant when talking about groups of people: Whether a group of average people is better than a genius, depends on the nature of the task. The genius will be better at doing novel research. The group of normies will be better at doing lots of trivial paperwork.
Currently, the AIs seem comparable to having an army of normies on steroids.
The performance of a group of normies (literal or metaphorical) can sometimes be improved by error checking. For example, if you have them solve mathematical problems, they will probably make a lot of errors; adding more normies would allow you to solve more problems, but the fraction of correct solutions would remain the same. But if you give them instructions on how to verify the solutions, you could increase the correctness (at the cost of slowing them down somewhat). Similarly, an LLM can give me hallucinated solutions to math / programming problems, but that is less of a concern if I can verify the solutions in Lean / using unit tests, and reject the incorrect ones; and who knows, maybe trying again will result in a better solution. (In a hypothetical extreme case, an army of monkeys with typewriters could produce Shakespeare, if we had a 100% reliable automatic verifier of their outputs.)
So it seems to me, the question is how much we can compensate for the errors caused by “lower IQ”. Depending on the answer, that’s how long we have to wait until the AIs become that intelligent.
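A minimal sketch of that generate-and-verify pattern: the `propose_solution` stub is purely hypothetical and stands in for an unreliable LLM call, while `verify` plays the role of the unit tests or Lean checker.

```python
import random

def propose_solution(problem: str) -> str:
    """Stand-in for an unreliable generator (e.g. an LLM, hypothetical here);
    it just guesses candidate formulas for the problem."""
    return random.choice(["n * (n + 1) // 2", "n ** 2", "2 * n - 1"])

def verify(candidate: str) -> bool:
    """Cheap, reliable check against known cases, playing the role of
    unit tests or a proof checker."""
    return all(eval(candidate, {"n": n}) == sum(range(1, n + 1)) for n in range(1, 20))

def solve(problem: str, attempts: int = 50):
    """Sample unreliable answers and keep the first one that passes
    verification: errors are filtered out rather than prevented."""
    for _ in range(attempts):
        candidate = propose_solution(problem)
        if verify(candidate):
            return candidate
    return None

print(solve("closed form for 1 + 2 + ... + n"))
```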
Technology is about making boring stuff non-conscious. Beginning from basic physical movement such as making a wheel go round, to arithmetic and now code snippets that are so commonly used they shouldn’t require re-thinking. This is a reason why AI art upsets people—we actually want that to be the result of a conscious process. If you make boring stuff that creates power or wealth non-conscious then everyone is happier. Meat production would be much better if it was non-conscious. The more AI is non-conscious for a given level of capability, the better off we are.
This is a reason why AI art upsets people—we actually want that to be the result of a conscious process.
I agree with the general argument that making boring stuff non-conscious is a good thing. But in the case of art, I think the underlying problem is that people want art to be high-status.
From my perspective, the process of creating a piece of art has many steps, and some of them can legitimately be called boring. The line is not clear—the same step can be interesting when you do it for the first time, and boring when you later repeat it over and over again; or interesting when you introduce some unexpected change on purpose, and boring when you just want to do the usual. So we could use the AI to automate the steps that are no longer interesting for us, and focus on the rest. (Though some people will simply click “generate everything”.)
Consider how much time the painters living centuries ago spent preparing their colors, and learning how to prepare those colors well—and today, painters can simply buy the colors in a supermarket. But (as far as I know) no one cries that selling colors in supermarkets has ruined the art. That’s because colors are not considered mysterious anymore, and therefore they are not high-status, so no one cares whether we automate this step.
Now imagine a hypothetical painting tool that automatically fixes all your mistakes at perspective, and does nothing else. You keep painting, and whenever you complete a shape, it is magically rotated and skewed to make the perspective consistent with the rest of the picture. (Unless you want to have the shape different on purpose; in such case the tool magically understands this and leaves the shape alone. Or you simply press the “undo” button.) This would be somewhat controversial. Some people would be okay with it. There might already be a plugin in some vector editor that helps you achieve this; and if that fact becomes known, most people won’t care.
But some people would grumble that if you can’t get the perspective right on your own, perhaps you don’t deserve to be a painter! I find this intuition stronger when I imagine a literally magical tool that transforms a physical painting this way (only fixes the perspective, nothing else). The painter who uses vector graphics at least needs some computer skills to compensate for being bad at perspective, but having the perspective fixed literally auto-magically is just plain cheating.
Which suggests that an important part of our feelings about art is judging the artist’s talent and effort; assigning status to the artist… but also to ourselves as connoisseurs of the art! Some people derive a lot of pleasure from feeling superior to those who have less knowledge about art. And this is the part that might go away with AI art. (Unless we start discussing the best prompts and hyperparameters instead.)
Thanks, good detail. I am not good at traditional art, but I am interested in using maths to create a shape that is almost impossible for a traditional sculptor to create then 3d printing it.
A default position is that exponentially more processing power is needed for a constant increase in intelligence. To start, let’s assume a guided/intuition + search model for intelligence. That is, like Chess or Go, where you have an evaluation module and a search module. In simple situations an exponential increase in processing power usually gives a linear increase in lookahead ability and in rating/Elo in games measured that way.
However does this match reality?
What if the longer the time horizon, the bigger the board became, or the more complexity was introduced? For board games there is usually a constant number of possibilities to search at every ply of lookahead depth. However I think in reality you can argue the search space should increase with time or lookahead steps. That is, as you look further ahead, possibilities you didn’t have to consider before now enter the search.
For a real world example consider predicting the price of a house. As the timeframe goes from <5 years to >5 years, then there are new factors to consider e.g. changing govt policy, unexpected changes in transport patterns, (new rail nearby or in competing suburb etc), demographic changes.
In situations like these, the processing required for a constant increase in ability could go up faster than exponentially. For example, looking 2 steps ahead requires 2 possibilities at each step, that is 2^2, but if it’s 4 steps ahead, then maybe the cost is now 3^4, as there are 3 vs 2 things that can affect the result over 4 steps.
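A small sketch of the difference; the growth rule used here (start with 2 options per step and add one more option every two steps of lookahead) is just an illustrative assumption.

```python
def nodes_fixed(branching: int, depth: int) -> int:
    """Search cost when the number of options per step stays constant."""
    return branching ** depth

def nodes_growing(depth: int) -> int:
    """Search cost when new considerations enter as you look further ahead:
    assume (purely for illustration) the branching factor starts at 2 and
    gains one extra option every two steps of lookahead."""
    cost = 1
    for step in range(1, depth + 1):
        cost *= 2 + (step - 1) // 2
    return cost

for depth in (2, 4, 8, 16):
    print(depth, nodes_fixed(2, depth), nodes_growing(depth))
# With a fixed branching factor, each extra step multiplies the cost by a
# constant; with a growing branching factor the multiplier itself rises, so
# the compute needed per extra step of lookahead grows faster than exponentially.
```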
How does this affect engineering of new systems
If this applies to engineering, then actual physical data will be very valuable for shrinking the search space. (Well, that applies if it just goes up exponentially as well.) That is, if you can measure the desired situation or new device state at step 10 of a 20-stage process, then you can hugely reduce the search space, as you can eliminate many possibilities. Zero-shot is hard unless you can really keep the system in situations where there are no additional effects coming in.
AI models, regulations, deployments, expectations
For a simple evaluation/search model of intelligence, with just one model being used for the evaluation, improvements can be made by continually improving the evaluation model (same size, better performance; or same performance, smaller size). Models that produce fewer bad “candidate ideas” can be chosen, with the search itself providing feedback on which ideas had potential. In this model there is no take-off or overhang to speak of.
However I expect a TAI system to be more complicated.
I can imagine an overseer model that decides which more specialist models to use. There is a difficulty knowing what model/field of expertise to use for a given goal. Existing regulations don’t really cover these systems; the setup where you train a model, fine-tune, test, then release doesn’t strictly apply here. You release a set of models, and they continually improve themselves. This is a lot more like people, where you continually learn.
Overhang
In this situation you get take-off or overhang when a new model architecture is introduced, rather than the steady improvement from deployed systems of models. It’s clear to me that the current model architectures, and hence scaling laws, are not near the theoretical maximum. For example, the training data needed for Tesla Autopilot is ~10,000× more than what a human needs, and it is still not superhuman. In terms of risk, it’s new model architectures (and evidence of very different scaling laws) rather than training FLOPS that would matter.
I think an often overlooked facet of this is that high fluid intelligence leads to higher crystallized intelligence.
I.e., the more and better you think, the more and better crystallized algorithms you can learn, and, unlike short-term benefits of fluid intelligence, long-term benefits of crystallized intelligence are compounding.
To find a new, better strategy linearly faster, you need an exponential increase in processing power, but each found and memorized strategy saves you an exponential expenditure of processing power in the future.
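A toy illustration of that asymmetry, with memoization standing in for “crystallized”, reusable strategies (this is only an analogy, not a model of intelligence):

```python
from functools import lru_cache
import time

def fib_search(n: int) -> int:
    """Re-derive everything from scratch each time: exponential cost."""
    return n if n < 2 else fib_search(n - 1) + fib_search(n - 2)

@lru_cache(maxsize=None)
def fib_crystallized(n: int) -> int:
    """Same recurrence, but each solved subproblem is memorized and reused."""
    return n if n < 2 else fib_crystallized(n - 1) + fib_crystallized(n - 2)

for f in (fib_search, fib_crystallized):
    start = time.perf_counter()
    f(30)
    print(f.__name__, f"{time.perf_counter() - start:.4f}s")
```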
TLDR: Thinking about the Busy Beaver numbers has led me to believe that just because a theorem holds true for a massive number of evaluated examples, this is only weak evidence that it is actually true. Can we go meta on this?
“Li(x) overestimates the number of primes below x more often than not, especially as x grows large. However, there are known low-lying values (like around x=10^316, discovered by Littlewood) where π(x) exceeds Li(x), contradicting the general trend.”
This got me thinking about how common this kind of thing is and why? Why does a formula hold all the way up to 10^316 but then fail?
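A quick numerical illustration of the Li(x) example using SymPy (`primepi` is π(x), `li` is the logarithmic integral): for every x small enough to check directly, the estimate stays above the true count, which is exactly why naive checking misleads here.

```python
from sympy import li, primepi

# For every x that has ever been checked directly, li(x) > pi(x), yet
# Littlewood proved the difference changes sign infinitely often, with the
# first crossing currently expected somewhere near x ~ 10^316.
for x in (10**2, 10**4, 10**6, 10**7):
    print(f"x = {x:>8}: pi(x) = {int(primepi(x)):>7}, li(x) = {float(li(x)):.1f}")
```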
The essence of Busy Beaver numbers is that there are sequences based on a simple formula/data that go on for a very long time and then just stop unpredictably. You can imagine replacing the simple formula with a simple theorem that appears to be true. Instead of actually being true, it is instead a way of encoding its very large counterexample in a small amount of data.
If you think of it this way, a theorem that appears to be true and has been evaluated over trillions of numbers is also a candidate for encoding an exception at some very large number. In other words, trillions of correct examples are only weak evidence of its correctness.
How much should we weight evaluation? We can’t evaluate to infinity, and it’s obvious that a theorem being true up to 2 million is not 2× the evidence of it being true up to 1 million. Should we choose log(n)? A clear scale is the BB numbers themselves, e.g. if your theorem is true up to BB(5) then that is 5 data points, rather than 47 million. Unlimited evaluation can never get to BB(6), so that is the limit of evidence from evaluation. (i.e. 5-6 evidence points, with it being unclear how to weigh theory: https://www.lesswrong.com/posts/MwQRucYo6BZZwjKE7/einstein-s-arrogance)
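A small sketch of that BB-scale evidence measure, using the known values of the maximum-shift Busy Beaver function (the 47 million figure above is S(5); the values below are the settled ones):

```python
# Known values of the maximum-shift Busy Beaver function S(n).
# S(6) is already astronomically beyond any direct evaluation.
BB = {1: 1, 2: 6, 3: 21, 4: 107, 5: 47_176_870}

def bb_evidence_points(checked_up_to: int) -> int:
    """Evidence on the BB scale: the largest n with BB(n) <= the bound
    the conjecture has been verified up to."""
    return max((n for n, v in BB.items() if v <= checked_up_to), default=0)

for bound in (10**3, 10**6, 10**9, 10**12):
    print(f"verified up to {bound:.0e}: {bb_evidence_points(bound)} BB evidence points")
# Trillions of checked cases still only amount to 5 points on this scale,
# and 6 points can never be reached by direct evaluation.
```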
Now can we go meta?
Is some maths so much more powerful than other maths that it carries correspondingly greater weight, in the way formal proof outweighs evaluation? Certainly some maths is more general than others. How does this affect well-known problems such as the Riemann Hypothesis—proving or disproving it affects a lot of maths. Showing it is correct for the first trillion zeros, however, is little evidence.
“Most mathematicians tend to believe that the Riemann Hypothesis is true, based on the weight of numerical evidence and its deep integration into existing mathematical frameworks.”
Is “deep integration” actually that deep, or is it the symbolic equivalent of evaluating up to 1 million? Perhaps, just as you can find countless evaluated examples supporting a false theorem, you can find much “deep integration” in favor of a famous theorem that could also be incorrect.
Further thoughts and links
Most people think P != NP, but what if P = NP with an algorithm whose constants are ~BB(10)?
Conservation of energy is a more general rule that rules out perpetual motion machines; the 2nd law of thermodynamics likewise. HOWEVER, that law must have been broken somehow to get a low-entropy initial state for the Big Bang.
AI examples
1. The Polya Conjecture. Proposed by George Pólya in 1919, this conjecture relates to the distribution of prime numbers. It posited that for any number x, the majority of the numbers less than x have an odd number of prime factors. It was verified for numbers up to 1,500,000, but a counterexample was found when x was around 906 million. This shows a fascinating case where numerical verification up to a large number was still not sufficient.
2. The Mertens Conjecture. The Mertens conjecture suggested that the absolute value of the Mertens function M(x) is always less than sqrt(x). This was proven false by Andrew Odlyzko and Herman te Riele in 1985 using computational means; any counterexample lies above 10^14.
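A sketch of the kind of numerical check that made the Pólya conjecture look safe, using the Liouville function λ(n) = (-1)^Ω(n) via SymPy:

```python
from sympy import factorint

def liouville(n: int) -> int:
    """Liouville lambda(n) = (-1)**Omega(n), where Omega counts prime
    factors with multiplicity."""
    return -1 if sum(factorint(n).values()) % 2 else 1

# Polya's conjecture says this running sum stays <= 0 for every n >= 2,
# i.e. at least half the integers up to n have an odd number of prime factors.
running_sum = 0
violations = []
for n in range(1, 20_000):
    running_sum += liouville(n)
    if n >= 2 and running_sum > 0:
        violations.append(n)

print("violations below 20,000:", violations)
# None appear here (nor in the historical checks up to ~1,500,000), yet the
# smallest counterexample turns out to be n = 906,150,257.
```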
Unlimited evaluation can never get to BB(6) so that is the limit of evidence from evaluation.
The value of BB(6) is not currently known, but it could in principle be discovered. There is no general algorithm for calculating BB numbers, but any particular BB(n) could be determined by enumerating all n-state Turing machines and proving whether each one halts.
According to Scott, “Pavel Kropitz discovered, a couple years ago, that BB(6) is at least 10^10^10^10^10^10^10^10^10^10^10^10^10^10^10 (i.e., 10 raised to itself 15 times).”
So we can never evaluate BB(6), as it is at least this large.
Could this be cheaper than chips in an extreme silicon shortage? How did it learn? Can we map connections forming and make better learning algorithms?
Birds vs ants/bees.
A flock of birds can be dumber than the dumbest individual bird; a colony of bees/ants can be smarter than the individual, and smarter than a flock of birds! Birds avoiding a predator in a geometrical pattern—no intelligence, as predictability like a fluid involves no processing. Vs bees swarming the scout hornet, or ants building a bridge etc. Even though there is no planning in ants; likewise no overall plan in individual neurons?
The more complex the pieces, the less well they fit together. Less intelligent units can form a better collective in this instance. Not like human orgs.
Progression from simple cell to mitochondria—mitochondria have no say anymore but fit in perfectly. Multi-organism units like hives are the next level up—simpler creatures can have more cohesion at the upper level. Humans have more effective institutions in spite of complexity because of consciousness, language etc.
RISC vs CISC, Intel vs NVIDIA, GPUs for supercomputers. I thought about this years ago; it led to a prediction that Intel or other CISC-heavy businesses would lose to cheaper alternatives.
Time to communicate a positive singularity/utopia
Spheres of influence, like we already have: uncontacted tribes, the Amish etc. Taking that further, super AI must leave Earth, perhaps the solar system; enhanced people move out of the Earth ecosystem too, to space colonies, or Mars etc.
Take the best/happy nature to expand, don’t take suffering to >million stars.
Humans can’t do interstellar travel faster than AI anyway; even if that was the goal, AI would have to prepare it first, and it can travel faster. So there is no question that the majority of humanity’s interstellar expansion is AI. Need to keep Earth for people. What is max CEV? Well, keep the Earth ecosystem, and humans can progress and discover on their own?
Is the progression to go outwards: human, posthuman/Neuralink, WBE? It is in some sci-fi: Peter Hamilton / the Culture (human to WBE).
Long term, all moral systems don’t know what to say on pleasure vs self-determination/achievement. Eventually we run out of things to invent—should it go asymptotically slower?
Explorers should be on the edge of civilization. Astronomers shouldn’t celebrate JWST but complain about Starlink—that is inconsistent. The edge of civilization has expanded past low Earth orbit; that is why we get JWST. The obligation then is to put telescopes further out.
Go to WBE instead of super AI—know for sure it is conscious.
Is industry/tech about making stuff less conscious with time? E.g. mechanical things have zero consciousness, vs a lot when the work is done by people. Is that a principle for AI/robots? Then there are no slaves etc.
Can people get behind this? An implied contract with future AI? Acausal bargaining.
For search, exponential processing power gives a linear increase in rating: Chess, Go. However this is a small search space. For life, does the search get bigger the further out you go?
e.g. 2 steps is 2^2 but 4 steps is 4^4. This makes sense if there are more things to consider the further ahead you look, e.g. house price for 1 month: the general market plus the economic trend. For 10+ years: demographic trends, changing govt policy, unexpected changes in transport patterns (new rail nearby or in a competing suburb etc).
If this applies to tech, then regular experiments shrink the search space; you need physical experimentation to get ahead.
For AI, if it’s like intuition/search, then you need search to improve the intuition. Can only learn from the long term.
Long pause or not?
How long should we pause? 10 years? Even in a stable society there are diminishing returns—we have seen this with pure maths, physics, philosophy: when we reach human limits, more time simply doesn’t help. Reasonable to assume the same for a CEV-like concept also.
Does a pause carry danger? Is it like the clear pond before a rapid? If we are already in the rapid, then trying to stop is dangerous (stopping halfway through having a baby is fatal, etc.). On Emmett Shear’s spectrum of go fast, go slow, stop, pause: Singularity seems ideal, though is it possible? WBE better than super AI, culturally, as an elder?
1984 quote “If you want a vision of the future, imagine a boot stamping on a human face—forever.”
“Heaven is high and the emperor is far away” is a Chinese proverb thought to have originated from Zhejiang during the Yuan dynasty.
Not possible earlier, but it is possible now. If democracies go to dictatorship but not back, then a pause is bad. The best way to keep democracies is to leave, hence space colonies. Now in Xinjiang the emperor is in your pocket, and an LLM can understand anything—how far back would you have to go before this was not possible? 20 years? If it is not possible, then we are in the white water and we need to paddle forwards; we can’t stop.
Deep time breaks all common ethics?
Utility monster, experience machine, moral realism tiling the universe etc. Self-determination and achievement will be in the extreme minority over many years. What to do: fake it, forget it, and keep achieving again? Just keep options open until we actually experience it.
All our training is about intrinsic motivation and valuing achievement rather than pleasure for its own sake. There is a great asymmetry in common thought: “meaningless pleasure” makes sense and seems bad or not good, but “meaningless pain” doesn’t make the pain any less bad. Why should that be the case? Has evolution biased us to not value pleasure, or to not experience it as much as we “should”? Should we learn to take pleasure, and regard thinking “meaningless pleasure” as itself a defective attitude? If you could change yourself, should you dial down the need to achieve if you lived in a solved world?
What is “should” in is-ought? Moral realism in the limit? “Should” is us not trusting our reason, as we shouldn’t. If reason says one thing, it could be flawed, as it is in most cases. Especially as we evolved: if we always trusted it, the mistakes would be bigger than the benefits, so the feeling “you don’t do what you should” is two systems competing, intuition/history vs new rationality.
If so it seems to be getting a lot less attention compared to its compute capability.
Not sure if I have stated this clearly before, but I believe scaling laws will not hold for LLM/Transformer-type tech, and at least one major architectural advance is missing before AGI. That is, increased scaling of compute and data will plateau performance soon, and before AGI. Therefore I expect to see evidence for this not much after the end of this year, when large training runs yield models that are a lot more expensive to train, slower on inference and only a little better in performance. X.AI could be one of the first to publicly let this be known (OpenAI etc. could very well be aware of this but not be making it public).
Completion of the 100K H100s cluster seems to mean Grok-3 won’t be trained only on a smaller part of it, so it must be targeting all of it. But also Musk said Grok-3 is planned for end of 2024. So it won’t get more than about 2.7e26 FLOPs, about 14x GPT-4 (the training that started end of July could have just used a larger mini-batch size that anticipates the data parallelism needs of the larger cluster, so the same run could continue all the way from July to November). With 6 months of training on the whole cluster, it could instead get up to 5e26 FLOPs (25x GPT-4), but that needs to wait for another run.
On the other hand, with about 20K H100s, which is the scale that was offered at AWS in July 2023 and might’ve been available at Microsoft internally even earlier, it only takes 5 months to get 1e26 FLOPs. So GPT-4o might already be a 5x GPT-4 model. But it also could be an overtrained model (to get better inference efficiency), so not expected to be fundamentally much smarter.
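A rough version of that arithmetic, assuming the ~1e15 FLOP/s dense BF16 peak of an H100; the ~35% utilization, the run lengths and the ~2e25 FLOPs estimate for GPT-4 are my own assumptions, so treat the outputs as order-of-magnitude only.

```python
def training_flops(n_gpus: int, peak_flops_per_gpu: float, utilization: float, days: float) -> float:
    """Total training compute = GPUs x peak throughput x achieved utilization x wall-clock time."""
    return n_gpus * peak_flops_per_gpu * utilization * days * 86_400

H100_BF16 = 1e15    # dense BF16 peak per H100 (FLOP/s)
GPT4_FLOPS = 2e25   # assumed compute of the original GPT-4 run

for label, gpus, days in [("100K H100s, ~3 months", 100_000, 90),
                          ("100K H100s, ~6 months", 100_000, 180),
                          ("20K H100s, ~5 months", 20_000, 150)]:
    total = training_flops(gpus, H100_BF16, 0.35, days)
    print(f"{label}: {total:.1e} FLOPs, about {total / GPT4_FLOPS:.0f}x GPT-4")
```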
Google has very large datacenters, if measured in megawatts, but they are filled with older TPUs. Maybe they are fine compared to H100s on FLOP/joule basis though? In BF16, A100 (0.3e15 FLOP/s, 400W) to H100 (1e15 FLOP/s, 700W) to B100 (1.8e15 FLOP/s, 700W) notably improve FLOP/joule, but for recent TPUs TDP is not disclosed (and the corresponding fraction of the rest of the datacenter needs to be taken into account, for example it turns 700W of an H100 into about 1500W). In terms of FLOPS/GPU, only the latest generation announced in May 2024 matches H100s, it might take time to install enough of them.
They seem to have big plans for next year, but possibly they are not yet quite ready to be significantly ahead of 100K H100s clusters.
This piece relates to this manifold market and these videos
I listened to most of the 17+ hours of the debate and found it mostly interesting, informative and important for someone either interested in COVID origins or practicing rationality.
I came into this debate about 65-80% lab leak, and left feeling that <10% is closer to the truth.
Key takeaways
The big picture of the lab leak is easy to understand and sounds convincing, however the details don’t check out when put under scrutiny.
Both sides attempted Bayesian estimates and probabilities and got absolutely absurd differences in estimates.
Rootclaim failed to impress me—the takeaway I got is that they are well suited to, say, murder cases where there is history to go on, but when it comes to such a large, messy, one-off event as COVID origins they didn’t know what evidence to include, how to properly weight it, etc. They didn’t present a coherent picture of why we should accept their worldview and estimates. An example is where they asserted that even if zoonosis was the origin, the claimed market was not the origin, because the details of infected animals and humans weren’t what they expected. This seems an absurd claim to make with confidence judging by the data available. When forced to build models (rather than rely on multiplying probabilities) they were bad at it and overconfident in the conclusions drawn from such models.
More generally this led me to distrust Bayesian-inference-type methods in complicated situations. Two smart, reasonably well-prepared positions could differ by, say, >1e12 in their relative estimates. Getting all the details right and building consistent models that are peer reviewed by experts cannot be made up for by assigning uncertainties to things.
Regarding AI, I have now more sympathy to the claim that P(Doom) is a measure of how the individual feels, rather than a defensible position on what the odds actually are.
Self modelling in NN https://arxiv.org/pdf/2407.10188 Is this good news for mech interpretability? If the model makes it easily predictable, then that really seems to limit the possibilities for deceptive alignment
It makes it easier, but consider this: The human brain also does this—when we conform to expectations, we make ourselves more predictable and model ourselves. But this also doesn’t prevent deception. People still lie and some of the deception is pushed into the subconscious.
Sure it doesn’t prevent a deceptive model being made, but if AI engineers made NN with such self awareness at all levels from the ground up, that wouldn’t happen in their models. The encouraging thing if it holds up is that there is little to no “alignment tax” to make the models understandable—they are also better.
Indeed, engineering readability at multiple levels may solve this.
Putting down a prediction I have had for quite some time.
The current LLM/Transformer architecture will stagnate before AGI/TAI (That is the ability to do any cognitive task as effectively and cheaper than a human)
From what I have seen, Tesla autopilot learns >10,000 slower than a human datawise.
We will get AGI by copying nature, at the scale of a simple mammal brain, then scaling up, like this kind of project:
https://x.com/Andrew_C_Payne/status/1863957226010144791
https://e11.bio/news/roadmap
I expect AGI to be 0-2 years after a mammal brain is mapped. In terms of cost-effectiveness I consider such a connectome project to be far more cost effective per $ than large training runs or building a 1GW data center etc if you goal is to achieve AGI.
That is TAI by about 2032 assuming 5 years to scan a mammal brain. In this case there could be a few years when Moores law has effectively stopped, larger data centers are not being built and it is not clear where progress will come from.
I do think there’s going to be significant AI capabilities advances from improved understanding of how mammal and bird brains work. I disagree that more complete scanning of mammalian brains is the bottleneck. I think we actually know enough about mammalian brains and their features which are invariant across members of a species. I think the bottlenecks are: Understanding the information we do have (scattered across terms of thousands of research papers) Building compute efficient emulations which accurately reproduce the critical details while abstracting away the unimportant details. Since our limited understanding can’t give certain answers about which details are key, this probably involves quite a bit of parallelizable brute-forceable empirical research.
I think current LLMs can absolutely scale fast enough to be very helpful with these two tasks. So if something still seems to be missing from LLMs after the next scale-up in 2025, I expect hunting for further inspiration from the brain will seem tempting and tractable. Thus, I think we are well on track for AGI by 2026-2028 even if LLMs don’t continue scaling.
Perhaps LLM will help with that. The reason I think that is less likely is
Deep mind etc is already heavily across biology from what I gather from interview with Demis. If the knowledge was there already there’s a good chance they would have found it
Its something specific we are after, not many small improvements, i.e. the neural code. Specifically back propagation is not how neurons learn. I’m pretty sure how they actually do is not in the literature. Attempts have been made such as the forward-forward algorithm by Hinton, but that didn’t come to anything as far as i can tell. I havn’t seen any suggestion that even with too much detail on biology we know what it is. i.e. can a very detailed neural sim with extreme processing power learn as data efficiently as biology?
If progress must come from a large jump rather than small steps, then LLM have quite a long way to go, i.e. LLM need to speed up coming up ideas as novel as the forward-forward algo to help much. If they are still below that threshold in 2026 then those possible insights are still almost entirely done by people.
Even the smartest minds in the past have been beaten by copying biology in AI. The idea for neural nets came from copying biology. (Though the transformer arch and back prop didn’t)
I’ve heard this viewpoint expressed before, and find it extremely confusing. I’ve been studying neuroscience and it’s implications for AI for twenty years now. I’ve read thousands of papers, including most of what DeepMind has produced. There’s still so many untested ideas because biology and the brain are so complex. Also because people tend to flock to popular paradigms, rehashing old ideas rather than testing new ones.
I’m not saying I know where the good ideas are, just that I perceive the explored portions of the Pareto frontier of plausible experiments to be extremely ragged. The are tons of places covered by “Fog of War” where good ideas could be hiding.
DeepMind is a tiny fraction of the scientists in the world that have been working on understanding and emulating the brain. Not all the scientists in the world have managed to test all the reasonable ideas, much less DeepMind alone.
Saying DeepMind has explored the implications of biology for AI is like saying that the Opportunity Rover has explored Mars. Yes, this is absolutely true, but the unexplored area vastly outweighs the explored area. If you think the statement implies “explored ALL of Mars” then you have a very inaccurate picture in mind.
OK fair point. If we are going to use analogies, then my point #2 about a specific neural code shows our different positions I think.
Lets say we are trying to get a simple aircraft of the ground and we have detailed instructions for a large passenger jet. Our problem is that the metal is too weak and cannot be used to make wings, engines etc. In that case detailed plans for aircraft are no use, a single minded focus on getting better metal is what its all about. To me the neural code is like the metal and all the neuroscience is like the plane schematics. Note that I am wary of analogies—you obviously don’t see things like that or you wouldn’t have the position you do. Analogies can explain, but rarely persuade.
A more single minded focus on the neural code would be trying to watch neural connections form in real time while learning is happening. Fixed connectome scans of say mice can somewhat help with that, more direct control of dishbrain, watching the zebra fish brain would all count, however the details of neural biology that are specific to higher mammals would be ignored.
Its possible also that there is a hybrid process, that is the AI looks at all the ideas in the literature then suggests bio experiments to get things over the line.
Can you explain more about why you think [AGI requires] a shared feature of mammals and not, say, humans or other particular species?
I think it is clear that if say you had a complete connectome scan and knew everything about how a chimp brain worked you could scale it easily to get human+ intelligence. There are no major differences. Small mammal is my best guess, mammals/birds seem to be able to learn better than say lizards. Specifically the https://en.wikipedia.org/wiki/Cortical_column is important to understand, once you fully understand one, stacking them will scale at least somewhat well.
Going to smaller scales/numbers of neurons, it may not need to be as much as a mammal, https://cosmosmagazine.com/technology/dishbrain-pong-brain-on-chip-startup/, perhaps we can learn enough of the secrets here? I expect not, but only weakly confident.
Going even simpler, we have the connectome scan of a fly now, https://flyconnecto.me/ and that hasn’t led to major AI advances. So its somewhere between fly/chimp I’d guess mouse that gives us the missing insight to get TAI
Brilliant Pebbles?
See here and here
This idea has come back up, and it could be feasible this time around because of the high launch capability and total reusability of SpaceX’s Starship. The idea is a large constellation (~30,000?) of low earth satellites that intercept nuclear launches in their boost phase where they are much slower and more vulnerable to interception. The challenge of course is that you constantly need enough satellites overhead at all times to intercept the entire arsenal of a major power if they launch all at once.
There are obvious positives and risks with this
The main positive is it removes the chance of a catastrophic nuclear war.
Negatives are potentially destabilizing the MAD status quo in the short term, and new risks such as orbital war etc.
Trying to decide if it makes nuclear war more or less likely
This firstly depends on your nuclear war yearly base rate, and projected rate into the foreseeable future.
If you think nuclear war is very unlikely then it is probably not rational to disturb the status quo, and you would reject anything potentially destabilizing like this.
However if you think that we are simply lucky and there was >50% chance of nuclear war in the last 50 years (we are on a “surviving world” from MWI etc), and while the change may be currently low, it will go up a lot again soon, then the “Pebbles” idea is worth being considered even if you think it is dangerous and destabilizing in the short term. Say it directly causes a 5% chance of war to set up, but there is a >20% chance in next 20 years without it, which it stops. In this case you could decide it is worth it as paying 5% to remove 20%.
Practical considerations
How do you set up the system to reduce the risk of it causing a nuclear war as it is setup?
Ideally you would set up the whole system in stealth before anyone was even aware of it. Disguising 30K “Pebbles” satellites as Starlink ones seems at the bounds of credibility.
You would also need to do this without properly testing a single interceptor as such a test would likely be seen and tip other nations off.
New risks
A country with such a system would have a power never seen before over others.
For example the same interceptors could shoot down civilian aircraft, shipping etc totally crippling every other country without loss of life to the attacker.
The most desirable outcome:
1. A country develops the system
2. It uses whatever means to eliminate other countries nuclear arsenals
3. It disestablishes its own nuke arsenal
4. The Pebbles system is then greatly thinned out as there is now no need to intercept ~1000 simul launches.
5. There is no catastrophic nuclear threat anymore, and no other major downside.
My guess as to how other countries would respond if it was actually deployed by the USA
Russia
I don’t think it would try to compete nor be at all likely to launch a first strike in response to USA building a system. Especially if the Ukraine war is calmed down. I don’t think it regards itself as a serious worldwide power anymore in spite of talk. Russia also isn’t worried about a first strike from the USA as it knows this is incredibly unlikely.
China
Has 200-350 nukes vs USA of ~5000
I think China would be very upset, both because it would take fewer “Pebbles” to intercept and it does see itself as a major and rising world power. I also don’t think it would launch a first strike, nor could credibly threaten to do so, and this would make it furious.
After this however it is hard to tell, e.g. China could try to destroy the satellite interceptors, say creating a debris cloud that would take many out in the lower orbits, or build thousands of decoys with no nuclear and much smaller missiles that would be hard to tell apart. Such decoys could be incapable of actually making it to orbit but still be effective. To actually reduce the nuclear threat, USA would have to commit to actually destroying China nukes on the ground. This is a pretty extreme measure and its easy to imagine USA not actually doing this leaving an unstable result.
Summary
Its unclear to me all things considered, whether attempting to deploy such a system would make things safer or more risky in total over the long term with regards to nuclear war.
I think the part where other nations just roll with this is underexplained.
Yes for sure. I don’t know how it would play out, and am skeptical anyone could. We can guess scenarios.
1. The most easily imagined one is the Pebbles owner staying in their comfort zone and not enforcing #2 at all. Something similar already happened—the USA got nukes first and let others catch up. In this case threatened nations try all sorts of things (political, commercial/trade, space warfare, an arms race) but don't actually start a hot conflict. The Pebbles owner is left not knowing whether their system is still effective, and neither do the threatened countries—an unstable situation.
2. The threatened nation tries to destroy the Pebbles with non-nuclear means. If this were Russia, the USA could perhaps regenerate the system faster than Russia could destroy satellites. If it's China, let's say it can't. The USA then needs to decide whether to strike the anti-satellite ground infrastructure to keep its system...
3. A threatened nation such as North Korea just refuses to give up its nukes—in this case I can see the USA destroying it.
4. India or Israel, say, refuses to give up their arsenal—I have no idea what would happen then.
It would certainly be nice if we could agree to all put up a ton of satellites that intercept anyone’s nuclear missiles (perhaps under the control of an international body), gradually lowering the risk across the board without massively advantaging any country. But I think it would be impossible to coordinate on this.
Types of takeoff
When I first heard and thought about AI takeoff, I found the argument convincing that as soon as an AI passed IQ 100, takeoff would become hyper-exponentially fast: progress would speed up, which would then compound on itself, and so on. However, there are other possibilities.
AGI is a barrier that requires >200 IQ to pass unless we copy biology?
Progress could be discontinuous; there could be IQ thresholds required to unlock better methods or architectures. Say we fixed our current compute capability: with fixed human intelligence we may not be able to figure out the formula for AGI, in a similar way that combined human intelligence hasn't cracked many hard problems even with decades and the world's smartest minds working on them (maths problems, quantum gravity...). This may seem unlikely for AI, but to illustrate the principle, say we only allowed people with IQ <90 to work on AI. Progress would stall: IQ <90 software developers couldn't unlock IQ >90 AI. Can IQ 160 developers with our current compute hardware unlock >160 AI?
To me the reason we don't have AGI now is that the architecture is very data inefficient and worse at generalization than, say, the mammalian brain (for example, a cortical column). I expect that if we knew the neural code and could copy it, then we would get at least to very high human intelligence quickly, as we already have the compute.
From watching AI over my career, it seems that even the highest-IQ people and groups can't make progress by themselves without data, compute, and biology to copy for guidance, in contrast to other fields. For example, Einstein predicted gravitational waves long before they were discovered, but Turing or von Neumann didn't publish the Transformer architecture or suggest backpropagation. If we did not have access to neural tissue, would we still be without artificial NN? On a related note, I think there is an XKCD cartoon that says something like the brain is too complex to understand itself.
(I believe now that progress in theoretical physics and pure maths is slowing to a stall as further progress requires intellectual capacity beyond the combined ability of humanity. Without AI there will be no major advances in physics anymore even with ~100 years spent on it.)
After AGI is there another threshold?
Let's say we do copy biology/solve AGI, and with our current hardware can get >10,000 AGI agents with IQ >= that of the smartest humans. They then optimize the code so there are 100K agents with the same resources, but then optimization stalls. The AI wouldn't know if this was because it had optimized as much as possible, or because it lacked the ability to find a better optimization.
Does our current system scale to AGI with 1GW/1 million GPU?
Let's say we don't copy biology, but scaling our current systems to 1GW/1 million GPUs and optimizing for a few years gets us to IQ 160 at all tasks. We would have an inferior architecture compensated for by a massive increase in energy/FLOPS compared to the human brain. Progress could theoretically stall at upper-level human IQ for a time rather than take off (I think this isn't very likely, however). There would of course be a significant overhang, where capabilities would increase suddenly when the better architecture was found and applied to the data center hosting the AI.
Related note—why 1GW data centers won’t be a consistent requirement for AI leadership.
Based on this, a 1GW or similar data center isn't useful or necessary for long. If it doesn't give a significant increase in capabilities, then it won't be cost effective. If it does, then the AI would optimize itself so that such power isn't needed anymore. Only across a small range of capability increase does it actually stay around.
The merits of the Pause movement and training compute caps are not clear to me. Someone here made the case that compute caps could actually speed up AGI, as people would then pay more attention to finding better architectures rather than throwing resources into scaling existing inferior ones. However, all things considered, I can see a lot of downsides from large data centers and little upside. I see a specific possibility where they are built, don't deliver the economic justification, decrease a lot in value, and are then sold to owners that are not into cutting-edge AI. Then when the more efficient architecture is discovered, they are suddenly very powerful without preparation. Worldwide caps on total GPU production would also help reduce similar overhang possibilities.
Grothendieck and von Neumann were built using evolution, not deep basic science or even engineering. So in principle all that’s necessary is compute, tinkering, and evals, everything else is about shortening timelines and reducing requisite compute.
Any form of fully autonomous industry lets compute grow very quickly, in a way not constrained by human population, and only requires AI with ordinary engineering capabilities. Fusion and macroscopic biotech[1] (or nanotech) potentially get compute to grow much faster than that. To the extent human civilization would hypothetically get there in 100-1000 years without general AI, serial speedup alone might be able to get such tech via general AIs within years, even without superintelligence.
Drosophila biomass doubles every 3 days. Small things can quickly assemble into large things, transforming through metamorphosis. This is proven technology, doesn’t depend on untested ideas about what is possible like nanotech does. Industry and compute that double every 3 days can quickly eat the Solar System.
Yes, the human brain was built using evolution; I have no disagreement that, given 100-1000 years with just tinkering etc., we would likely get AGI. It's just that in our specific case we have biology to copy, and it will get us there much faster.
Evolution is an argument that there is no barrier, even with very incompetent tinkerers that fail to figure things out (and don’t consider copying biology). So it doesn’t take an arbitrarily long time, and takes less with enough compute[1]. The 100-1000 years figure was about the fusion and macroscopic biotech milestone in the hypothetical of no general AI, which with general AI running at a higher speed becomes 0.1-10 years.
Moore’s Law of Mad Science: Every 18 months, the minimum IQ to destroy the world drops by one point.
Temporarily adopting this sort of model of “AI capabilities are useful compared to human IQs”:
With IQ 100 AGI (i.e. could do about the same fraction of tasks as well as a sample of IQ 100 humans), progress may well be hyper exponentially fast: but the lead-in to a hyper-exponentially fast function could be very, very slow. The majority of even relatively incompetent humans in technical fields like AI development have greater than IQ 100. Eventually quantity may have a quality of its own, e.g. after there were very large numbers of these sub-par researcher equivalents running at faster than human and coordinated better than I would expect average humans to be.
Absent enormous numerical or speed advantages, I wouldn’t expect substantial changes in research speed until something vaguely equivalent to IQ 160 or so.
Though in practice, I’m not sure that human measures of IQ are usefully applicable to estimating rates of AI-assisted research. They are not human, and only hindsight could tell what capabilities turn out to be the most useful to advancing research. A narrow tool along the lines of AlphaFold could turn out to be radically important to research rate without having anything that you could characterize as IQ. On the other hand, it may turn out that exceeding human research capabilities isn’t practically possible from any system pretrained on material steeped in existing human paradigms and ontology.
Perhaps thinking about IQ conflates two things: correctness and speed. For individual humans these seem correlated: people with higher IQ are usually able to get more correct results, more quickly.
But it becomes relevant when talking about groups of people: whether a group of average people is better than a genius depends on the nature of the task. The genius will be better at doing novel research. The group of normies will be better at doing lots of trivial paperwork.
Currently, the AIs seem comparable to having an army of normies on steroids.
The performance of a group of normies (literal or metaphorical) can sometimes be improved by error checking. For example, if you have them solve mathematical problems, they will probably make a lot of errors; adding more normies would allow you to solve more problems, but the fraction of correct solutions would remain the same. But if you give them instructions on how to verify the solutions, you could increase correctness (at the cost of slowing them down somewhat). Similarly, an LLM can give me hallucinated solutions to math/programming problems, but that is less of a concern if I can verify the solutions in Lean or with unit tests, and reject the incorrect ones; and who knows, maybe trying again will result in a better solution. (In a hypothetical extreme case, an army of monkeys with typewriters could produce Shakespeare, if we had a 100% reliable automatic verifier of their outputs.)
So it seems to me, the question is how much we can compensate for the errors caused by “lower IQ”. Depending on the answer, that’s how long we have to wait until the AIs become that intelligent.
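A minimal sketch of that generate-and-verify idea; the "solver" and "verifier" below are toy stand-ins (simple arithmetic with a random error rate), not any particular LLM or proof-checking API.

```python
import random

# Minimal sketch of "error-prone generators + reliable verifier".
# The solver is a stand-in for an unreliable worker/LLM: it answers a simple
# arithmetic question correctly only ~30% of the time.
def unreliable_solver(a, b):
    return a + b if random.random() < 0.3 else a + b + random.randint(1, 5)

def verifier(a, b, answer):
    # A cheap, reliable check. Here we can simply recompute; in practice this
    # role is played by unit tests, a proof checker, etc.
    return answer == a + b

random.seed(0)
a, b = 123, 456
candidates = [unreliable_solver(a, b) for _ in range(20)]   # the "army of normies"
accepted = [c for c in candidates if verifier(a, b, c)]
print(f"{len(accepted)}/{len(candidates)} candidates pass verification; "
      f"every accepted answer is correct: {accepted[:1]}")
```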
In fact, speed and accuracy in humans are at least somewhat mechanistically different
Technology is about making boring stuff non-conscious. Beginning from basic physical movement such as making a wheel go round, to arithmetic and now code snippets that are so commonly used they shouldn’t require re-thinking. This is a reason why AI art upsets people—we actually want that to be the result of a conscious process. If you make boring stuff that creates power or wealth non-conscious then everyone is happier. Meat production would be much better if it was non-conscious. The more AI is non-conscious for a given level of capability, the better off we are.
I agree with the general argument that making boring stuff non-conscious is a good thing. But in the case of art, I think the underlying problem is that people want art to be high-status.
From my perspective, the process of creating a piece of art has many steps, and some of them can legitimately be called boring. The line is not clear—the same step can be interesting when you do it for the first time, and boring when you later repeat it over and over again; or interesting when you introduce some unexpected change on purpose, and boring when you just want to do the usual. So we could use the AI to automate the steps that are no longer interesting for us, and focus on the rest. (Though some people will simply click “generate everything”.)
Consider how much time painters living centuries ago spent preparing their colors, and learning how to prepare those colors well—and today, painters can simply buy the colors in a supermarket. But (as far as I know) no one cries that selling colors in supermarkets has ruined art. That's because colors are not considered mysterious anymore, and therefore they are not high-status, so no one cares whether we automate this step.
Now imagine a hypothetical painting tool that automatically fixes all your mistakes at perspective, and does nothing else. You keep painting, and whenever you complete a shape, it is magically rotated and skewed to make the perspective consistent with the rest of the picture. (Unless you want to have the shape different on purpose; in such case the tool magically understands this and leaves the shape alone. Or you simply press the “undo” button.) This would be somewhat controversial. Some people would be okay with it. There might already be a plugin in some vector editor that helps you achieve this; and if that fact becomes known, most people won’t care.
But some people would grumble that if you can’t get the perspective right on your own, perhaps you don’t deserve to be a painter! I find this intuition stronger when I imagine a literally magical tool that transforms a physical painting this way (only fixes the perspective, nothing else). The painter who uses vector graphics at least needs some computer skills to compensate for being bad at perspective, but having the perspective fixed literally auto-magically is just plain cheating.
Which suggests that an important part of our feelings about art is judging the artist’s talent and effort; assigning status to the artist… but also to ourselves as connoisseurs of the art! Some people derive a lot of pleasure from feeling superior to those who have less knowledge about art. And this is the part that might go away with AI art. (Unless we start discussing the best prompts and hyperparameters instead.)
Thanks, good detail. I am not good at traditional art, but I am interested in using maths to create a shape that is almost impossible for a traditional sculptor to create then 3d printing it.
How does intelligence scale with processing power
A default position is that exponentially more processing power is needed for a constant increase in intelligence.
To start, let's assume a guided/intuition + search model for intelligence, like Chess or Go, where you have an evaluation module and a search module. In simple situations, an exponential increase in processing power usually gives a linear increase in lookahead ability and in rating/Elo in games measured that way.
However does this match reality?
What if, the longer the time horizon, the bigger the board became, or the more complexity was introduced? For board games there is usually a constant number of possibilities to search at every ply of lookahead depth. However, I think that in reality you can argue the search space should increase with time or lookahead steps: as you look further ahead, possibilities you didn't have to consider before now enter the search.
For a real-world example, consider predicting the price of a house. As the timeframe goes from <5 years to >5 years, there are new factors to consider, e.g. changing government policy, unexpected changes in transport patterns (new rail nearby or in a competing suburb, etc.), and demographic changes.
In situations like these, the processing required for a constant increase in ability could go up faster than exponentially. For example, looking 2 steps ahead might require 2 possibilities at each step, that is 2^2, but if it's 4 steps ahead, then maybe the cost is now 3^4, as there are 3 vs 2 things that can affect the result over 4 steps.
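As a rough illustration, here is a minimal sketch comparing the number of leaf states with a constant branching factor against one that grows with lookahead depth; the specific growth rule is an assumption chosen only to mirror the house-price example.

```python
# Minimal sketch: search cost with a constant branching factor (board-game-like)
# vs a branching factor that grows with lookahead depth (more factors become
# relevant the further ahead you look). The numbers are illustrative only.
def leaves_constant(branching, depth):
    return branching ** depth

def leaves_growing(depth, base=2, growth_every=2):
    # Assumed rule: the branching factor increases by 1 every `growth_every`
    # steps of lookahead, e.g. new factors (policy, demographics) coming in.
    total = 1
    for step in range(1, depth + 1):
        total *= base + (step - 1) // growth_every
    return total

print("depth | constant branching (2) | growing branching")
for depth in (2, 4, 8, 16):
    print(f"{depth:>5} | {leaves_constant(2, depth):>22} | {leaves_growing(depth)}")
```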
How does this affect engineering of new systems
If this applies to engineering, then actual physical data will be very valuable for shrinking the search space (well, that applies if the cost just goes up exponentially as well). That is, if you can measure the desired situation or new device state at step 10 of a 20-stage process, then you can hugely reduce the search space, as you can eliminate many possibilities. Zero-shot is hard unless you can really keep the system in situations where no additional effects come in.
AI models, regulations, deployments, expectations
For a simple evaluation/search model of intelligence, with just one model being used for the evaluation, improvements can be made by continually improving the evaluation model (same size, better performance; or same performance, smaller size). Models that produce fewer bad “candidate ideas” can be chosen, with the search itself providing feedback on which ideas had potential. In this model there is no take-off or overhang to speak of.
However I expect a TAI system to be more complicated.
I can imagine an overseer model that decides which more specialist models to use. There is a difficulty in knowing what model/field of expertise to use for a given goal. Existing regulations don't really cover these systems; the setup where you train a model, fine-tune, test, then release doesn't strictly apply here. You release a set of models, and they continually improve themselves. This is a lot more like people, who continually learn.
Overhang
In this situation you get take-off or overhang when a new model architecture is introduced, rather than the steady improvement from deployed systems of models. It's clear to me that the current model architectures, and hence scaling laws, are not near the theoretical maximum. For example, the training data needed for Tesla Autopilot is ~10,000x more than what a human needs, and it is not superhuman. In terms of risk, it's new model architectures (and evidence of very different scaling laws), rather than training FLOPS, that would matter.
I think an often overlooked facet of this is that higher fluid intelligence leads to higher crystallized intelligence.
I.e., the more and better you think, the more and better crystallized algorithms you can learn, and, unlike the short-term benefits of fluid intelligence, the long-term benefits of crystallized intelligence are compounding.
To find a new, better strategy linearly faster, you need an exponential increase in processing power, but each found and memorized strategy saves you an exponential expenditure of processing power in the future.
Evaluation vs Symbolism
TLDR
Thinking about the Busy Beaver numbers has led me to believe that just because a theorem holds true for a massive number of evaluated examples, this is only weak evidence that it is actually true. Can we go meta on this?
Main
After reading a post by Scott Aaronson, the Prime Number Theorem (https://en.wikipedia.org/wiki/Prime_number_theorem) and Littlewood's result came to my attention:
“Li(x) overestimates the number of primes below x more often than not, especially as x grows large. However, it is known (following Littlewood) that π(x) exceeds Li(x) for some x, with the first crossing believed to occur around x ≈ 10^316, contradicting the general trend.”
This got me thinking about how common this kind of thing is, and why. Why does a formula hold all the way up to 10^316 and then fail?
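As a concrete illustration (assuming sympy and mpmath are installed), here is a quick check that the logarithmic integral stays above π(x) for every x we can feasibly test, even though the inequality is known to reverse eventually.

```python
# Minimal sketch: compare the prime-counting function pi(x) with the
# logarithmic integral li(x) for small x. Every value we can feasibly check
# has li(x) > pi(x), yet the inequality is known to reverse for some huge x.
# (The small constant offset between li(x) and Li(x) is irrelevant at these scales.)
from sympy import primepi     # exact prime count
from mpmath import li         # logarithmic integral

for exp in range(2, 8):       # x = 10^2 .. 10^7
    x = 10 ** exp
    pi_x = int(primepi(x))
    li_x = float(li(x))
    print(f"x=10^{exp}: pi(x)={pi_x}, li(x)~{li_x:.1f}, li(x) > pi(x)? {li_x > pi_x}")
```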
The essence of Busy Beaver numbers is that there are sequences based on a simple formula/initial data that go on for a very long time and then just stop, unpredictably. You can imagine replacing the simple formula with a simple theorem that appears to be true. Instead of actually being true, it is instead a way of encoding its very large counterexample in a short amount of data.
If you think of it this way, a theorem that appears to be true and has been evaluated over trillions of numbers is also a candidate for encoding an exception at some very large number. In other words, trillions of correct examples are only weak evidence of its correctness.
How much should we weight evaluation? We can't evaluate to infinity, and it's obvious that a theorem holding up to 2 million is not twice the evidence of it holding up to 1 million. Should we choose log(n)? A natural scale is the BB numbers themselves: e.g. if your theorem is true up to BB(5), then that is 5 data points, rather than 47 million. Unlimited evaluation can never get to BB(6), so that is the limit of evidence from evaluation (i.e. 5-6 evidence points, with it being unclear how to weigh theory: https://www.lesswrong.com/posts/MwQRucYo6BZZwjKE7/einstein-s-arrogance)
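A minimal sketch of that proposed weighting; the known BB values are standard, but treating “verified up to N” as “k evidence points, where k is the largest index with BB(k) <= N” is just the idea above, not an established measure.

```python
# Minimal sketch of weighting evaluation on the Busy Beaver scale:
# "verified up to N" counts as k evidence points, where k is the largest
# index with BB(k) <= N. Known values: BB(1)=1, BB(2)=6, BB(3)=21,
# BB(4)=107, BB(5)=47,176,870.
KNOWN_BB = [1, 6, 21, 107, 47_176_870]

def bb_evidence_points(verified_up_to):
    return sum(1 for bb in KNOWN_BB if bb <= verified_up_to)

for n in (1_000_000, 47_176_870, 10**12):
    print(f"verified up to {n:>15,}: {bb_evidence_points(n)} evidence points")
```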
Now can we go meta?
Is some maths so much more powerful than others that it carries correspondingly greater weight, as formal proof does relative to evaluation? Certainly some maths is more general than others. How does this affect famous problems such as the Riemann Hypothesis—proving or disproving it affects a lot of maths, yet showing it is correct up to a trillion zeros is little evidence.
“Most mathematicians tend to believe that the Riemann Hypothesis is true, based on the weight of numerical evidence and its deep integration into existing mathematical frameworks.”
Is “deep integration” actually that deep, or is it the symbolic equivalent of evaluating up to 1 million? Perhaps just as you can find countless evaluated examples supporting a false theorem, you can find much “deep integration” in favor of a famous conjecture that could also be incorrect.
Further thoughts and links
Most people think P != NP, but what if
P = NP, with the constants involved being on the order of BB(10)?
Proof was wrong—https://www.quantamagazine.org/mathematicians-prove-hawking-wrong-about-extremal-black-holes-20240821/
Related thoughts
Conservation of energy is a more general rule that rules out perpetual motion machines
The 2nd law of thermodynamics likewise; however, that law must have been broken somehow to get a low-entropy initial state for the Big Bang.
AI examples
1 The Polya Conjecture
Proposed by George Pólya in 1919, this conjecture relates to the distribution of prime numbers. It posited that for any x >= 2, at least half of the numbers up to x have an odd number of prime factors (counted with multiplicity). It was verified for numbers up to 1,500,000, but a counterexample was eventually found around x = 906 million. This is a fascinating case where numerical verification up to a large number was still not sufficient.
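As a sketch (assuming sympy is available), one can “verify” the conjecture over a range that falls far short of where the counterexample lives; the cutoff of 100,000 is chosen only so the check runs quickly.

```python
# Minimal sketch: empirically check the Polya conjecture for small n.
# L(n) = sum_{k<=n} lambda(k), where lambda(k) = (-1)^(number of prime factors
# of k counted with multiplicity). The conjecture says L(n) <= 0 for all n >= 2;
# it holds far past this range but fails near n ~ 9.06e8.
from sympy import factorint

def liouville(k):
    return -1 if sum(factorint(k).values()) % 2 else 1

L = 1            # lambda(1) = 1
violations = 0
for n in range(2, 100_001):
    L += liouville(n)
    if L > 0:
        violations += 1
print("violations of L(n) <= 0 for 2 <= n <= 100000:", violations)   # expect 0
```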
2 Mertens Conjecture
The Mertens conjecture suggested that the absolute value of the Mertens function M(x) is always less than sqrt(x).
This was proven false by Andrew Odlyzko and Herman te Riele in 1985; their disproof was indirect (no explicit counterexample was produced), and any counterexample is known to lie above 10^14.
The value of BB(6) is not currently known, but it could in principle be discovered. There is no general algorithm for calculating BB numbers, but any particular BB(n) could be determined by enumerating all n-state Turing machines and proving whether each one halts.
According to Scott, “Pavel Kropitz discovered, a couple years ago, that BB(6) is at least 10^10^10^10^10^10^10^10^10^10^10^10^10^10^10 (i.e., 10 raised to itself 15 times).”
So we can never evaluate up to BB(6), as it is at least this large.
Random ideas to expand on
https://www.theguardian.com/technology/2023/jul/21/australian-dishbrain-team-wins-600000-grant-to-develop-ai-that-can-learn-throughout-its-lifetime
https://newatlas.com/computers/human-brain-chip-ai/
https://newatlas.com/computers/cortical-labs-dishbrain-ethics/
Could this be cheaper than chips in an extreme silicon shortage? How did it learn? Can we map connections forming and make better learning algorithms?
Birds vs ants/bees.
A flock of birds can be dumber than the dumbest individual bird, while a colony of bees/ants can be smarter than the individual, and smarter than a flock of birds! Birds avoiding a predator in a geometric pattern—no intelligence, since predictable motion, like a fluid, needs no processing. Versus bees swarming the scout hornet, or ants building a bridge, etc. Even though there is no planning in ants, just as there is no overall plan in individual neurons?
The more complex the pieces, the less well they fit together. Less intelligent units can form a better collective in this instance. Not like human orgs.
Progression from simple cell to mitochondria—mitochondria have no say anymore but fit in perfectly. Multi-organism units like hives are the next level up—simpler creatures can have more cohesion at the upper level. Humans have more effective institutions in spite of their complexity because of consciousness, language, etc.
RISC vs CISC, Intel vs NVIDIA, GPUs for supercomputers. I thought about this years ago; it led to a prediction that Intel and other CISC-focused businesses would lose to cheaper alternatives.
Time to communicate a positive singularity/utopia
Spheres of influence, like we already have: uncontacted tribes, the Amish, etc. Taking that further, super AI must leave Earth, perhaps the solar system; enhanced people move out of the Earth ecosystem, to space colonies or Mars, etc.
Take the best/happy parts of nature to expand; don't take suffering to >a million stars.
Humans can't do interstellar travel faster than AI anyway; even if that were the goal, AI would have to prepare it first, and it can travel faster. So there is no question that the majority of interstellar humanity is AI. We need to keep Earth for people. What is the max CEV? Keep the Earth ecosystem, and let humans progress and discover on their own?
Is the progression to go outwards: human, posthuman/Neuralink, WBE? It is in some sci-fi: Peter Hamilton / the Culture (human to WBE).
Long term, all moral systems don't know what to say on pleasure vs self-determination/achievement. Eventually we run out of things to invent—should it slow asymptotically?
Explorers should be on the edge of civilization. Astronomers shouldn't celebrate JWST but complain about Starlink—that is inconsistent. The edge of civilization has expanded past low Earth orbit; that is why we get JWST. The obligation then is to put telescopes further out.
Go to WBE instead of super AI—know for sure it is conscious.
Is industry/tech about making stuff less conscious with time? E.g. mechanical things have zero consciousness, vs a lot when the same work is done by people. Is that a principle for AI/robots? Then there are no slaves, etc.
Can people get behind this? - an implied contract with future AI? Acausal bargaining.
https://www.lesswrong.com/posts/qZJBighPrnv9bSqTZ/31-laws-of-fun
Turing test for WBE—how would you know?
Intelligence processing vs time
For search, exponential processing power gives a linear increase in rating (Chess, Go). However, this is a small search space. For life, does the search get bigger the further out you go?
E.g. 2 steps is 2^2 but 4 steps is 4^4. This makes sense if there are more things to consider the further ahead you look. E.g. house price over 1 month: general market plus economic trend. Over 10+ years: demographic trends, changing government policy, unexpected changes in transport patterns (new rail nearby or in a competing suburb, etc.).
If this applies to tech, then regular experiments shrink the search space; you need physical experimentation to get ahead.
For AI, if it is like intuition/search, then you need search to improve the intuition. Can only learn from the long term.
Long pause or not?
How long should we pause? 10 years? Even in a stable society there are diminishing returns—we have seen this with pure maths, physics, philosophy: when we reach human limits, more time simply doesn't help. Reasonable to assume the same for a CEV-like concept.
Does a pause carry danger? Is it like the clear pond before a rapid: are we already in the rapid, where trying to stop is dangerous (stopping partway through having a baby is fatal), etc.? On Emmett Shear's scale of go fast / slow / stop / pause, a Singularity seems ideal, but is it possible? WBE better than super AI—culturally, as an elder?
1984 quote “If you want a vision of the future, imagine a boot stamping on a human face—forever.”
“Heaven is high and the emperor is far away” is a Chinese proverb thought to have originated from Zhejiang during the Yuan dynasty.
This was not possible earlier but is possible now. If democracies can go to dictatorship but not back, then a pause is bad. The best way to keep democracies is to leave; hence space colonies. Now, in Xinjiang, the emperor is in your pocket, and an LLM can understand anything. How far back would we have to go before this was not possible? 20 years? If it is not possible now, then we are already in the white water, and we need to paddle forwards; we can't stop.
Deep time breaks all common ethics?
Utility monster, experience machine, moral realism, tiling the universe, etc. Self-determination and achievement will be in the extreme minority over many years. What to do: fake it, forget it, and keep achieving again? Just keep options open until we actually experience it.
All our training is about intrinsic motivation and valuing achievement rather than pleasure for its own sake. There is a great asymmetry in common thought: “meaningless pleasure” makes sense and seems bad or at least not good, but “meaningless pain” doesn't seem any less bad. Why should that be the case? Has evolution biased us not to value pleasure, or to experience it less than we “should”? Should we learn to take pleasure, and regard the attitude behind “meaningless pleasure” as itself defective? If you could change yourself, should you dial down the need to achieve if you lived in a solved world?
What is “should” in is-ought? Moral realism in the limit? “Should” is us not trusting our reason, as we shouldn't: if reason says one thing, it could be flawed, as it is in most cases. Especially since we evolved: if we always trusted it, the mistakes would outweigh the benefits. So the feeling “you don't do what you should” is two systems competing, intuition/history vs new rationality.
Is X.AI currently performing the largest training run?
This source claims it is
PC Mag is not so sure here, here
If so it seems to be getting a lot less attention compared to its compute capability.
Not sure if I have stated this clearly before, but I believe scaling laws will not hold for LLM/Transformer-type tech, and at least one major architectural advance is missing before AGI. That is, increasing the scale of compute and data will plateau performance soon, and before AGI. Therefore I expect to see evidence for this not much after the end of this year, when large training runs yield models that are a lot more expensive to train, slower at inference, and only a little better on performance. X.AI could be one of the first to publicly let this be known (OpenAI etc. could very well be aware of this but not be making it public).
Completion of the 100K H100s cluster seems to mean Grok-3 won’t be trained only on a smaller part of it, so it must be targeting all of it. But also Musk said Grok-3 is planned for end of 2024. So it won’t get more than about 2.7e26 FLOPs, about 14x GPT-4 (the training that started end of July could have just used a larger mini-batch size that anticipates the data parallelism needs of the larger cluster, so the same run could continue all the way from July to November). With 6 months of training on the whole cluster, it could instead get up to 5e26 FLOPs (25x GPT-4), but that needs to wait for another run.
OpenAI is plausibly training on Microsoft’s 100K H100s cluster since May, but there are also claims of the first run only using 10x GPT-4 compute, which is 2e26 FLOPs, so it’d take only 2-3 months and pretraining should’ve concluded by now. Additionally, it’s probably using synthetic data at scale in pretraining, so if that has an effect, Grok-3’s hypothetically similar compute won’t be sufficient to match the result.
On the other hand, with about 20K H100s, which is the scale that was offered at AWS in July 2023 and might’ve been available at Microsoft internally even earlier, it only takes 5 months to get 1e26 FLOPs. So GPT-4o might already be a 5x GPT-4 model. But it also could be an overtrained model (to get better inference efficiency), so not expected to be fundamentally much smarter.
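To make the arithmetic behind estimates like “2.7e26 FLOPs” in the comment above explicit, here is a rough sketch; the per-GPU throughput, utilization, and GPT-4 baseline figures are my own illustrative assumptions, not the commenter's.

```python
# Rough sketch of how training-compute estimates of this kind are formed.
# Assumed figures (illustrative only): ~1e15 dense BF16 FLOP/s per H100 and
# ~35% sustained utilization; real runs vary.
def training_flops(num_gpus, months, peak_flops_per_gpu=1e15, utilization=0.35):
    seconds = months * 30 * 24 * 3600
    return num_gpus * peak_flops_per_gpu * utilization * seconds

gpt4_flops = 2e25  # rough public ballpark for GPT-4 training compute, used only for comparison
for months in (3, 6):
    total = training_flops(100_000, months)
    print(f"100K H100s for {months} months: ~{total:.1e} FLOPs "
          f"(~{total / gpt4_flops:.0f}x GPT-4)")
```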
On priors I think that Google Deepmind is currently running the biggest training run.
Google has very large datacenters, if measured in megawatts, but they are filled with older TPUs. Maybe they are fine compared to H100s on a FLOP/joule basis though? In BF16, A100 (0.3e15 FLOP/s, 400W) to H100 (1e15 FLOP/s, 700W) to B100 (1.8e15 FLOP/s, 700W) notably improve FLOP/joule, but for recent TPUs the TDP is not disclosed (and the corresponding fraction of the rest of the datacenter needs to be taken into account; for example, it turns the 700W of an H100 into about 1500W). In terms of FLOP/s per chip, only the latest TPU generation announced in May 2024 matches the H100, and it might take time to install enough of them.
They seem to have big plans for next year, but possibly they are not yet quite ready to be significantly ahead of 100K H100s clusters.
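A quick back-of-the-envelope on the FLOP/joule comparison above; the ~700W to ~1500W datacenter overhead factor is carried over from the comment, and all figures should be treated as rough.

```python
# Rough FLOP-per-joule comparison using the BF16 figures quoted above.
# The overhead factor (~2x) turns a 700W GPU into ~1500W at the datacenter level.
chips = {
    "A100": (0.3e15, 400),   # (BF16 FLOP/s, TDP in watts)
    "H100": (1.0e15, 700),
    "B100": (1.8e15, 700),
}
overhead_factor = 1500 / 700

for name, (flops, watts) in chips.items():
    chip_level = flops / watts
    dc_level = flops / (watts * overhead_factor)
    print(f"{name}: {chip_level:.2e} FLOP/J (chip only), "
          f"{dc_level:.2e} FLOP/J (with datacenter overhead)")
```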
Thanks, that updates me. I’ve been enjoying your well-informed comments on big training runs, thank you!
Rootclaim covid origins debate:
This piece relates to this manifold market
and these videos
I listened to most of the 17+ hours of the debate and found it mostly interesting, informative and important for someone either interested in COVID origins or practicing rationality.
I came into this debate at about 65-80% lab leak, and left feeling that <10% is the most plausible estimate.
Key takeaways
The big picture of the lab leak is easy to understand and sounds convincing; however, the details don't check out when put under scrutiny.
Both sides attempted Bayesian estimates and probabilities and got absolutely absurd differences in estimates.
Rootclaim failed to impress me—the takeaway I got is that they are well suited to, say, murder cases where there is history to go on, but when it comes to such a large, messy, one-off event as COVID origins, they didn't know what evidence to include, how to properly weight it, etc. They didn't present a coherent picture of why we should accept their worldview and estimates. An example is where they asserted that even if zoonosis was the origin, then the claimed market was not the origin, because the details of infected animals and humans weren't what they expected. This seems an absurd claim to make with confidence judging by the data available. When forced to build models (rather than rely on multiplying probabilities), they were bad at it and overconfident in the conclusions drawn from such models.
More generally, this led me to distrust Bayesian-inference-type methods in complicated situations. Two smart, reasonably well-prepared sides could differ by, say, a factor of >1e12 in their relative estimates. Getting all the details right and building consistent models that are peer reviewed by experts cannot be made up for by attaching uncertainties to things.
Regarding AI, I have now more sympathy to the claim that P(Doom) is a measure of how the individual feels, rather than a defensible position on what the odds actually are.