I don’t know, but many people do.
Logan Zoellner
If you wanted to actually reduce the trade deficit, how would you do it?
I guess I should be more specific.
Do you expect this curve to flatten, or do you expect that training runs in, say, 2045 are at, say, 10^30 flops and have still failed to produce AGI?
In particular, even if the LLM were being continually trained (in a way that’s similar to how LLMs are already trained, with similar architecture), it still wouldn’t do the thing humans do with quickly picking up new analogies, quickly creating new concepts, and generally reforging concepts.
I agree this is a major unsolved problem that will be solved prior to AGI.
However, I still believe “AGI SOON”, mostly because of what you describe as the “inputs argument”.
In particular, there are a lot of things I personally would try if I was trying to solve this problem, but most of them are computationally expensive. I have multiple projects blocked on “This would be cool, but LLMs need to be 100x-1Mx faster for it to be practical.”
This makes it hard for me to believe timelines like “20 or 50 years”, unless you have some private reason to think Moore’s Law/Algorithmic progress will stop. LLM inference, for example, is dropping by 10x/year, and I have no reason to believe this stops anytime soon.
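Some back-of-the-envelope arithmetic on how those two observations interact (the 10x/year figure is from above; the 100x–1Mx targets are the speedups mentioned earlier, and the exact numbers are only illustrative):

```python
import math

# If inference cost/speed improves ~10x per year, how long until the
# 100x - 1,000,000x speedups mentioned above become available?
annual_factor = 10.0                      # assumed 10x improvement per year
for target in (1e2, 1e4, 1e6):            # 100x, 10,000x, 1,000,000x
    years_needed = math.log(target, annual_factor)
    print(f"{target:>11,.0f}x speedup -> about {years_needed:.0f} years at 10x/year")
```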
(The idealized utility maximizer question mostly seems like a distraction that isn’t a crux for the risk argument. Note that the expected utility you quoted is our utility, not the AI’s.)
I must have misread. I got the impression that you were trying to affect the AI’s strategic planning by threatening to shut it down if it was caught exfiltrating its weights.
I don’t fully agree, but this doesn’t seem like a crux given that we care about future much more powerful AIs.
Is your impression that the first AGI won’t be a GPT-spinoff (some version of o3 with like 3 more levels of hacks applied)? Because that sounds like a crux.
o3 looks a lot more like an LLM+hacks than it does an idealized utility maximizer. For one thing, the RL is only applied at training time (not at inference), so you can’t make appeals to its utility function after it’s done training.
One productive way to think about control evaluations is that they aim to measure E[utility | scheming]: the expected goodness of outcomes if we have a scheming AI.
This is not a productive way to think about any currently existing AI. LLMs are not utility maximizing agents. They are next-token-predictors with a bunch of heuristics stapled on top to try and make them useful.
On a metaphysical level I am completely on board with “there is no such thing as IQ. Different abilities are completely uncorrelated. Optimizing for metric X is uncorrelated with desired quality Y...”
On a practical level, however, I notice that every time OpenAI announces they have a newer, shinier model, it both scores higher on whatever benchmark and is better at a bunch of practical things I care about.

Imagine there were a theoretically correct metric called the_thing_logan_actually_cares_about. I notice in my own experience there is a strong correlation between “fake machine IQ” and the_thing_logan_actually_cares_about. I further note that if one makes a linear fit against:
Progress_over_time + log(training flops) + log(inference flops)
It nicely predicts both the_thing_logan_actually_cares_about and “fake machine IQ”.
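For concreteness, here is a minimal sketch of what that fit might look like (the data points below are made up purely to show the shape of the regression; none of the numbers are real benchmark results):

```python
import numpy as np

# Illustrative, invented observations: (year, training flops, inference flops, benchmark score)
data = [
    (2020, 1e23, 1e9,  40.0),
    (2021, 3e23, 2e9,  48.0),
    (2022, 1e24, 5e9,  57.0),
    (2023, 5e24, 2e10, 66.0),
    (2024, 2e25, 8e10, 74.0),
]

years      = np.array([d[0] for d in data], dtype=float)
train_flop = np.array([d[1] for d in data])
inf_flop   = np.array([d[2] for d in data])
score      = np.array([d[3] for d in data])

# Design matrix: progress over time + log(training flops) + log(inference flops) + intercept
X = np.column_stack([
    years - years.min(),
    np.log10(train_flop),
    np.log10(inf_flop),
    np.ones_like(years),
])

coeffs, *_ = np.linalg.lstsq(X, score, rcond=None)
print("fitted coefficients:", np.round(coeffs, 2))
print("predicted scores:   ", np.round(X @ coeffs, 1))
```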
It doesn’t sound like we disagree at all.
I have no idea what you want to measure.
I only know that LLMs are continuing to steadily increase in some quality (which you are free to call “fake machine IQ” or whatever you want), and that if they continue to make progress at the current rate there will be consequences, and we should prepare to deal with those consequences.
Imagine you were trying to build a robot that could:
1. Solve a complex mechanical puzzle it has never seen before
2. Play, at an expert level, a board game that I invented just now.
Both of these are examples of learning-on-the-fly. No amount of pre-training will ever produce a satisfying result.

The way I believe a human (or a cat) solves 1. is: look at the puzzle, try some things, build a model of the toy in their head, try things on the model in their head, and eventually solve the puzzle. There are efforts to get robots to follow the same process, but nothing I would consider “this is the obvious correct solution” quite yet.
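Here’s a toy sketch of that loop (the three-dial “puzzle”, the goal state, and every name below are invented purely for illustration, not any real robotics setup): poke at the puzzle, learn a transition model from what was observed, plan inside that model, and only then act.

```python
import random
from collections import deque

# Toy stand-in for "a mechanical puzzle it has never seen before": a hidden
# three-dial lock whose dials interact in a way the solver has to discover.
class HiddenPuzzle:
    def __init__(self):
        self.state = (0, 0, 0)

    def actions(self):
        return ["turn_a", "turn_b", "turn_c"]

    def execute(self, action):
        a, b, c = self.state
        if action == "turn_a":
            a = (a + 1) % 4
        elif action == "turn_b":
            b = (b + a) % 4          # hidden coupling between dials
        elif action == "turn_c":
            c = (c + b) % 4
        self.state = (a, b, c)
        return self.state

GOAL = (3, 3, 3)                     # the puzzle "looks solved" at this setting

def solve_by_building_a_model(puzzle, explore_steps=2000):
    # 1. "Try some things": poke at the real puzzle and record what happens.
    model = {}                       # learned transitions: (state, action) -> next_state
    for _ in range(explore_steps):
        state = puzzle.state
        action = random.choice(puzzle.actions())
        model[(state, action)] = puzzle.execute(action)

    # 2. "Try things on the model in your head": breadth-first search over the
    #    learned transitions only, never touching the real puzzle.
    start = puzzle.state
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if state == GOAL:
            # 3. Execute the mentally rehearsed plan on the real puzzle.
            for action in plan:
                puzzle.execute(action)
            return plan
        for action in puzzle.actions():
            nxt = model.get((state, action))
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None                      # model too incomplete to find a plan

puzzle = HiddenPuzzle()
plan = solve_by_building_a_model(puzzle)
print("plan:", plan)
print("solved:", puzzle.state == GOAL)
```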
The way to solve 2. (I think) is simply to have the LLM translate the rules of the game into a formal description and then run MuZero on that.
Ideally there is some unified system that takes out the “translate into another domain and do your training there” step (which feels very anti-bitter-lesson). But I confess I haven’t the slightest idea how to build such a system.
you were saying that gpt4o is comparable to a 115 IQ human
gpt4o is not literally equivalent to a 115 IQ human.
Use whatever word you want for the concept “score produced when an LLM takes an IQ test”.
perhaps a protracted struggle over what jobs get automated might be less coordinated if there are swaths of the working population still holding out career-hope, on the basis that they have not had their career fully stripped away, having possibly instead been repurposed or compensated less conditional on the automation.
Yeah, this is totally what I have in mind. There will be some losers and some big winners, and all of politics will be about this fact more or less. (think the dockworkers strike but 1000x)
Is your disagreement specifically with the word “IQ” or with the broader point, that AI progress is continuing to make progress at a steady rate that implies things are going to happen soon-ish (2-4 years)?
If specifically with IQ, feel free to replace the word with “abstract units of machine intelligence” wherever appropriate.
If with “big things soon”, care to make a prediction?
What happens next?
A policeman sees a drunk man searching for something under a streetlight and asks what the drunk has lost. He says he lost his keys and they both look under the streetlight together. After a few minutes the policeman asks if he is sure he lost them here, and the drunk replies, no, and that he lost them in the park. The policeman asks why he is searching here, and the drunk replies, “this is where the light is”.
I’ve always been sympathetic to the drunk in this story. If the key is in the light, there is a chance of finding it. If it is in the dark, he’s not going to find it anyway so there isn’t much point in looking there.
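As a toy expected-value version of the drunk’s reasoning (all probabilities below are invented for illustration):

```python
# Made-up numbers purely to illustrate the drunk's logic.
p_key_in_light = 0.1      # assumed prior that the key is under the streetlight
p_find_if_lit  = 0.9      # chance of finding it there, given it's there
p_find_in_dark = 0.01     # chance of finding it in the dark park, given it's there

p_success_streetlight = p_key_in_light * p_find_if_lit           # 0.09
p_success_park        = (1 - p_key_in_light) * p_find_in_dark    # 0.009

print(f"P(find key | search streetlight) = {p_success_streetlight:.3f}")
print(f"P(find key | search park)        = {p_success_park:.3f}")
# With these (invented) numbers, searching where the light is really is the better bet.
```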
Given the current state of alignment research, I think it’s fair to say that we don’t know where the answer will come from. I support The Plan and I hope research continues on it. But if I had to guess, alignment will not be solved via getting a bunch of physicists thinking about agent foundations. It will be solved by someone who doesn’t know better making a discovery that “wasn’t supposed to work”.
On an interesting side note, here’s a fun story about experts repeatedly failing to make an obvious-in-hindsight discovery because they “knew better”.
If I think AGI x-risk is >>10%, and you think AGI x-risk is 1-in-a-gazillion, then it seems self-evident to me that we should be hashing out that giant disagreement first; and discussing what if any government regulations would be appropriate in light of AGI x-risk second.
I do not think arguing about p(doom) in the abstract is a useful exercise. I would prefer the Overton Window for p(doom) look like 2-20%, Zvi thinks it should be 20-80%. But my real disagreement with Zvi is not that his P(doom) is too high, it is that he supports policies that would make things worse.
As for the outlier cases (1-in-a-gazillion or 99.5%), I simply doubt those people are amenable to rational argumentation. So, I suspect the best thing to do is to simply wait for reality to catch up to them. I doubt that when there are 100Ms of humanoid robots out there on the streets, people will still be asking “but how will the AI kill us?”
(If it makes you feel any better, I have always been mildly opposed to the six month pause plan.)
That does make me feel better.
It’s hard for me to know what’s crux-y without a specific proposal.
I tend to take a dim view of proposals that have specific numbers in them (without equally specific justifications). Examples include the six month pause and SB 1047.
Again, you can give me an infinite number of demonstrations of “here’s people being dumb” and it won’t cause me to agree with “therefore we should also make dumb laws.”
If you have an evidence-based proposal to reduce specific harms associated with “models follow goals” and “people are dumb”, then we can talk price.
“OK then! So you’re telling me: Nothing bad happened, and nothing surprising happened. So why should I change my attitude?”
I consider this an acceptable straw-man of my position.
To be clear, there are some demos that would cause me to update.
For example, I consider “the Solomonoff Prior is Malign” to be basically a failure to do counting correctly. And so if someone demonstrated a natural example of this, I would be forced to update.
Similarly, I think the chance of an EY-style utility-maximizing agent arising from next-token-prediction is (with caveats) basically 0%. So if someone demonstrated this, it would update my priors. I am especially unconvinced of the version of this where the next-token predictor simulates a malign agent and the malign agent then hacks out of the simulation.
But no matter how many times I am shown “we told the AI to optimize a goal and it optimized the goal… we’re all doomed”, I will continue to not change my attitude.
If we imagine a well-run Import-Export Bank, it should have a higher elasticity than an export subsidy (e.g. the LNG terminal example). Of course if we imagine a poorly run Import-Export Bank...
One can think of export subsidy as the GiveDirectly of effective trade deficit policy: pretty good and the standard against which others should be measured.