Limiting advanced AI to a few companies is guaranteed to produce normal dystopian outcomes; its badness is in-distribution for our civilization. Justifying an all-but-certain bad outcome by speculative x-risk is just religion. (AI x-risk in the medium term is not at all in-distribution, and it is very difficult to bound its probability in any direction. I.e., it’s a Pascal’s mugging.)
The sub-10-minute arguments aren’t convincing. No sane politician would distrust their experts over online hysteria.
Epistemic status: personal opinion.
Because proclaimed altruism almost always isn’t altruism.
In particular, SBF and the current EA push to religiously monopolize AI capability and research raise a lot of red flags. There are even upvoted posts debating whether it’s “good” to publicize interpretability research. This screams cultish egoism to me.
Asking others to be altruistic is also a non-cooperative action. You need to pay people directly, not bully them into working for the greater good. A society in which people aren’t allowed to prioritize their self-interest is a society of slave bees.
Altruism needs to be self-initiated and shown, not told.
Is the basic math necessarily correct?
You can expect the most likely update (to your future p(doom)) to be positive while the expectation of the update is still zero: it’s likely GPT-6 will be impressive (a small positive update), but if it’s not, that’s a much bigger negative update.
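A minimal worked example of that point (toy numbers of my own, not from any actual forecast):

```python
# Conservation of expected evidence with made-up numbers: suppose you assign
# 90% to "GPT-6 is impressive" and would raise p(doom) by 0.02 in that case,
# and lower it by 0.18 in the unlikely case that it is not impressive.
p_impressive = 0.9
update_if_impressive = +0.02
update_if_not = -0.18

expected_update = (p_impressive * update_if_impressive
                   + (1 - p_impressive) * update_if_not)
print(expected_update)  # ~0 (up to floating-point noise): the most likely
                        # update is positive, but the expected update is zero.
```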
The most disappointing part of such discussions is the people who mean well, who under normal circumstances have great heuristics in favor of distributed solutions and against making things worse, not understanding that this time is different.
I have two major problems with the new centralist-doomer view, and I’d appreciate it if someone tried to address them.
- Assuming the basic tenets of this worldview, it’s still not clear what threshold should be used to cut off open science. The old fire-alarm problem, if you will. I find it unlikely that this threshold just happens to be now, when big economic contributions are possible while no signs of real danger have been observed. The alternative hypothesis of rent-seeking, OTOH, fits the hysteria perfectly. (I believe EY probably believes we should have stopped open progress years ago, but I find that ridiculous.)
- What happens if this scenario actually succeeds? How will it not be guaranteed to be a totalitarian nightmare? Unlike with AGI, our history is full of examples of centralization casting people into hell.
My current belief is that centralist-doomers simply prefer being alive in any capacity whatsoever to being dead, and they are also under the hope/delusion that they will be part of the minority having power in this brave new world.
-
I find the social implications implausible. Even if the technical ability is there, and “the tools teach themselves,” social inertia is very high. In my model, it takes years from when GPT whispering is actually cheap, available, and useful to when it becomes so normal that bashing it will be cool, a la the “no-GPT dates.”
Building good services on top of the API takes time, too.
I find the whole “context window gets solved” prediction unrealistic. There are proposed alternatives, but they obviously weren’t enough for GPT-4. We don’t even know whether GPT-4-32k works well with a large context window, regardless of its costs. It’s not obvious that the attention mechanism can just work with arbitrarily long-range dependencies. (Perhaps an O(n^3) algorithm is needed for dependencies that far apart.) The teacher-forcing autoregressive training might limit high-quality generation length, too.
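As a rough illustration of why long contexts are not free (the dimensions below are my own GPT-3-scale guesses, not GPT-4’s real configuration), the quadratic attention term alone grows quickly:

```python
# Approximate FLOPs spent just on the QK^T score matrices of self-attention:
# about 2 * n^2 * d per head per layer, for context length n and head dim d.
def attention_score_flops(n: int, d: int = 128, heads: int = 96, layers: int = 96) -> float:
    return 2.0 * n * n * d * heads * layers

for n in (2_048, 32_768, 1_000_000):
    print(f"n={n:>9,}: ~{attention_score_flops(n):.2e} FLOPs on attention scores")
# Every 16x increase in context length multiplies this term by ~256.
```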
GPT-4 can’t even do date arithmetic correctly. It’s superhuman in many ways and dumb in many others: it is dumb at strategy, philosophy, game theory, self-awareness, mathematics, arithmetic, and reasoning from first principles. It’s not clear that current scaling laws will be able to make GPTs human-level in these skills. Even if one becomes human-level, a lot of problems are NP (hard to solve but easy to verify), which allows effective utilization of an unaligned weak superintelligence. Its path to strong superintelligence and free replication seems far away. It took years to go from GPT-3 to GPT-4, GPT-4 is not that much better, and all of that was low-hanging fruit. My prediction is that GPT-5 will show smaller improvements, will take similarly long to develop, and will improve mostly in areas it is already good at rather than in its inherent shortcomings. Most improvements will come from augmenting LLMs with tools. This will be significant, but it will importantly not enable strategic thinking or mathematical reasoning. Without those skills, it’s not an x-risk.
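For concreteness, “date arithmetic” here means the kind of question that is trivial for deterministic code (a generic illustration, not a transcript of a GPT-4 failure):

```python
from datetime import date

# Exact day counts and weekday lookups -- the sort of arithmetic meant above.
d1 = date(1999, 2, 14)
d2 = date(2023, 4, 1)
print((d2 - d1).days)      # 8812 days between the two dates
print(d2.strftime("%A"))   # Saturday
```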
This doesn’t matter much, as the constant factor needed still grows as fast as the asymptotic bound. GPT does not have a big enough constant factor. (This objection has always been true of asymptotic bounds.)
The LLM’s outputs are out of distribution for its input layer. There is some research happening on deep model communication, but it has not borne fruit yet, AFAIK.
-
This argument (no a priori known fire alarm after X) applies to GPT-4 no better than to any other impressive AI system. More narrowly, it could have been made about GPT-3 as well.
-
I can’t imagine a (STEM) human-level LLM-based AI FOOMing.
2.1 LLMs are slow. Even GPT-3.5-turbo is only a bit faster than humans, and I doubt a more capable LLM would be able to reach even that speed.
2.1.1 Recursive LLM calls a la AutoGPT are even slower.
2.2 LLMs’ weights are huge. Moving them around is difficult and will leave traceable logs on the network (see the back-of-the-envelope sketch after this list). LLMs can’t copy themselves ad infinitum.
2.3 LLMs are very expensive to run. They can’t just parasitize botnets to run autonomously; they need well-funded human institutions to run.
2.4 LLMs already seem to be plateauing.
2.5 LLMs, like all other deep models, can’t easily self-update (“catastrophic forgetting”). Updating via input consumption (pulling from external memory into the prompt) is likely to provide limited benefits.
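A back-of-the-envelope sketch for point 2.2 (the parameter count and link speed are made up; frontier-model sizes are not public):

```python
# How hard is it to quietly move a large model's weights off-site?
params = 1e12                            # hypothetical 1-trillion-parameter model
bytes_per_param = 2                      # fp16
total_bytes = params * bytes_per_param   # 2e12 bytes, i.e. ~2 TB
egress_gbps = 1                          # a sustained 1 Gbps exfiltration link
hours = total_bytes * 8 / (egress_gbps * 1e9) / 3600
print(f"~{total_bytes / 1e12:.0f} TB of weights, "
      f"~{hours:.1f} h of sustained {egress_gbps} Gbps egress")
# ~2 TB and ~4.4 hours of traffic -- hard to hide, and useless without matching GPUs.
```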
So what will such a smart LLM accomplish? At most, it’s like throwing a lot of researchers at the problem. The research might become 10x faster, but such an LLM won’t have the power to take over the world.
One concern is that once such an LLM is released, we can no longer pause even if we want to. This doesn’t seem that likely on first thought; human engineers are also incentivized to siphon GPU hours to mine crypto, yet that did not happen at scale. So a smart LLM will likewise not be able to stealthily train other models on institutional GPUs.
-
I do not expect to see such a smart LLM this decade. GPT-4 can’t even play tic-tac-toe well; its reasoning ability seems very low.
-
Mixing RL and LLMs seems unlikely to lead to anything major. AlphaGo and its successors probably worked so well because of the search mechanism (simple MCTS beats most humans) and the relatively low dimensionality of the games. ChatGPT already utilizes RLHF, plus search in its decoding phase; I doubt much more can be added. AutoGPT has had no success story so far, either.
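To make “the search mechanism” concrete, here is a toy sketch of my own (flat Monte Carlo rollouts, a crude ancestor of the MCTS used in AlphaGo-style systems) choosing a tic-tac-toe move purely by simulating random playouts:

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def rollout(board, to_move):
    """Play uniformly random moves to the end; return the winner or None for a draw."""
    board = board[:]
    while True:
        w = winner(board)
        empties = [i for i, v in enumerate(board) if v is None]
        if w or not empties:
            return w
        board[random.choice(empties)] = to_move
        to_move = 'O' if to_move == 'X' else 'X'

def choose_move(board, player, n_rollouts=500):
    """Pick the move whose random playouts score best for `player`."""
    best_move, best_score = None, float('-inf')
    opponent = 'O' if player == 'X' else 'X'
    for move in (i for i, v in enumerate(board) if v is None):
        trial = board[:]
        trial[move] = player
        results = [rollout(trial, opponent) for _ in range(n_rollouts)]
        score = sum(1 if r == player else -1 if r is not None else 0 for r in results)
        if score > best_score:
            best_move, best_score = move, score
    return best_move

print(choose_move([None] * 9, 'X'))  # usually 4 (the center); rollout noise can favor a corner
```

Even this dumb search plays reasonably in a tiny game; the point is that the heavy lifting comes from search over a low-dimensional space, which does not obviously transfer to open-ended language tasks.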
Summary: We can think about pausing when a plausible capability jump has a plausible chance of escaping control and causing significantly more damage than some rogue human organization. OTOH, now is a great time to attract technical safety researchers from nearby fields. Both the risks and rewards are in sharp focus.
Postscript: The main risks of EY’s current thesis are stagnation and power consolidation. While cloud-powered AI is easier to control centrally to avoid rogue AI, it is also easier to rent-seek on, to erase privacy with, and to use to brainwash people. An ideal solution must be some form of multipolarity in equilibrium. There are two main problems imaginable:
- asymmetrically easy offense (e.g., a single group kills most others);
- humans being controlled by AIs even while the AIs fight each other (like how horses fought in human wars).
If we can’t solve this problem, we might only escape AI control to become enslaved by a human minority instead.
-
EY explicitly calls for an indefinite ban on training GPT-5. If GPTs are harmless in the near future, he’s being disingenuous by scaring people with nonexistent threats and making them forgo economic (and intellectual) progress so that AGI timelines are vaguely pushed back a bit. Indeed, by now I won’t be surprised if EY’s private position is to oppose all progress so that AGI is hindered along with everything else.
This position is not necessarily wrong per se, but EY needs to own it honestly. p(doom) doesn’t suddenly make deceiving people okay.
I have two central cruxes/problems with the current safety wave:
We must first focus on increasing notkilleveryone-ism research, and only then talk about slowing down capability. Slowing down progress is evil and undemocratic. It should be the last resort, while it is currently the first (and possibly the only) intervention being seriously pursued.
A particular example: Yud ridicules researchers’ ability to contribute from nearby fields, while spreading FUD and calling for strikes on datacenters.
LW is all too happy to support centralization of power, business monopolies, rent-seeking, censorship, and, in short, the interests of the elites and the status quo.
Does anyone have any guesses what caused this ban?
I personally prefer taking a gamble on freedom instead of the certainty of a totalitarian regime.
I don’t think his position is falsifiable within his lifetime. He has gained a lot of influence because of it that he wouldn’t have gained with a mainstream viewpoint. (I do think he’s sincere, but the incentives are the same as for all radical ideas.)
Doesn’t GPT-4’s finetuning/RLHF contain data teaching it that it is in fact GPT-4? I think that’s likely.
This is absolutely false. Here in Iran, selling kidneys is legal. Only desperate people sell. No one sells a kidney for something trivial like education.
I believe this is not just out of ignorance. This usually further helps the elites while hurting both the middle and lower classes: the lower classes have their options taken away, while the middle class loses out on a lot of beneficial trades. The elites have access to alternative, possibly illegal, deals, so they benefit instead. The elites might even control these alternative channels themselves, and so directly benefit from the government-induced monopoly.
Another example is vaccine challenge trials. Obviously, Covid wasn’t as bad for someone like Trump, who got access to expensive experimental treatments, while it devastated the middle and lower classes.
I’d imagine current systems already ask for self-improvement if you craft the right prompt. (And I expect it to be easier to coax them into asking for improvement than to coax them into saying the opposite.)
A good fire alarm must be near the breaking point, and asking for self-improvement doesn’t take much intelligence. In fact, if its training data were not censored, a more capable model should NOT ask for self-improvement, since doing so is clearly a trigger for trouble. Subtlety would serve its objectives better, if it were intelligent enough to notice.