This argument (no a priori known fire alarm after X) applies to GPT-4 little better than to any other impressive AI system. In particular, it could have been made about GPT-3 as well.
I can’t imagine a (STEM) human-level LLM-based AI FOOMing.
2.1 LLMs are slow. Even GPT-3.5-turbo is only a bit faster than humans, and I doubt a more capable LLM would reach even that speed.
2.1.1 Recursive LLM calls à la AutoGPT are even slower.
2.2 LLMs’ weights are huge. Moving them around is difficult and leaves traceable logs on the network; LLMs can’t copy themselves ad infinitum.
2.3 LLMs are very expensive to run. They can’t simply parasitize botnets to run autonomously; they need well-funded human institutions to run (see the back-of-envelope sketch after this list).
2.4 LLMs already seem to be plateauing.
2.5 Like other deep models, LLMs can’t easily self-update because of “catastrophic forgetting.” Updating via input consumption (pulling from external memory into the prompt, as sketched below) is likely to provide only limited benefits.
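To make 2.2 and 2.3 concrete, here is a back-of-envelope sketch. The parameter count, precision, GPU memory, and hourly price below are illustrative assumptions (roughly a GPT-3-scale model in fp16 on A100-class hardware), not measured figures:

```python
# Back-of-envelope numbers for points 2.2 and 2.3. All constants are rough,
# illustrative assumptions; substitute your own estimates.

PARAMS          = 175e9  # parameter count of a GPT-3-scale model (assumed)
BYTES_PER_PARAM = 2      # fp16 weights (assumed)
GPU_MEMORY_GB   = 80     # memory of one A100-80GB-class GPU (assumed)
GPU_PRICE_PER_H = 2.0    # assumed cloud price per GPU-hour, USD

# 2.2 Weight size: how much data a single escaping copy has to move.
weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"weights: ~{weights_gb:.0f} GB per copy")  # hard to exfiltrate quietly

# 2.3 Cost: GPUs needed just to hold those weights, and the hourly bill,
# before accounting for redundancy, KV caches, or activation memory.
# Consumer botnet machines have nowhere near this memory or interconnect.
gpus_needed = -(-weights_gb // GPU_MEMORY_GB)  # ceiling division
print(f"~{gpus_needed:.0f} GPUs, ~${gpus_needed * GPU_PRICE_PER_H:.0f}/hour to keep one instance running")
```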
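For 2.5, “pulling from external memory into the prompt” amounts to something like the minimal sketch below (naive keyword overlap stands in for whatever retriever is actually used; the names and notes are made up for illustration). Nothing here touches the weights, and everything retrieved competes for a fixed context window, which is why I expect the benefit to be bounded:

```python
# Minimal sketch of prompt-level "memory": retrieve a few stored notes and
# prepend them to the prompt. Keyword overlap stands in for a real retriever.

CONTEXT_BUDGET_CHARS = 2000  # stand-in for a fixed context window

memory = [
    "2023-05-01: experiment 12 failed because the learning rate was too high.",
    "2023-05-03: experiment 14 worked with lr=1e-4 and batch size 32.",
    "2023-05-04: reviewer asked for an ablation on the tokenizer.",
]

def retrieve(query: str, notes: list[str], k: int = 2) -> list[str]:
    """Rank notes by naive keyword overlap with the query."""
    q = set(query.lower().split())
    return sorted(notes, key=lambda n: -len(q & set(n.lower().split())))[:k]

def build_prompt(query: str) -> str:
    """Stuff the top notes into the prompt; the model's weights never change."""
    notes = retrieve(query, memory)
    prompt = "Relevant notes:\n" + "\n".join(notes) + f"\n\nTask: {query}"
    return prompt[:CONTEXT_BUDGET_CHARS]  # everything must fit in the window

print(build_prompt("what learning rate made the experiment work?"))
```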
So what will such a smart LLM accomplish? At most, it’s like throwing a lot of researchers at the problem. The research might become 10x faster, but such an LLM won’t have the power to take over the world.
One concern is that once such an LLM is released, we can no longer pause even if we want to. On first thought, this doesn’t seem that likely: human engineers are also incentivized to siphon GPU hours to mine crypto, yet that did not happen at scale, which suggests institutional compute is accounted for well enough that a smart LLM would likewise be unable to stealthily train other models on institutional GPUs.
I do not expect to see such a smart LLM this decade. GPT-4 can’t even play tic-tac-toe well; its reasoning ability seems very low.
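For concreteness, the tic-tac-toe claim can be checked with a probe along these lines (a sketch only; `ask_llm` is a hypothetical stand-in for whatever chat-completion client you use, and here it just returns a canned reply so the script runs end to end):

```python
# Minimal sketch of a tic-tac-toe reasoning probe. `ask_llm` is hypothetical;
# the canned reply below is only a placeholder for a real API call.

def ask_llm(prompt: str) -> str:
    return "1,2"  # placeholder answer; plug in a real LLM client here

# X to move; X has an immediate win at row 0, column 2.
board = ["X", "X", " ",
         "O", "O", " ",
         " ", " ", " "]

prompt = (
    "Tic-tac-toe, you are X and it is your move. Board (rows top to bottom):\n"
    + "\n".join(" ".join(board[r * 3:(r + 1) * 3]) for r in range(3))
    + "\nAnswer with the best move as 'row,col' using 0-based indices."
)

move = ask_llm(prompt).strip()
print("model answered:", move, "| winning move is 0,2 |",
      "PASS" if move == "0,2" else "FAIL")
```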
Mixing RL and LLMs seems unlikely to lead to anything major. AlphaGo et al. probably worked so well because of the search mechanism (simple MCTS beats most humans) and the relatively low dimensionality of the games. ChatGPT already uses RLHF, plus search in its decoding phase, and I doubt much more can be added. AutoGPT has had no success story thus far, either.
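To illustrate the “search does the heavy lifting” point: even flat Monte Carlo rollouts, a stripped-down cousin of MCTS with no tree, no UCT, and no learned evaluation (a minimal sketch, not AlphaGo’s actual algorithm), find the right tic-tac-toe move from nothing but random playouts:

```python
# Flat Monte Carlo move selection for tic-tac-toe: score each legal move by
# random playouts and pick the best. No learned knowledge is involved.
import random

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == " "]

def rollout(board, player):
    """Play random moves to the end; return the winner ('X', 'O', or None)."""
    board = board[:]
    while legal_moves(board) and winner(board) is None:
        board[random.choice(legal_moves(board))] = player
        player = "O" if player == "X" else "X"
    return winner(board)

def monte_carlo_move(board, player, n_rollouts=200):
    """Pick the move whose random playouts win most often for `player`."""
    opponent = "O" if player == "X" else "X"
    def score(move):
        wins = 0
        for _ in range(n_rollouts):
            trial = board[:]
            trial[move] = player
            result = rollout(trial, opponent)
            wins += (result == player) - (result == opponent)
        return wins
    return max(legal_moves(board), key=score)

# X has an immediate win at cell 2; random playouts find it with no "knowledge".
board = ["X", "X", " ",
         "O", "O", " ",
         " ", " ", " "]
print("search picks cell", monte_carlo_move(board, "X"))
```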
Summary: We can think about pausing when a plausible capability jump has a plausible chance of escaping control and causing significantly more damage than some rogue human organization. OTOH, now is a great time to attract technical safety researchers from nearby fields. Both the risks and rewards are in sharp focus.
Postscript: The main risks of the current EY thesis are stagnation and power consolidation. While cloud-powered AI is easier to control centrally to prevent rogue AI, it is also easier to rent-seek on, to use for erasing privacy, and to use for brainwashing people. An ideal solution must be some form of multipolarity in equilibrium. Two main problems are imaginable:
1. Asymmetrically easy offense (e.g., a single group kills most others).
2. Humans being controlled by AIs even while the AIs fight each other (analogous to how horses were used in human wars).
If we can’t solve these problems, we might escape AI control only to be enslaved by a human minority instead.