There seems to be a lot of giant cheesecake fallacy in AI risk discussion. Only the things leading up to the AGI threshold are relevant to the AI risk faced by humans; the rest is the AGIs’ problem.
Given ChatGPT’s current capability, with the imminent prospect of a day-long context window, there is nothing left but tuning, including self-tuning, to reach the AGI threshold. Nothing in its architecture or basic training setup needs to change for it to become AGI: only tuning that gets it over a sanity/agency threshold of productive autonomous activity, plus iterative batch retraining on new self-written data, reports, and research. It could be done much better in other ways, but changing anything is no longer necessary to get there.
So AI risk is now exclusively about fine-tuning of LLMs; anything else is giant cheesecake fallacy, possible in principle but not relevant now, and therefore probably not ever, as something humanity can influence. Though that still leaves everything but the kitchen sink: fine-tuning could make use of any observations about alignment, decision theory, and so on, possibly just as informal arguments fed to LLMs at key points, cumulatively to decisive effect.
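To make the claimed mechanism concrete, here is a minimal schematic sketch of the loop described above: generate self-written material, gate it on some sanity/agency threshold, and periodically batch-retrain on the accumulated corpus. Everything here is hypothetical; the function names (`generate_reports`, `passes_sanity_check`, `retrain`) are stand-ins rather than any real API, and the sketch only illustrates the shape of the process, not an actual implementation.

```python
# Schematic sketch of the iterative self-tuning loop described above.
# All functions are hypothetical placeholders, not real APIs; they only
# illustrate the claimed shape of the process: generate -> gate -> retrain.

from typing import List


def generate_reports(model: dict, n: int) -> List[str]:
    """Stand-in for the model autonomously writing new data/reports/research."""
    return [f"report-v{model['version']}-{i}" for i in range(n)]


def passes_sanity_check(report: str) -> bool:
    """Stand-in for the sanity/agency threshold: keep only productive output."""
    return True  # placeholder gate


def retrain(model: dict, corpus: List[str]) -> dict:
    """Stand-in for batch retraining on the accumulated self-written corpus."""
    return {"version": model["version"] + 1, "corpus_size": len(corpus)}


def self_tuning_loop(rounds: int = 3, reports_per_round: int = 5) -> dict:
    model = {"version": 0, "corpus_size": 0}
    corpus: List[str] = []
    for _ in range(rounds):
        # 1. The model writes new material during autonomous activity.
        new_reports = generate_reports(model, reports_per_round)
        # 2. Only material clearing the sanity/agency threshold is kept.
        corpus.extend(r for r in new_reports if passes_sanity_check(r))
        # 3. Iterative batch retraining on everything accumulated so far.
        model = retrain(model, corpus)
    return model


if __name__ == "__main__":
    print(self_tuning_loop())
```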