9 respondents were concerned about an overreliance or overemphasis on certain kinds of theoretical arguments underpinning AI risk
I agree with this, but that “the horsepower of AI is instead coming from oodles of training data” is not a fact that seems relevant to me, except in the sense that this is driving up AI-related chip manufacturing (which, however, wasn’t mentioned). The reason I argue it’s not otherwise relevant is that the horsepower of ASI will not, primarily, come from oodles of training data. To the contrary, it will come from being able to reason, learn and remember better than humans do, and since (IIUC) LLMs function poorly if they are trained only on a dataset sized for human digestion, this implies AGI and ASI need less training data than LLMs, probably much less, for a given performance level (which is not to say more data isn’t useful to them, it’s just not what makes them AGI). So in my view, making AGI (and by extension AGI alignment) is mainly a matter of algorithms that have not yet been invented and therefore cannot be empirically tested, and less a matter of training data. Making ASI, in turn, is mainly a matter of compute (which already seems too abundant).
(disclaimer: I’m not an AI expert. Also it’s an interesting question whether OpenAI will find a trick that somehow turns LLMs into AGI with little additional innovation, but supposing that’s true, does the AGI alignment community have enough money and compute to do empirical research in the same direction, given the disintegration of OpenAI’s Superalignment Team?)
I agree the first AGIs probably won’t be epistemically sound agents maximizing an objective function: even rationalists have shown little interest in computational epistemology, and the dangers of maximizers seem well-known so I vaguely doubt leading AGI companies are pursuing that approach. Epistemically-poor agents without an objective function are often quite dangerous even with modest intelligence, though (e.g. many coup attempts have succeeded on the first try). Capabilities people seem likely to try human-inspired algorithms, which argues for alignment research along the same lines, but I’m not sure if this will work:
A random alignment researcher would likely invent something different than a random (better-funded) capabilities researcher, so they’d end up aligning the “wrong” architecture (edit: though I expect some transferability of whatever was learned about alignment to comparable architectures)
Like gain-of-function (GoF) research, developing any AGI carries a risk that the key insight behind it, or even the source code, escapes into the wild before alignment is solved
Solving alignment doesn’t force others to use the solution.
So while the criticism seems sound, what should alignment researchers do instead?
Other brief comments:
Too insular: yes, ticking off “AI safety” and e/acc researchers does us no favors, but to what extent can it be avoided?
Bad messaging: sure, can we call it AGI alignment please? And I think a thing we should do even for our own benefit is to promote detailed and realistic stories (ideally after review by superforecasters and alignment researchers) of how an AGI world could play out. Well-written stories are good messaging, and Moloch knows human fiction needs more realism. (I tried to write such a story but had no time to finish it, so I published a draft before noticing the forum’s special rules, and it was not approved for publication.) P.S. It doesn’t sound like Evan and Krueger contradict each other. P.P.S. A quip for consideration: psychopathy is not the presence of evil, but the absence of empathy. Do you trust AI corporations to build empathy? P.P.P.S. What about memetic countermessaging? I don’t know where the “AI safety is an immense juggernaut turning everyone into crazy doomers” meme comes from (Marc Andreessen?) but it seems very popular.
“Their existence is arguably making AGI come sooner, and fueling a race that may lead to more reckless corner-cutting”: yes, but (1) was it our fault specifically and (2) do we have a time machine?
“general mistrust of governments in rationalist circles, not enough faith in our ability to solve coordination problems, and a general dislike of “consensus views””: Oh hell yes. EAs have a somewhat different disposition, though?