“Despite all the reasons we should believe that we are fucked, there might just be some reasons we don’t yet know about for why everything will go alright” is a really poor argument, IMO.
...AI that is smart enough to discover new physics may also discover separate and efficient physical resources for what it needs, instead of grabby-alien-style lightconing it through the Universe.
This especially feels A LOT like you are starting from hopes and rationalizing them. We have veeeeery little reason to believe that might be true… and also, do you just want to abandon that resource-rich physics to the AI, instead of having it be used by humans to live nicely?
I think Yudkowsky put it nicely in this tweet while arguing with Ajeya Cotra:
Look, from where I stand, it’s obvious from my perspective that people are starting from hopes and rationalizing them, rather than neutrally extrapolating forward without hope or fear, and the reason you can’t already tell me what value was maxed out by keeping humans alive, and what condition was implied by that, is that you started from the conclusion that we were being kept alive, and didn’t ask what condition we were being kept alive in, and now that a new required conclusion has been added—of being kept alive in good condition—you’ve got to backtrack and rationalize some reason for that too, instead of just checking your forward prediction to find what it said about that.
I feel like briefly discussing every point on the object level (even though you don’t offer object-level discussion yourself: you don’t argue why the things you list are plausible, just that they could be the case):
...Recursive self-improvement is an open research problem, is apparently needed for a superintelligence to emerge, and maybe the problem is really hard.
It is not necessary [for a superintelligence to emerge]. If the problem is easy, we are fucked and should spend time thinking about alignment; if it’s hard, we are just wasting some time thinking about alignment (it is not a Pascal’s mugging). This is just safety mindset, and the argument works for almost every point to justify alignment research, but I think you are addressing doom rather than the need for alignment.
The short version of RSI is: self-improvement (SI) seems to be a cognitive process, so if something is better at cognition it can self-improve better. Rinse and repeat. The long version. I personally think that just the step from neural nets to algorithms (which is what perfectly successful interpretability would imply) might be enough to yield dramatic improvements in speed and cost. Enough to be dangerous, probably even starting from GPT-3.
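To make the “rinse and repeat” step concrete, here is a toy numerical sketch (the numbers and the functional form of `improvement_rate` are invented purely for illustration, not a model of any real system): if capability also buys a higher rate of self-improvement, growth compounds on itself instead of staying linear.

```python
# Toy illustration of the RSI feedback loop. All numbers and the functional
# form of improvement_rate() are made up; this only illustrates the
# "better cognition -> better self-improvement" step, nothing more.

def improvement_rate(capability: float) -> float:
    """Assumption: the more capable the system already is,
    the better it is at improving itself."""
    return 0.01 * capability

def run(steps: int = 10) -> None:
    capability = 1.0  # arbitrary units
    for step in range(steps):
        capability += improvement_rate(capability) * capability
        print(f"step {step}: capability = {capability:.3f}")

if __name__ == "__main__":
    run()
```

With a flat `improvement_rate` the growth would merely be exponential; the whole disagreement is about which regime we end up in.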
...Pushing ML toward and especially past the top 0.1% of human intelligence level (IQ of 160 or something?) may require some secret sauce we have not discovered or have no clue that it would need to be discovered.
...An example of this might be a missing enabling technology, like internal combustion for heavier-than-air flight (steam engines were not efficient enough, though very close). Or like needing the Algebraic Number Theory to prove the Fermat’s last theorem. Or similar advances in other areas.
...Improvement AI beyond human level requires “uplifting” humans along the way, through brain augmentation or some other means.
This has been claimed time and time again; people thinking this just 3 years ago would have predicted GPT-4 to be impossible without many breakthroughs. ML hasn’t hit a wall yet, but maybe it will soon?
...Without it, we would be stuck with ML emulating humans, but not really discovering new math, physics, chemistry, CS algorithms or whatever.
What are you actually arguing? You seem to imply that humans don’t discover new math, physics, chemistry, CS algorithms...? 🤔
AGIs (not ASIs) are still plenty dangerous because they run on silicon. Compared to bio-humans they don’t sleep, don’t get tired, have a speed advantage, can communicate with each other easily, can self-modify easily (sure, maybe not foom-style RSI, but self-modification is on the table), and can self-replicate without being constrained by willingness to have kids, by the need for physical space, food, or health, by random IQ variance or random interests, or by the slow 20-30 years of growth humans need before becoming productive. GPT-4 might not write genius-level code, but it does write code faster than anyone else.
...Agency and goal-seeking beyond emulating what humans mean by it informally might be hard, or not being a thing at all, but just a limited-applicability emergent concept, sort of like the Newtonian concept of force (as in F=ma).
Why do you need something that goal-seeks beyond what humans informally mean?? Have you seen AutoGPT? What happens with AutoGPT when GPT gets smarter? Why would GPT-6 + AutoGPT not be a potentially dangerous goal-seeking agent?
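For what it’s worth, the AutoGPT pattern is basically a thin loop around the model. The sketch below is only a minimal illustration of that idea; `call_llm` and `execute_action` are hypothetical placeholders, not any real API, and the point is just that the “goal-seeking” wrapper itself is almost trivial: all the competence comes from the model inside it.

```python
# Minimal sketch of an AutoGPT-style loop. `call_llm` and `execute_action`
# are hypothetical placeholders (not any real API); the "agent" part is
# nothing more than a thin wrapper around the model.

def call_llm(prompt: str) -> str:
    """Placeholder for a call to some language model."""
    raise NotImplementedError

def execute_action(action: str) -> str:
    """Placeholder for actually carrying out the proposed action
    (run a command, browse, call an API, ...)."""
    raise NotImplementedError

def agent_loop(goal: str, max_steps: int = 10) -> list[tuple[str, str]]:
    """Ask the model for the next action toward `goal`, execute it,
    feed the result back, repeat."""
    history: list[tuple[str, str]] = []
    for _ in range(max_steps):
        prompt = (
            f"Goal: {goal}\n"
            f"Steps taken so far: {history}\n"
            "Reply with the single next action, or with DONE."
        )
        action = call_llm(prompt)
        if action.strip() == "DONE":
            break
        result = execute_action(action)
        history.append((action, result))
    return history
```

Swap in a smarter model and the same loop becomes a more competent goal-seeker, without anyone having solved “agency” as a separate problem.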
...We may be fundamentally misunderstanding what “intelligence” means, if anything at all. It might be the modern equivalent of the phlogiston.
Do you really need to fundamentally understand fire to understand that it burns your house down and that you should avoid letting it loose?? If we are wrong about intelligence… what? The superintelligence might not be smart?? Are you again arguing that we might not create an ASI soon?
I feel like the answer is just: “I think that probably some of the vast quantities of money being blindly and helplessly piled into here are going to end up actually accomplishing something.”
People, very smart people, are really trying to build superintelligence. Are you really betting against human ingenuity?
I’m sorry if I sounded aggressive in some of these points, but from where I stand these arguments don’t seem to be well thought out, and I don’t want to spend more time on a comment that six people will see and two will read.
You make a few rather strong statements very confidently, so I am not sure if any further discussion would be productive.
Well, I apologized for the aggressiveness/rudeness, but I am interested in whether I am mischaracterizing your position, or whether you really disagree with any particular “counter-argument” I have made.
You might object that OP is not producing the best arguments against AI-doom. In which case I ask, what are the best arguments against AI-doom?
I am honestly looking for them too.
The best I myself can come up with are brief glimmers like “maybe the ASI will be really myopic and the local maximum for its utility is a world where humans are happy long enough to figure out alignment properly, and maybe the AI will be myopic enough that we can trust its alignment proposals”, but then I think the takeoff is going to be really fast and the AI will just self-improve until it is able to see where the global maximum lies (also, since we want to know what the best world for humans looks like, we don’t really want a myopic AI), except that that maximum will not be aligned.
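The local-vs-global worry can be illustrated with a toy objective (the utility function and all numbers below are invented purely for illustration): a myopic hill-climber settles on the nearest peak, while an optimizer that searches more broadly finds the higher one, and nothing says the higher peak is the one we like.

```python
# Toy illustration: a myopic (greedy, local) optimizer vs a more thorough one.
# The utility function and numbers are made up; the point is only that which
# optimum gets found depends on how hard the optimizer searches.

import numpy as np

def utility(x: np.ndarray) -> np.ndarray:
    # Two peaks: a modest local one near x=1, a much higher global one near x=4.
    return np.exp(-(x - 1) ** 2) + 3 * np.exp(-(x - 4) ** 2)

def greedy_hill_climb(x0: float, step: float = 0.01) -> float:
    """Myopic optimizer: only ever moves to an immediately better neighbor."""
    x = x0
    while True:
        candidates = np.array([x - step, x, x + step])
        best = candidates[np.argmax(utility(candidates))]
        if best == x:
            return float(x)
        x = best

def global_search(lo: float = -2.0, hi: float = 6.0, n: int = 10_000) -> float:
    """Stronger optimizer: just looks everywhere."""
    xs = np.linspace(lo, hi, n)
    return float(xs[np.argmax(utility(xs))])

print(greedy_hill_climb(0.0))  # settles near x = 1, the local ("nice") peak
print(global_search())         # finds x ~ 4, the global peak
```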
I guess a weird counter-argument to AI-doom is “humans will just not build the Torment Nexus™ because they realize alignment is a real thing and they have too high a chance (>0.1%) of screwing up”, but I doubt that.
Thanks for the list, I’ve already read a lot of those posts, but I still remain unconvinced. Are you convinced by any of those arguments? Do you suggest I take a closer look at some of the posts?
But honestly, with the AI risk statement signed by so many prominent scientists and engineers, arguing that AI risks somehow don’t exist seems to be a fringe opinion, like climate-change denial, held by a few stubborn people (or people just not properly introduced to the arguments). I find it funny that we are in a position where “angels might save us” appears among the possible counter-arguments; thanks for the chuckle.
To be fair, I think this post argues about how overconfident Yudkowsky is in placing doom at 95%+, and sure, why not… But, as a person who doesn’t want to personally die, I cannot say that “it will be fine” unless I have good arguments as to why the p(doom) should be less than 0.1% and not “only 20%”!
But honestly, with the AI risk statement signed by so many prominent scientists and engineers,
Well, yes, the statement says “should be a global priority alongside other societal-scale risks”, not anything about brakes on capabilities research, or privileging this risk over others. This is not at all the cautionista stance. Not even watered down. It is only meant to raise the public profile of this particular x-risk’s existence.
I don’t get you. You are upset about people saying that we should scale back capabilities research, while at the same time holding the opinion that we are not doomed because we won’t get to ASI? You are worried that people might try to stop a technology that, in your opinion, may not even happen?? A technology that, if it does indeed happen, you agree that “If [ASI] wants us gone, we would be gone”?!?
That said, maybe you are misunderstanding the people who are calling for a stop. I don’t think anyone is proposing to stop narrow AI capabilities. Just the dangerous kind of general intelligence, “larger than GPT-4”. Self-driving cars good, automated general decision-making bad.
I’d also still like to hear your opinion on my counter arguments on the object level.
Consider also reading Scott Aaronson (whose sabbatical at OpenAI is about to end):
https://scottaaronson.blog/?p=7266
https://scottaaronson.blog/?p=7230
https://scottaaronson.blog/?p=7174
https://scottaaronson.blog/?p=7064
I did listen to that post, and while I don’t remember all the points, I do remember that it didn’t convince me that alignment is easy. Like Christiano’s post “Where I agree and disagree with Eliezer”, it just seems to say “a p(doom) of 95%+ is too much, it’s probably something like 10-50%”, which is still incredibly, unacceptably high to continue “business as usual”. I have faith that something will be done: regulation and breakthroughs will happen, but it seems likely that they won’t be enough.
It comes down to safety mindset. There are very few, and sketchy, reasons to expect that by default an ASI will care about humans enough, so it is not safe to build one until shown otherwise (preferably shown without actually creating one). And if I had to point out a single cause for my own high p(doom), it is the fact that we humans iterate on all of our engineering to iron out the kinks, while with a technology that is itself adversarial, iteration might not be available (we have to get it right the first time we deploy a powerful AI).
Who do you think are the two or three smartest people to be skeptical of AI killing all humans? I think maybe Yann LeCun and Andrew Ng.
Sure, those two. I don’t know about Ng (he recently had a private discussion with Hinton, but I don’t know what he thinks now), but I know LeCun hasn’t really engaged with the ideas and just relies on the concept that “it’s an extreme idea”. But as I said, having the position “AI doesn’t pose an existential threat” seems to be fringe nowadays.
If I dumb the argument down enough I get stuff like “intelligence/cognition/optimization is dangerous, and, whatever the reasons, we currently have zero reliable ideas on how to make a powerful general intelligence safe (e.g. RLHF doesn’t work well enough: GPT-4 still lies/hallucinates and is jailbroken way too easily)”, which is evidence-based, not weird, and not extreme.