Unfriendly AGI, despite its potential power, may well have huge blind spots; mind design space is big!
You may be right, but I don’t think it’s a very fruitful idea: what exactly do you propose doing? Also, building an FAI is a distinct effort from, e.g., curing malaria or fighting specific killer robots (the latter being quite hypothetical, while the question of technically understanding FAI seems inevitable).
This may be possible if an AGI has a combination of two features: it has significant real-world capabilities that make it dangerous, yet it’s insane or incapable enough not to be able to do AGI design. I don’t think that’s very plausible, since (1) even Nature was able to build us, given enough resources, and it has no mind at all, so it shouldn’t be fundamentally difficult to build an AGI (even for an irrational proto-AGI), and (2) we are at the lower threshold of being dangerous to ourselves, yet it seems we are already at the brink of building an AGI. An AGI that is dangerous (extinction-risk dangerous), and dangerous precisely because of its intelligence, yet not capable of building an AGI itself, doesn’t seem likely to me. But it may be possible for some time.
Now, consider the argument about humans being at the lowest possible cognitive capability needed to do much of anything, applied to proto-AGI-designed AGIs. AGI-designed AGIs are unlikely to be exactly as dangerous as the designer AGI; they are more likely to be significantly more or less dangerous, with “less dangerous” not being an interesting category if both kinds of designs occur over time. This expected danger adds to the danger of the original AGI, however inept it may itself be. And at some point you get to an FAI-theory-capable AGI that builds something rational, without once failing, all the way to the end of time.
I’d like to continue this conversation, but we’re both going to have to be more verbose. Both of us are speaking in a very compressed, allusive (that is, allusion-heavy) style, and the potential for miscommunication is high.
“I don’t think it’s a very fruitful idea: what exactly do you propose doing?”
My notion is that SIAI in general, and EY in particular, typically work with a specific “default future”: a world where, due to Moore’s law and the advance of technology generally, the difficulty of building a “general-purpose” intelligent computer program drops lower and lower, until one is accidentally or misguidedly created and the world is destroyed in a span of weeks. I understand that this default future is intended as a conservative worst-case possibility, not a most-probable scenario.
However, this scenario ignores the number and power of entities (such as corporations, human-computer teams, and special-purpose computer programs) which are more intelligent in specific domains than humans. It ignores their danger (the flourishing of human potential can be harmed by things other than pure software) and it ignores their potential as tools against unFriendly superintelligence.
Correcting that default future to something more realistic seems fruitful enough to me.
“Technically understanding FAI seems inevitable.”
What? I don’t understand this claim at all. Friendly artificial intelligence, as a theory, need not be developed before the world is destroyed or significantly harmed.
“This may be possible”
What is the referent of “this”? Techniques for combating, constraining, controlling, or manipulating unFriendly superintelligence? We already have these techniques. We harness all kinds of things which are not inherently Friendly and turn them to our purposes (rivers, nations, bacterial colonies). Techniques of building Friendly entities will grow directly out of our existing techniques of taming and harnessing the world, including but not limited to our techniques of proving computer programs correct.
I am not sure I understand your argument in detail, but from what I can tell, your argument is focused “internally”, within the aforementioned default future. My focus is on the fact that many very smart AI researchers are dubious about this default future, and on trying to update on and incorporate that information.
However, this scenario ignores the number and power of entities (such as corporations, human-computer teams, and special-purpose computer programs) which are more intelligent in specific domains than humans.
Good point. Even an Einstein-level AI with 100 times the computing power of an average human brain probably wouldn’t be able to beat Deep Blue at chess (at least not easily).
You may be right, but I don’t think it’s a very fruitful idea: what exactly do you propose doing?
Sending Summer Glau back in time, obviously. If you find an unfriendly AGI that can’t figure out how to build nanotech or terraform the planet, we’re saved.