Another type of intelligence explosion
I’ve argued that we might have to worry about dangerous non-general intelligences. In a series of back-and-forth exchanges with Wei Dai, we agreed that some level of general intelligence (such as humans seem to possess) seemed to be a great advantage, though possibly one with diminishing returns. Therefore a dangerous AI could be one with great narrow intelligence in one area, and a little bit of general intelligence in others.
The traditional view of an intelligence explosion is that of an AI that knows how to do X suddenly getting (much) better at doing X, to a level beyond human capacity. Call this the gain-of-aptitude intelligence explosion. We can prepare for that, maybe, by tracking the AI’s ability level and seeing if it shoots up.
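A minimal sketch of what "tracking the ability level and seeing if it shoots up" could look like, assuming we already have a periodic numeric benchmark score for the AI (itself a big assumption). The function name, the window size and the threshold are illustrative choices, not an established method: the code simply flags any score that sits far above the recent baseline.

```python
from statistics import mean, stdev


def flag_aptitude_jumps(scores, window=5, threshold=3.0, min_spread=1e-9):
    """Return the indices of scores that sit more than `threshold`
    standard deviations above the mean of the preceding `window` scores.

    `scores` is a hypothetical time series of benchmark results for one task.
    """
    flagged = []
    for i in range(window, len(scores)):
        history = scores[i - window:i]
        baseline = mean(history)
        # Floor the spread so a perfectly flat history does not divide by zero.
        spread = max(stdev(history), min_spread)
        if (scores[i] - baseline) / spread > threshold:
            flagged.append(i)
    return flagged


# Toy data: a slow drift, then a sudden jump at index 6.
history = [50, 51, 50, 52, 51, 52, 90, 91]
jumps = flag_aptitude_jumps(history)
```

Note that only the first point of the jump is flagged here: once the high score enters the trailing window, it inflates the baseline and the spread, so a sustained new level quickly looks normal. A real monitor would probably want a fixed reference period as well.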
But the example above hints at another kind of potentially dangerous intelligence explosion: that of a very intelligent but narrow AI that suddenly gains intelligence across other domains. Call this the gain-of-function intelligence explosion. If we’re not looking specifically for it, it may not trigger any warnings, since the AI might still be dumber than the average human in those other domains. But this might be enough, when combined with its narrow superintelligence, to make it deadly. We can’t ignore the toaster that starts babbling.
Isn’t this kind of thing a subset of what the design space of minds post covers? Like, we don’t know exactly what kind of intelligence could end up exploding, and there are lots of different possible variations?
Maybe a form of unit testing could be useful? Create a simple and a not-so-simple test for each of a range of domains, and get all AIs to run them periodically.
By default the narrow AIs would fail even the simple tests in other domains, but we would be able to monitor whether, and how fast, they learn other domains.
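Roughly what I have in mind, as a hedged sketch. Everything here is hypothetical: the "system" is any callable from a prompt string to an answer string, the battery is a dict of toy tests per domain, and `run_battery` / `newly_competent` are names made up for illustration. The point is only that you compare pass rates between runs and look at domains outside the AI's home field.

```python
from typing import Callable, Dict, List

System = Callable[[str], str]      # hypothetical interface: prompt in, answer out
Test = Callable[[System], bool]    # a test returns True if the system passes


def run_battery(system: System, battery: Dict[str, List[Test]]) -> Dict[str, float]:
    """Return the fraction of tests passed in each non-empty domain."""
    return {
        domain: sum(test(system) for test in tests) / len(tests)
        for domain, tests in battery.items()
        if tests
    }


def newly_competent(previous, current, home_domain, threshold=0.5):
    """Domains other than `home_domain` whose pass rate crossed `threshold`
    between the previous run and the current one."""
    return sorted(
        domain
        for domain, rate in current.items()
        if domain != home_domain
        and rate >= threshold
        and previous.get(domain, 0.0) < threshold
    )


# Toy battery: one translation test, two arithmetic tests.
battery = {
    "translation": [lambda s: s("translate: chat") == "cat"],
    "arithmetic": [lambda s: s("2+2") == "4", lambda s: s("3*3") == "9"],
}

# Two stand-in systems: a narrow translator, and a later version that
# has also picked up arithmetic.
narrow = lambda prompt: {"translate: chat": "cat"}.get(prompt, "")
broader = lambda prompt: {"translate: chat": "cat", "2+2": "4", "3*3": "9"}.get(prompt, "")

before = run_battery(narrow, battery)
after = run_battery(broader, battery)
alerts = newly_competent(before, after, home_domain="translation")
```

The threshold is deliberately low: the worry is a narrow AI that becomes merely mediocre elsewhere, so the alert should fire long before it reaches human level in the new domain.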
Another test could be to see if its performance in its select field suddenly jumps. To give a real-world example, when Google (which is the closest thing we have to an AI right now, I think) gained the ability to suggest terms based on what one has already typed, it became much easier to search for things. The same would apply when it eventually gains the ability to parse human language, and so on.
You seem to be saying almost the same thing as in your other post.
The largest part of the threat from a general AI is the idea that it wouldn’t just pursue a goal, it would understand enough about the world to protect its own pursuit of that goal.
A paperclipper which literally has no concept of things like gravity, minds, its own hardware and existence, or living beings, and which has no capacity to understand them nor any drive to expand, might follow instructions too literally, but it’s about as threatening as a roomba-dust-collecting-AI which figures out it can maximise dust picked up by dumping its bag and re-collecting it.
A general AI is only a threat because it’s a potential Super-Machiavellian genius which defends the first goals you program into it to stop you changing them.
We basically already have thin AIs. Just because something can translate a thousand languages doesn’t mean it will suddenly learn to build nuclear weapons in order to take over the world in order to maximise pages translated per hour.
A thin AI is like a blind, obsessive golem idiot savant with severe autism.
Some people might get hurt, but only in the same way that a bulldozer with a brick on the accelerator might hurt people. It’s a screwup, not a species-ending event.
It’s a consequence of that other post’s idea, yes.
General intelligences are more threatening, but I don’t think we can safely dismiss narrower ones in certain positions (e.g. the drug-designing AI in this post: http://lesswrong.com/lw/kte/an_example_of_deadly_nongeneral_ai/ ).
In that example you propose someone giving a thin AI a very general goal, one which would require a lot of general intelligence to even understand.
If you have an AI which understands biochemistry, you’d give it a goal like “design me a protein which binds to this molecule”, not “maximize goodness and minimize badness”.
The only way what you’re proposing would work would be for it to be a general AI with merely human-level abilities in most areas, combined with a small number of areas of extreme expertise. That is not a thin AI or a non-general AI.
It seems the general goal could be cashed out in simple ways, with biochemistry, epidemiology, and a (potentially flawed) measure of “health”.
I think you’re sneaking in a lot with the measure of health. As far as I can see, the only reason it’s dangerous is because it cashes out in the real world, on the real broad population rather than a simulation. Having the AI reason about a drug’s effects on a real-world population definitely seems like a general skill, not a narrow skill.
The question is how difficult it is to jump from the stupid AI to the general AI. Does it require a hundred gradual improvements? Or could just one right improvement in the right situation jump across the whole abyss? Something like taking the “idiot savant golem with severe autism” who cares only about one specific goal, and replacing the goal with “understand everything, and apply this understanding to improving your own functionality”… and suddenly we have the fully general AI.
Remember that compartmentalization exists in human minds, but the world is governed by universal laws. In some sense, “understanding particles” is all you need, plus of course some techniques to overcome computational costs, such as creating and using higher-level models. With the higher-level models, compartmentalization can return, but maybe it would be different for a mind that could not just work within these models but also create and modify them as necessary, as opposed to the human mind, which has some of those levels hardwired, while the other ones always feel a bit “strange”.
Being good at translating a thousand languages is not scary. Being good at modelling a thousand situations probably is.
How would you measure aptitude gain?
There are suggestions, such as using some computable version of the measure AIXI is maximising. Kaj Sotala has a review of methods, currently unpublished, I believe.
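For concreteness, the measure I have in mind is presumably best written in the form of Legg and Hutter's universal intelligence; this is my assumption about which formalisation applies, stated here as a sketch rather than as the method anyone would actually implement:

```latex
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
```

Here $\pi$ is the agent's policy, $E$ is a class of computable environments, $K(\mu)$ is the Kolmogorov complexity of environment $\mu$, and $V^{\pi}_{\mu}$ is the expected total reward $\pi$ obtains in $\mu$. Since $K$ is incomputable, a "computable version" would have to replace it with some approximation and sum over a finite sample of environments.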