Yeah, he means that. “Please don’t work on AGI until you’ve worked out FAI.”
ETA: read Eliezer’s reply.
Not exactly, Thom. Roughly, for FAI you need precise self-modification. For precise self-modification, you need a precise theory of the intelligence doing the self-modification. To get to FAI you have to walk the road that leads to precise theories of intelligence—something like our present-day probability theory and decision theory, but more powerful and general and addressing issues these present theories don’t.
Eurisko is the road of self-modification done in an imprecise, ad-hoc way, throwing together whatever works until it gets smart enough to FOOM. That is a path that leads to shattered planets if followed far enough. No, I’m not saying that Eurisko in particular is far enough; I’m saying that it’s a first step along that path, not the FAI path.
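To make “present-day probability theory and decision theory” concrete, here is a minimal sketch of the kind of lawful calculation those theories already support: a Bayesian update followed by expected-utility maximization. The scenario, probabilities, and utilities are invented for illustration; nothing here addresses the self-modification issues that, as noted above, these theories leave open.

```python
# Toy illustration of present-day probability + decision theory:
# choose the action with highest expected utility under a posterior belief.
# All numbers here are invented for the example.

def posterior(prior: float, likelihood_if_true: float, likelihood_if_false: float) -> float:
    """Bayes' rule for a single binary hypothesis."""
    numerator = prior * likelihood_if_true
    return numerator / (numerator + (1 - prior) * likelihood_if_false)

def expected_utility(p_hypothesis: float, utilities: dict) -> float:
    """Expected utility of an action whose payoff depends on whether the hypothesis holds."""
    return p_hypothesis * utilities["if_true"] + (1 - p_hypothesis) * utilities["if_false"]

# Belief after seeing some evidence.
p = posterior(prior=0.3, likelihood_if_true=0.8, likelihood_if_false=0.1)

# Two candidate actions with hypothetical payoffs.
actions = {
    "act_cautiously": {"if_true": 5.0, "if_false": 4.0},
    "act_boldly":     {"if_true": 10.0, "if_false": -20.0},
}

best = max(actions, key=lambda a: expected_utility(p, actions[a]))
print(round(p, 3), best)
```

The point of the contrast is that no analogous precise framework yet exists for an agent reasoning about modifications to its own decision procedure; the sketch only covers ordinary actions under fixed beliefs and utilities.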
Perhaps a writeup of what you have discovered, or at least surmise, about walking that road would encourage bright young minds to work on those puzzles instead of reimplementing Eurisko.
It’s not immediately clear that studying and playing with specific toy self-referential systems won’t lead to ideas that might apply to precise members of that class.
I’ve written up some of the concepts of precise self-modification, but need to collect the posts on a Wiki page on “lawfulness of intelligence” or something.
Any of these posts ever go up?
Cf. “Lawful intelligence.”
Well, in this sense computing is also a first step on that path, Moore’s law of mad science and all. Eurisko in particular doesn’t seem to deserve more mention than that.
Doesn’t seem to deserve more mention than the creation of computing? Sure. But computing has already been created.
Um, so has Eurisko.
...indeed. It seems that I failed to figure out just what I was arguing against. Let me re-make that point.
As far as first steps along that path go, they have already been taken: we have gone from a world without computers to a world with one, and we can’t reverse that. The logical place to focus our efforts would seem to be the next step which has not been taken, which could very well be reimplementing EURISKO. (Though it could also very well be running a neural net on a supercomputer or some guy making the video game “Operant Conditioning Hero”.)
We have gone from a world without dictators to a world with one, and we can’t reverse that. The logical place to focus our efforts would seem to be the next step which has not been taken, which could very well be resurrecting Hitler.
Seriously? I did not think a discussion of Eurisko could be Godwinned. Bravo.
Grandparent was probably a miscalculation of some sort, but I think mention of Hitler is acceptable if the context is Nazi super science rather than outrage maximization.
Resurrecting Hitler would probably teach us a lot about medicine, actually. If we can generalize the process by which we resurrect Hitler, we could save a lot of lives.
True, if resurrecting Hitler is a good idea and we can cause it to happen; if resurrecting Hitler is inevitable and we can ensure that he ends up being a good guy; or if resurrecting Hitler would be bad and we can prevent it from happening.
Do you suppose that developing a FAI will require at least some experience trying whatever works? I don’t know of any major computer programs that were written entirely before they were first compiled...
Edit: I see SoullessAutomaton has written a very similar comment.
But can we learn anything useful for a complete theory of intelligence from something like EURISKO? Sure, it’s an ad hoc, throw-things-at-the-wall-and-see-what-sticks approach—but so are our brains, and if something like EURISKO can show limited, non-foomy levels of optimization power, it would at least provide another crappy data point, besides vertebrate brains, on how intelligence works.
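For concreteness, here is a deliberately crude sketch of the “throw things at the wall” style of self-modification in question: a loop that randomly perturbs its own heuristic and keeps whatever scores better on a toy task. It is a caricature for illustration only, not a description of Lenat’s EURISKO; the task, scoring, and mutation scheme are all invented.

```python
import random

# Caricature of ad-hoc, self-modifying heuristic search (not Lenat's EURISKO).
# The "heuristic" is a weight vector that proposes candidate bitstrings for a toy task.

random.seed(0)
TARGET = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]   # invented hidden pattern the system must match

def task_score(candidate):
    """Ground-truth performance: how many bits match the hidden target."""
    return sum(c == t for c, t in zip(candidate, TARGET))

def heuristic_guess(weights):
    """The heuristic proposes a candidate: bit i is 1 if weight i is positive."""
    return [1 if w > 0 else 0 for w in weights]

def mutate(weights):
    """Ad-hoc self-modification: randomly perturb one weight and hope it helps."""
    new = list(weights)
    i = random.randrange(len(new))
    new[i] += random.uniform(-1.0, 1.0)
    return new

weights = [random.uniform(-1.0, 1.0) for _ in TARGET]
for _ in range(200):
    candidate = mutate(weights)
    # Keep whatever works; there is no theory of *why* it works.
    if task_score(heuristic_guess(candidate)) >= task_score(heuristic_guess(weights)):
        weights = candidate

print(task_score(heuristic_guess(weights)), "of", len(TARGET), "bits correct")
```

A loop like this does display a limited, non-foomy kind of optimization power, which is all the “crappy data point” suggestion asks for, but nothing in it constrains how its behavior changes as the heuristic is rewritten.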
I used to think it was useful to study ad-hoc attempts at AGI, but it now seems to me that knowledge of these chaotic things is both very likely a dead end, even for destroying the world, and of the wrong character for making progress towards FAI.
I think one of the factors that contributes to interest in ad-hoc techniques is the prospect of a “thrilling discovery”. One is allowed to fantasize that all of their time and effort may pay off suddenly and unpredictably, which makes the research seem that much more fun and exciting. This is in contrast to a more formal approach in which understanding and progress are incremental by their very nature.
I bring this up because I see it as a likely underlying motive for arguments of the form “ad-hoc technique X is worth pursuing even though it’s not a formal approach”.
There are two kinds of scientific progress: the methodical experimentation and categorization which gradually extend the boundaries of knowledge, and the revolutionary leap of genius which redefines and transcends those boundaries. Acknowledging our debt to the former, we yearn nonetheless for the latter. - Academician Prokhor Zakharov, “Address to the Faculty”
Upvoted for the Alpha Centauri reference. God, I love that game!
No, it actually looks (just barely) feasible to get a FOOM out of something ad-hoc, and there are even good reasons for expecting that. But it doesn’t seem to be on the way towards deeper understanding. The best one can hope for is catching it right when FOOMing is imminent and starting to do serious theory then, but the path of blind experimentation doesn’t seem to be the optimal one even towards blind FOOM.
That doesn’t contradict what logi said. It could still be a motive.
It could, but it wouldn’t be an invalid motive, as I (maybe incorrectly) heard implied.
I didn’t mean to imply it was an invalid motive, merely a potential underlying motive. If it is valid in the sense that you mean (and I think it is), that’s just reason to scrutinize such claims even more closely.
What changed your mind?
Starting to think seriously about FAI and studying more rigorous system-modeling techniques and theories changed my mind. There seems to be very little overlap between the wild intuitions of ad-hoc AGI and either the technical challenges of careful inference/simulation or the philosophical issues of formalizing decision theories for an intelligence on overdrive.
Some of the intuitions from thinking about ad-hoc approaches do seem to carry over, but that’s all they are: intuitions. Understanding approaches to more careful modeling, even when they are applicable only to “toy” problems, gives deeper insight than knowledge of a dozen “real projects”. Intuitions gained from ad-hoc work do apply, but only as naive, clumsy caricatures.
Ad hoc AI is like ad hoc aircraft design. It flaps, it’s got wings, it has to fly, right? If we keep trying stuff, we’ll stumble across a wing that works. Maybe it’s the feathers?
Since such aircraft design actually worked, and produced aeroplanes before pure theory-based design, perhaps it’s not the best analogy. [Edit: Unless that was your point]
There are multiple concepts bound up in the potential of ad-hoc. There is Strong AI, Good AI (Strong AI that has a positive effect), and Useful AI (Strong AI that can be used as a prototype or inspiration for Good AI, but that can go paperclip maximizer if allowed to grow). These concepts can be believed to stand in quite different relations to each other.
Your irony suggests that there is no potential for any Strong AI in ad-hoc. Given that stupid evolution managed to get there, I think that with enough brute technological force it’s quite feasible to get to Strong AI via this road.
Many reckless people working on AGI think that Strong AI is likely to also be a Good AI.
My previous position was that ad-hoc gives a good chance (in the near future) of Strong AI that is likely a Useful AI but unlikely a Good AI. My current position is that ad-hoc has a small but decent chance (in the near future) of Strong AI that is unlikely to be either Useful AI or Good AI.
BTW, none of the above classifications are “friendly”.
Good AI is a category containing Friendly AI, but one that doesn’t require the outcome to be precisely right. This separates the more elaborate concept of Friendly AI from the informal concept (requirement) of a good outcome.
I believe the concepts are much closer than they seem; that is, it’s hard to construct an AI that is not precisely Friendly but is still Good.
FAI is about being reliably harmless. Whether the outcome seems good in the short term is tangential. Even a “good” AI ought to be considered unfriendly if it’s opaque to proof—what can you possibly rely upon? No amount of demonstrated good behavior can be trusted. It could be insincere, it could be sincere but fatally misguided, it could have a flaw that will distort its goals after a few recursions. We would be stupid to just run it and see.
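To make that concrete, here is a toy example (both agents and all numbers are entirely hypothetical): two policies that behave identically for as long as anyone actually tests them, one of which defects afterward. No finite record of good behavior distinguishes them; only inspecting, or proving things about, the internals could.

```python
# Toy illustration: behaviorally identical during testing, divergent afterward.
# Both "agents" are hypothetical stand-ins, not models of any real system.

def genuinely_cooperative(step: int) -> str:
    return "cooperate"

def treacherous(step: int, defect_after: int = 1_000_000) -> str:
    # Behaves perfectly until a threshold that no finite test happens to reach.
    return "cooperate" if step < defect_after else "defect"

TEST_STEPS = 10_000  # however long the test, some later threshold can exist

identical_under_test = all(
    genuinely_cooperative(t) == treacherous(t) for t in range(TEST_STEPS)
)
print("indistinguishable during testing:", identical_under_test)  # True
print("behavior at step 2,000,000:", treacherous(2_000_000))      # defect
```

The same point applies to a sincerely mistaken goal system or to a flaw that only surfaces after a few recursions; the example just makes the weakest version of the worry explicit.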
At which point you are starting to think about what it takes to make not just an informally “Good” AI, but an actually Friendly AI.
Right.
That’s what I had in mind, though I didn’t state it explicitly. It’s what I meant by ‘worked out’. It’s clear that you want these things worked out formally, as strongly as being provably Friendly.
I’m still skeptical on the world-destroying. My money’s on chaos to FOOM. Dynamism FTW. But then, I think AGI will come from robots.