“And I heard a voice saying ‘Give up! Give up!’ And that really scared me ’cause it sounded like Ben Kenobi.” (source)
Friendly AI is a humongous damn multi-genius-decade sized problem. The first step is to realize this, and the second step is to find some fellow geniuses and spend a decade or two solving it. If you’re looking for a quick fix you’re out of luck.
The same (albeit to a lesser degree) is fortunately also true of Artificial General Intelligence in general, which is why the hordes of would-be meddling dabblers haven’t killed us all already.
This article (which I happened across today) written by Ben Goertzel should make interesting reading for a would-be AI maker. It details Ben’s experience trying to build an AGI during the dot-com bubble. His startup company, Webmind, Inc., apparently had up to 130 (!) employees at its peak.
According to the article, the AGI was almost completed, and the main reason his effort failed was that the company ran out of money due to the bursting of the bubble. Together with the anthropic principle, this seems to imply that Ben is the person responsible for the stock market crash of 2000.
I was always puzzled why SIAI hired Ben Goertzel to be its research director, and this article only deepens the mystery. If Ben has done an Eliezer-style mind-change since writing that article, I think I’ve missed it.
ETA: Apparently Ben has recently been helping his friend Hugo de Garis build an AI at Xiamen University under a grant from the Chinese government. How do you convince someone to give up building an AGI when your own research director is essentially helping the Chinese government build one?
I was always puzzled why SIAI hired Ben Goertzel to be its research director, and this article only deepens the mystery.
Ben has a PhD, can program, has written books on the subject, and has some credibility. Those kinds of things can help a little if you are trying to get people to give you money in the hope that you will build a superintelligent machine. For more see here:
It has similarly been a general rule with the Singularity Institute that, whatever it is we’re supposed to do to be more credible, when we actually do it, nothing much changes. “Do you do any sort of code development? I’m not interested in supporting an organization that doesn’t develop code” → OpenCog → nothing changes. “Eliezer Yudkowsky lacks academic credentials” → Professor Ben Goertzel installed as Director of Research → nothing changes. The one thing that actually has seemed to raise credibility, is famous people associating with the organization, like Peter Thiel funding us, or Ray Kurzweil on the Board.
I just came across an old post of mine that asked a similar question:
BTW, I still remember the arguments between Eliezer and Ben about Friendliness and Novamente. As late as January 2005, Eliezer wrote:
And if Novamente should ever cross the finish line, we all die. That is what I believe or I would be working for Ben this instant.
I’m curious how that debate was resolved?
From the reluctance of anyone at SIAI to answer this question, I conclude that Ben Goertzel being the Director of Research probably represents the outcome of some internal power struggle/compromise at SIAI, whose terms of resolution included the details of the conflict being kept secret.
What is the right thing to do here? Should we try to force an answer out of SIAI, for example by publicly accusing it of not taking existential risk seriously? That would almost certainly hurt SIAI as a whole, but might strengthen “our” side of this conflict. Does anyone have other suggestions for how to push SIAI in a direction that we would prefer?
The short answer is that Ben and I are both convinced the other is mostly harmless.
Have you updated that in light of the fact that Ben just convinced the Chinese government to start funding AGI? (See my article link earlier in this thread.)
Hugo de Garis is around two orders of magnitude more harmless than Ben.
Update for anyone that comes across this comment: Ben Goertzel recently tweeted that he will be taking over Hugo de Garis’s lab, pending paperwork approval.
http://twitter.com/bengoertzel/status/16646922609
http://twitter.com/bengoertzel/status/16647034503
What about all the other people Ben might help obtain funding for, partly due to his position at SIAI?
And what about the public relations/education aspect? It’s harmless that SIAI appears to not consider AI to be a serious existential risk?
This part was not answered. It may be a question to ask someone other than Eliezer. Or just ask really loudly. That sometimes works too.
The reverse seems far more likely.
I don’t know how to parse that. What do you mean by “the reverse”?
Ben’s position at SIAI may reduce the expected amount of funding he obtains for other existentially risky persons.
How much of this harmlessness is perceived impotence and how much is it an approximately sane way of thinking?
Wholly perceived impotence.
Do you believe the given answer? And if Ben is really that impotent, what do you think it reveals about the SIAI, or about whoever put Ben into a position within the SIAI?
I don’t know enough about his capabilities when it comes to contributing to unfriendly AI research to answer that. Being unable to think sanely about friendliness or risks may have little bearing on your capabilities with respect to AGI research. The modes of thinking have very little bearing on each other.
That they may be more rational and less idealistic than I may otherwise have guessed. There are many potential benefits the SIAI could gain from an affiliation with those inside the higher status AGI communities. Knowing who to know has many uses unrelated to knowing what to know.
Indeed. I read part of this post as implying that his position had at least a little bit to do with gaining status from affiliating with him (“It has similarly been a general rule with the Singularity Institute that, whatever it is we’re supposed to do to be more credible, when we actually do it, nothing much changes. ‘Do you do any sort of code development? I’m not interested in supporting an organization that doesn’t develop code’ → OpenCog → nothing changes. ‘Eliezer Yudkowsky lacks academic credentials’ → Professor Ben Goertzel installed as Director of Research → nothing changes.”).
That’s an impressive achievement! I wonder if they will be able to maintain it? I also wonder whether they will be able to distinguish those times when the objections are solid, not merely something to treat as PR concerns. There is a delicate balance to be found.
Does this suggest that founding a stealth AGI institute (to coordinate conferences and communication between researchers) might be a suitable way to oversee and influence potential undertakings that could lead to imminent high-risk situations?
By the way, I noticed from my server logs that the Institute for Defense Analyses seems to be reading LW. They visited my homepage, referred by my LW profile. So one should think about the consequences of discussing such matters in public, and of not doing so.
Most likely, someone working there just happens to.
Fascinating.
Can we know how you came to that conclusion?
There is one ‘mostly harmless’ for people who you think will fail at AGI. There is an entirely different ‘mostly harmless’ for actually having a research director who tries to make AIs that could kill us all. Why would I not think the SIAI is itself an existential risk if the criteria for director recruitment are so lax? Being absolutely terrified of disaster is the kind of thing that helps ensure appropriate mechanisms to prevent defection are kept in place.
Yes. The SIAI has to convince us that they are mostly harmless.
Phew...I was almost going to call bullshit on this but that would be impolite.
That is an excellent question.
And now for a truly horrible thought:
I wonder to what extent we’ve been “saved” so far by anthropics. Okay, that’s probably not the dominant effect. I mean, yeah, it’s quite clear that AI is, as you note, REALLY hard.
But still, I can’t help but wonder just how much or how little of that is there.
If you think anthropics has saved us from AI many times, you ought to believe we will likely die soon, because anthropics doesn’t constrain the future, only the past. Each passing year without catastrophe should weaken your faith in the anthropic explanation.
The first sentence seems obviously true to me, the second probably false.
My reasoning: to make observations and update on them, I must continue to exist. Hence I expect to make the same observations & updates whether or not the anthropic explanation is true (because I won’t exist to observe and update on AI extinction if it occurs), so observing a “passing year without catastrophe” actually has a likelihood ratio of one, and is not Bayesian evidence for or against the anthropic explanation.
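To make the disagreement concrete, here is a minimal sketch in Python of the two update rules being argued over. The hypotheses, per-year catastrophe probabilities, and 50/50 prior are made-up illustrative assumptions, not figures from the thread: under the ordinary Bayesian rule each quiet year counts against a high-risk hypothesis, while under the survival-conditioned rule described above the posterior never moves.

```python
# Illustrative sketch only. H_risky: catastrophe probability 0.5 per year;
# H_safe: 0.01 per year; both hypotheses start at a prior of 0.5.

def naive_posterior(years, p_risky=0.5, p_safe=0.01, prior_risky=0.5):
    """Ordinary Bayes: surviving N years has probability (1 - p)**N under each
    hypothesis, so each quiet year is evidence against H_risky (cousin_it's rule)."""
    like_risky = (1 - p_risky) ** years
    like_safe = (1 - p_safe) ** years
    return (prior_risky * like_risky) / (
        prior_risky * like_risky + (1 - prior_risky) * like_safe
    )

def survival_conditioned_posterior(years, prior_risky=0.5):
    """Survival-conditioned rule from the comment above: an observer only ever
    sees survival, so the likelihood ratio of another quiet year is 1 and the
    posterior stays equal to the prior."""
    return prior_risky

for n in (1, 5, 20):
    print(n, round(naive_posterior(n), 6), survival_conditioned_posterior(n))
# Naive posterior on H_risky falls toward 0 as quiet years accumulate
# (about 0.34 after one year, 0.03 after five, roughly 1e-06 after twenty);
# the survival-conditioned posterior stays at 0.5 forever.
```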
Wouldn’t the anthropic argument apply just as much in the future as it does now? The world not being destroyed is the only observable result.
The future hasn’t happened yet.
Right. My point was that in the future you are still going to say “wow the world hasn’t been destroyed yet” even if in 99% of alternate realities it was. cousin_it said:
Which shouldn’t be true at all.
If you can not observe a catastrophe happen, then not observing a catastrophe is not evidence for any hypothesis.
“Not observing a catastrophe” != “observing a non-catastrophe”. If I’m playing Russian roulette and I hear a click and survive, I see good reason to take that as extremely strong evidence that there was no bullet in the chamber.
But doesn’t the anthropic argument still apply? Worlds where you survive playing Russian roulette are going to be ones where there wasn’t a bullet in the chamber. You should expect to hear a click when you pull the trigger.
As it stands, I expect to die (p=1/6) if I play Russian roulette. I don’t hear a click if I’m dead.
That’s the point. You can’t observe anything if you are dead, therefore any observations you make are conditional on you being alive.
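For concreteness, here is a small sketch of the two readings of the roulette example, under the purely illustrative assumption (not stated in the thread) that you start out unsure whether the revolver holds one bullet or none: ordinary Bayes treats the click as mild evidence that the gun was empty, whereas the survival-conditioned reading above says the click carries no information, since you could not have observed the alternative.

```python
# Illustrative assumption: prior 0.5 that the six-chamber revolver holds one bullet,
# 0.5 that it is empty. You pull the trigger once, hear a click, and survive.

prior_loaded = 0.5

# Ordinary Bayes: P(click & survive | loaded) = 5/6, P(click & survive | empty) = 1.
p_click_loaded = 5 / 6
p_click_empty = 1.0
posterior_loaded = (prior_loaded * p_click_loaded) / (
    prior_loaded * p_click_loaded + (1 - prior_loaded) * p_click_empty
)
print(posterior_loaded)  # 5/11 ~ 0.455: the click is mild evidence the gun was empty

# Survival-conditioned reading (the comment above): conditional on being alive to
# observe anything, a click was guaranteed either way, so the posterior is the prior.
print(prior_loaded)  # stays at 0.5
```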
Those universes where you die still exist, even if you don’t observe them. If you carry your logic to its conclusion, there would be no risk to playing russian roulette, which is absurd.
The standard excuse given by those who pretend to believe in many worlds is that you are likely to get maimed in the universes where you get shot but don’t die, which is somewhat unpleasant. If you come up with a more reliable way to quantum suicide, like using a nuke, they find another excuse.
Methinks that is still a lack of understanding, or a disagreement on utility calculations. I myself would rate the universes where I die as lower utility still than those where I get injured (indeed the lowest possible utility).
Better still if in all the universes I don’t die.
I do think ‘a disagreement on utility calculations’ may indeed be a big part of it. Are you a total utilitarian? I’m not. A big part of that comes from the fact that I don’t consider two copies of myself to be intrinsically more valuable than one—perhaps instrumentally valuable, if those copies can interact, sync their experiences and cooperate, but that’s another matter. With experience-syncing, I am mostly indifferent to the number of copies of myself that exist (leaving aside potential instrumental benefits), but without it I assign decreasing utility as the number of copies increases, since I place zero terminal value on multiplicity but positive terminal value on the uniqueness of my identity.
My brand of utilitarianism is informed substantially by these preferences. I adhere to neither average nor total utilitarianism, but I lean closer to average. Whilst I would be against the use of force to turn a population of 10 with X utility each into a population of 3 with (X + 1) utility each, I would in isolation consider the latter preferable to the former (there is no inconsistency here—my utility function simply admits information about the past).
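As a concrete illustration of where the two criteria pull apart in the population example above, using a purely hypothetical value for the unspecified X: the larger population wins on total utility, the smaller one on average utility.

```python
# Hypothetical value for the unspecified X in the comment above.
X = 5

pop_a = [X] * 10        # population of 10, each with utility X
pop_b = [X + 1] * 3     # population of 3, each with utility X + 1

total_a, total_b = sum(pop_a), sum(pop_b)                   # 50 vs 18: total utilitarianism prefers A
avg_a, avg_b = total_a / len(pop_a), total_b / len(pop_b)   # 5.0 vs 6.0: average utilitarianism prefers B

print(total_a, total_b, avg_a, avg_b)
```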
That line of thinking leads directly to recommending immediate probabilistic suicide, or at least indifference to it. No thanks.
How so?
I’m saying that you can only observe not dying. Not that you shouldn’t care about universes that you don’t exist in or observe.
The risk in Russian roulette is that, in the worlds where you do survive, you will probably be lobotomized, or drop the gun and shoot someone else, etc. Ignoring that, there is no risk. As long as you don’t care about universes where you die.
Ok. I find this assumption absolutely crazy, but at least I comprehend what you are saying now.
Well think of it this way. You are dead/non-existent in the vast majority of universes as it is.
How is that relevant? If I take some action that results in the death of myself in some other Everett branch, then I have killed a human being in the multiverse.
Think about applying your argument to this universe. You shoot someone in the head, they die instantly, and then you say to the judge “well think of it this way: he’s not around to experience this. besides, there’s other worlds where I didn’t shoot him, so he’s not really dead!”
You can’t appeal to common sense. That’s the point of quantum immortality: it defies our common sense notions about death. Obviously, since we are used to assuming a single-threaded universe, where death is equivalent to ceasing to exist.
Of course, if you kill someone, you still cause that person pain in the vast majority of universes, as well as grieving to their family and friends.
If Star Trek-style teleportation were possible by creating a clone and deleting the original, would that be equivalent to suicide/murder/death? If you could upload your mind to a computer but destroy your biological brain, is that suicide, and is the upload really you? Does destroying copies really matter as long as one lives on (assuming the copies don’t suffer)?
You absolutely appeal to common sense on moral issues. Morality is applied common sense, in the Minsky view of “common sense” as an assortment of deductions and inferences extracted from the tangled web of my personal experiential and computational history. Morality is the result of applying that common-sense knowledge base to possible actions in a planning algorithm.
Quantum “immortality” involves a sudden, unexpected, and unjustified redefinition of “death.” That argument works if you buy the premise. But I don’t.
If you are saying that there is no difference between painlessly, instantaneously killing someone in one branch while letting them live in another, versus letting that person live in both, then I don’t know how to proceed. If you’re going to say that then you might as well make yourself indifferent to the arrow of time as well, in which case it doesn’t matter if that person dies in all branches because he still “exists” in history.
Now I no longer know what we are talking about. According to my morality, it is wrong to kill someone. The existence of other branches where that person does not die does not have even epsilon difference on my evaluation of moral choices in this world. The argument from the other side seems inconsistent to me.
And yes, Star Trek transporters and destructive uploaders are death machines, a position I’ve previously articulated on LessWrong.
You are appealing to a terminal value that I do not share. I think caring about clones is absurd. As long as one copy of me lives, what difference does it make if I create and delete a thousand others? It doesn’t change my experience or theirs. Nothing would change and I wouldn’t even be aware of it.
From my point of view, I do not like the thought that I might be arbitrarily deleted by a clone of myself. I therefore choose to commit to not deleting clones of myself; thus preventing myself from being deleted by any clones that share that commitment.
I don’t think this is quite true (it can redistribute probability between some hypotheses). But this strengthens your position rather than weakening it.
Ok, correct.
Retracted: Not correct. What was I thinking? Just because you don’t observe the universes where the world was destroyed, doesn’t mean those universes don’t exist.
That’s the justification he gave me: he won’t be able to make much of a difference to the subject, so he won’t be generating much risk.
Since he’s going to do it anyway, I was wondering whether there were safer ways of doing so.