List of allusions I managed to catch (part 1):
Alderson starlines—Alderson Drive
Giant Science Vessel—GSV—General Systems Vehicle
Lord Programmer—allusion to the archeologist programmers in Vernor Vinge’s A Fire Upon the Deep?
Greater Archive—allusion to Orion’s Arm’s Greater Archives?
Will Wilkinson said at 50:48:
People will shout at you in Germany if you jaywalk, I’m told.
I can’t say for sure this doesn’t happen anywhere in Germany, but it’s definitely not a universal in German society. Where I live, jaywalking is pretty common and nobody shouts at people for doing it unless they force a driver to brake or swerve by doing so.
I’d be relieved if the reason were that you ascribed probability significantly greater than 1% to a Long Slump, but I suspect it’s because you worry humanity will run out of time in many of the other scenarios before FAI work is finished- reducing you to looking at the Black Swan possibilities within which the world might just be saved.
If this is indeed the reason for Eliezer considering this specific outcome, that would suggest that deliberately depressing the economy is a valid Existential Risk-prevention tactic.
This use of the word ‘wants’ struck me as a distinction Eliezer would make, rather than this character.
Similarly, it’s notable that the AI seems to use exactly the same interpretation of the word “lie” as Eliezer Yudkowsky: that’s why it doesn’t self-describe as an “Artificial Intelligence” until the verthandi uses the phrase.

Also, at the risk of being redundant: Great story.
To add to Abigail’s point: Is there significant evidence that the critically low term in the Drake Equation isn’t f_i (i.e. P(intelligence|life))? If natural selection on Earth hadn’t happened to produce an intelligent species, I would assign a rather low probability to any locally evolved life surviving the local sun going nova. I don’t see any reasonable way of even assigning a lower bound to f_i.
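For reference, the Drake equation as it’s usually written (parameterizations vary slightly by source), with f_i being the fraction of life-bearing planets on which intelligence evolves:

```latex
N = R_{*} \cdot f_{p} \cdot n_{e} \cdot f_{l} \cdot f_{i} \cdot f_{c} \cdot L
```

Since the factors multiply, a near-zero f_i by itself is enough to make N small, regardless of the values of the other terms.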
The of helping someone, …
Missing word?
Okay, so no one gets their driver’s license until they’ve built their own Friendly AI, without help or instruction manuals. Seems to me like a reasonable test of adolescence.
Does this assume that they would be protected from any consequences of messing the Friendliness up and building a UFAI by accident? I don’t see a good solution to this. If people are protected from being eaten by their creations, they can slog through the problem using a trial-and-error approach through however many iterations it takes. If they aren’t, this is going to be one deadly test.
Up to now there never seemed to be a reason to say this, but now that there is:
Eliezer Yudkowsky, afaict you’re the most intelligent person I know. I don’t know John Conway.
It’s easier to say where someone else’s argument is wrong, then to get the fact of the matter right;
Did you mean s/then/than/?

You posted your raw email address needlessly. Yum.

Posting it here didn’t really change anything.

How can you tell if someone is an idiot not worth refuting, or if they’re a genius who’s so far ahead of you to sound crazy to you? Could we think an AI had gone mad, and reboot it, when it is really genius.
You can tell by the effect they have on their environment. If it’s stupid, but it works, it’s not stupid. This can be hard to judge if you don’t know the entity’s precise goals, but in general, if they manage to do interesting things you couldn’t (e.g. making large amounts of money, writing highly useful software, obtaining a cult of followers or converting planets into computronium), they’re probably doing something right.

If you’re considering taking action against the entity (as in your example of deleting the AI), the problem is partly self-regulating: a sufficiently intelligent entity should see such an attack coming and have effective countermeasures in place (for instance, by communicating with you well enough that you don’t conclude it has gone mad). If you attack it and succeed, that by itself places limits on how intelligent the target really was. Note that this part doesn’t work if both sides are unmodified humans, because the relative differences in intelligence aren’t large enough.
Do you really truly think that the rational thing for both parties to do, is steadily defect against each other for the next 100 rounds?
No. That seems obviously wrong, even if I can’t figure out where the error lies.
We only get a reversion to the (D,D) case if we know with a high degree of confidence that the other party doesn’t use naive Tit for Tat, and they know that we don’t. That seems like an iffy assumption to me. If we knew the exact algorithm the other side uses, it would be trivial to find a winning strategy; so how do we know it isn’t naive Tit for Tat? If there’s a sufficiently high chance the other side is using naive Tit for Tat, it might well be optimal to mirror their previous choice through the second-to-last round and defect only on the last one (the sketch below illustrates this).
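A toy calculation of this point, added here for illustration only: the payoff values (T=5, R=3, P=1, S=0), the 100-round length, and the assumption that the opponent is either naive Tit for Tat or a constant defector are all my own choices, not from the original discussion.

```python
# Illustrative sketch: expected payoff over a 100-round Prisoner's Dilemma when
# the opponent plays naive Tit for Tat with probability p and always defects
# otherwise. Payoffs are the standard textbook values.

T, R, P, S = 5, 3, 1, 0   # temptation, mutual cooperation, mutual defection, sucker's payoff
ROUNDS = 100

def payoff(mine, theirs):
    return {("D", "C"): T, ("C", "C"): R, ("D", "D"): P, ("C", "D"): S}[(mine, theirs)]

def match(strategy, opponent):
    """Total payoff for `strategy`; each side sees the round index and the other side's previous move."""
    total, my_last, their_last = 0, None, None
    for rnd in range(ROUNDS):
        mine, theirs = strategy(rnd, their_last), opponent(rnd, my_last)
        total += payoff(mine, theirs)
        my_last, their_last = mine, theirs
    return total

def always_defect(rnd, opp_last):
    return "D"

def naive_tft(rnd, opp_last):
    # Cooperate first, then copy the opponent's previous move.
    return "C" if opp_last is None else opp_last

def tft_defect_last(rnd, opp_last):
    # Mirror the opponent's previous move, but defect on the final round.
    return "D" if rnd == ROUNDS - 1 else naive_tft(rnd, opp_last)

for p in (0.0, 0.01, 0.05):
    for name, strat in (("always defect", always_defect), ("mirror, defect last", tft_defect_last)):
        expected = p * match(strat, naive_tft) + (1 - p) * match(strat, always_defect)
        print(f"P(naive TFT) = {p:.2f}  {name}: expected payoff {expected:.2f}")
```

With these particular payoffs, mirroring until the last round already beats constant defection once the chance of facing naive Tit for Tat exceeds roughly half a percent.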
Definitely defect. Cooperation only makes sense in the iterated version of the PD. This isn’t the iterated case, and there’s no prior communication, hence no chance to negotiate for mutual cooperation (though even if there was, meaningful negotiation may well be impossible depending on specific details of the situation). Superrationality be damned, humanity’s choice doesn’t have any causal influence on the paperclip maximizer’s choice. Defection is the right move.
Nitpicking your poison category:
What is a poison? … Carrots, water, and oxygen are “not poison”. … (… You’re really asking about fatality from metabolic disruption, after administering doses small enough to avoid mechanical damage and blockage, at room temperature, at low velocity.)
If I understand that last definition correctly, it should classify water as a poison: a large enough dose kills through metabolic disruption (fatal hyponatremia) rather than through mechanical damage or blockage.
Doug S.:
What character is ◻?
That’s U+25FB (‘WHITE MEDIUM SQUARE’).

Eliezer Yudkowsky:
Larry, interpret the smiley face as saying:
PA + (◻C → C) |-

I’m still struggling to completely understand this. Are you also changing the meaning of ◻ from ‘derivable from PA’ to ‘derivable from PA + (◻C → C)’? If so, are you additionally changing L to use provability in PA + (◻C → C) instead of provability in PA?
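For context (my addition, not part of the original exchange): the theorem at issue is Löb’s theorem, which, reading ◻C as “C is provable in PA”, can be stated as:

```latex
\text{If } \mathrm{PA} \vdash (\Box C \rightarrow C) \text{, then } \mathrm{PA} \vdash C
\qquad \text{(internalized: } \mathrm{PA} \vdash \Box(\Box C \rightarrow C) \rightarrow \Box C \text{)}
```

Whether the ◻ in the quoted line keeps that reading, or gets reinterpreted relative to the extended theory PA + (◻C → C), is exactly what the question above is asking.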
Quick correction: s/abstract rational reasoning/abstract moral reasoning/
Jadagul:
But my moral code does include such statements as “you have no fundamental obligation to help other people.” I help people because I like to.
While I consider myself an altruist in principle (I have serious akrasia problems in practice), I do agree with this statement. Altruists don’t have any obligation to help people; it just often makes sense for them to do so. Sometimes it doesn’t, and then the proper thing for them is not to do it.

Roko:
In the modern world, people have to make moral choices using their general intelligence, because there aren’t enough “yuck” and “yum” factors around to give guidance on every question. As such, we shouldn’t expect much more moral agreement from humans than from rational (or approximately rational) AIs.
There might not be enough “yuck” and “yum” factors around to offer direct guidance on every question, but they’re still the basis for abstract rational reasoning. Do you think “paperclip optimizer”-type AIs are impossible? If so, why? There’s nothing incoherent about a “maximize the number of paperclips over time” optimization criterion; if anything, it’s a lot simpler than those in use by humans.
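As a throwaway illustration of the “a lot simpler” claim (a sketch I’m adding here; the names are made up and not from any actual system), the whole criterion fits in one line, while a human’s would not:

```python
# Hypothetical toy example: a "world state" and the paperclip maximizer's
# complete optimization criterion. Nothing here comes from a real AI design.
from dataclasses import dataclass

@dataclass
class WorldState:
    paperclips: int
    # ...plus all the other features a human evaluator would also care about...

def paperclip_utility(state: WorldState) -> float:
    """The entire criterion: strictly more paperclips is strictly better."""
    return float(state.paperclips)

# A human-values utility function would instead have to weigh and trade off a
# very large number of interacting terms, most of which we can't yet write down.
```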
Eliezer Yudkowsky:

If I have a value judgment that would not be interpersonally compelling to a supermajority of humankind even if they were fully informed, then it is proper for me to personally fight for and advocate that value judgment, but not proper for me to preemptively build an AI that enforces that value judgment upon the rest of humanity.
I don’t understand this at all. How is building a superintelligent AI not just a (highly effective, if you do it right) special method of personally fighting for your value judgement? Are you saying it’s ok to fight for it, as long as you don’t do it too effectively?
I think my highest goal in life is to make myself happy. Because I’m not a sociopath making myself happy tends to involve having friends and making them happy. But the ultimate goal is me.
If you had a chance to take a pill which would cause you to stop caring about your friends by permanently maxing out that part of your happiness function regardless of whether you had any friends, would you take it?
Do non-psychopaths who, given the chance, would self-modify into psychopaths fall into the same moral reference frame as stable psychopaths?
After all, if the humans have something worth treating as spoils, then the humans are productive and so might be even more useful alive.
Humans depend on matter to survive, and increase entropy by doing so. Matter can be used for storage and computronium, negentropy for fueling computation. Both are limited and valuable resources (assuming physics doesn’t allow for infinite-resource cheats).

I read stuff like this and immediately my mind thinks, “comparative advantage.” The point is that it can be (and probably is) worthwhile for Bob and Bill to trade with each other even if Bob is better at absolutely everything than Bill.
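For readers unfamiliar with the term, a minimal worked example of the claim just quoted (numbers invented purely for illustration): suppose per day Bob can make 10 units of food or 10 tools, while Bill can make 2 units of food or 1 tool, so Bob is absolutely better at both. Their opportunity costs still differ:

```latex
\text{Bob: } 1 \text{ tool costs } 1 \text{ food forgone}, \qquad \text{Bill: } 1 \text{ tool costs } 2 \text{ food forgone}
```

Trading tools for food at any rate between 1 and 2 food per tool therefore leaves both parties better off than working alone, which is why absolute superiority by itself doesn’t end trade.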
Comparative advantage doesn’t matter for powerful AIs at massively different power levels. It exists between some groups of humans because humans don’t differ in intelligence all that much when you consider all of mind design space, and because humans don’t have the means to easily build subservient-to-them minds which are equal in power to them.
What about a situation where Bob can defeat Bill very quickly, take all its resources, and use them to implement a totally-subservient-to-Bob mind which is by itself better at everything Bob cares about than Bill was? Resolving the conflict takes some resources, but leaving Bill to use them a) inefficiently and b) for not-exactly-Bob’s goals might waste (from Bob’s perspective) even more of them in the long run. Also, eliminating Bill means Bob has to worry about one less potential threat that it would otherwise need to keep in check indefinitely.

The FAI may be an unsolvable problem, if by FAI we mean an AI into which certain limits are baked.
You don’t want to build an AI with certain goals and then add on hard-coded rules that prevent it from fulfilling those goals with maximum efficiency. If you pit your own mind against that of the AI, a sufficiently powerful AI will always win that contest. The basic idea behind FAI is to build an AI that genuinely wants good things to happen; you can’t control it after it takes off, so you put your conception of “good” (or an algorithm to compute it) into the original design, and define the AI’s terminal values based on that. Doing this right is an extremely tough technical problem, but why do you believe it may be impossible?
Constant [sorry for getting the attribution wrong in my previous reply] wrote:
We do not know very well how the human mind does anything at all. But that the human mind comes to have preferences that it did not have initially, cannot be doubted.
I do not know whether those changes in opinion indicate changes in terminal values, but it doesn’t really matter for the purposes of this discussion, since humans aren’t (capital-F) Friendly. You definitely don’t want an FAI to unpredictably change its terminal values. Figuring out how to reliably prevent this kind of thing from happening, even in a strongly self-modifying mind (which humans aren’t), is one of the sub-problems of the FAI problem.
Creating a society of AIs in the hope that they’ll prevent each other from doing too much damage isn’t a viable solution to the FAI problem, even in the rudimentary “doesn’t kill all humans” sense. There are various problems with the idea, among them:
Any two AIs are likely to have a much vaster difference in effective intelligence than you could ever find between two humans (for one thing, their hardware may differ far more than any two working human brains do). This likelihood increases further if (at least) some subset of them is capable of strong self-improvement. With enough difference in power, cooperation becomes a losing strategy for the more powerful party.
The AIs might agree that they’d all be better off if they took the matter currently in use by humans for themselves, dividing the spoils among each other.
TGGP wrote:
We’ve been told that a General AI will have power beyond any despot known to history.
Unknown replied:

If that will be then we are doomed. Power corrupts. In theory an AI, not being human, might resist the corruption, but I wouldn’t bet on that. I do not think it is a mere peculiarity of humanity that we are vulnerable to corruption.
A tendency to become corrupt when placed into positions of power is a feature of some minds. Evolutionary psychology explains nicely why humans have evolved this tendency. It also allows you to predict that other intelligent organisms, evolved in a sufficiently similar way, would be likely to have a similar feature.
Humans having this kind of tendency is a predictable result of what their design was optimized to do; as such, their having it doesn’t imply much about minds from a completely different part of mind design space.
What makes you think a human-designed AI would be vulnerable to this kind of corruption?
It’s interesting to note that those oh-so-advanced humans prefer saving children to saving adults, even though there don’t seem to be any limits on natural lifespan anymore.
At our current tech-level this kind of thing can make sense because adults have less lifespan left; but without limits on natural lifespan (or neural degradation because of advanced age) older humans have, on average, had more resources invested into their development—and as such should on average be more knowledgeable, more productive and more interesting people.
It appears to me that the decision to save human children in favor of adults is a result of executing obsolete adaptations as opposed to shutting up and multiplying. I’m surprised nobody seems to have mentioned this yet—am I missing something obvious?