It looks to me as though you’ve focused in on one of the weaker points in XiXiDu’s post rather than engaging with the (logically independent) stronger points.
XiXiDu wants to know why he can trust SIAI instead of Charles Stross. Reading the MWI sequence is supposed to tell him that far more effectively than any cute little sentence I could write. The first thing I need to know is whether he read the sequence and something went wrong, or if he didn’t read the sequence.
Well, you’ve picked the weakest of his points to answer, and I put it to you that it was clearly the weakest.
You are right of course that what does or doesn’t show up in Charles Stross’s writing doesn’t constitute evidence in either direction—he’s a professional fiction author, he has to write for entertainment value regardless of what he may or may not know or believe about what’s actually likely or unlikely to happen.
A better example would be e.g. Peter Norvig, whose credentials are vastly more impressive than yours (or, granted, than mine), and who thinks we need to get at least another couple of decades of progress under our belts before there will be any point in resuming attempts to work on AGI. (Even I’m not that pessimistic.)
If you want to argue from authority, the result of that isn’t just tilted against the SIAI, it’s flat out no contest.
A better example would be e.g. Peter Norvig, whose credentials are vastly more impressive than yours (or, granted, than mine), and who thinks we need to get at least another couple of decades of progress under our belts before there will be any point in resuming attempts to work on AGI. (Even I’m not that pessimistic.)
If this means “until the theory and practice of machine learning is better developed, if you try to build an AGI using existing tools you will very probably fail” it’s not unusually pessimistic at all. “An investment of $X in developing AI theory will do more to reduce the mean time to AI than $X on AGI projects using existing theory now” isn’t so outlandish either. What was the context/cite?
I don’t have the reference handy, but he wasn’t saying let’s spend 20 years of armchair thought developing AGI theory before we start writing any code (I’m sure he knows better than that), he was saying forget about AGI completely until we’ve got another 20 years of general technological progress under our belts.
Those would seem likely to be helpful indeed. Better programming tools might also help, as would additional computing power (not so much because computing power is actually a limiting factor today, as because we tend to scale our intuition about available computing power to what we physically deal with on an everyday basis—which for most of us, is a cheap desktop PC—and we tend to flinch away from designs whose projected requirements would exceed such a cheap PC; increasing the baseline makes us less likely to flinch away from good designs).
Here too, it looks like you’re focusing on a weak aspect of his post rather than engaging him. Nobody who’s smart and has read your writing carefully doubts that you’re uncommonly brilliant and that this gives you more credibility than the other singularitarians. But there are more substantive aspects of XiXiDu’s post which you’re not addressing.
Like what? Why he should believe in exponential growth? When by “exponential” he actually means “fast” and no one at SIAI actually advocates for exponentials, those being a strictly Kurzweilian obsession and not even very dangerous by our standards? When he picks MWI, of all things, to accuse us of overconfidence (not “I didn’t understand that” but “I know something you don’t about how to integrate the evidence on MWI, clearly you folks are overconfident”)? When there’s lots of little things scattered through the post like that (“I’m engaging in pluralistic ignorance based on Charles Stross’s nonreaction”) it doesn’t make me want to plunge into engaging the many different little “substantive” parts, get back more replies along the same line, and recapitulate half of Less Wrong in the process. The first thing I need to know is whether XiXiDu did the reading and the reading failed, or did he not do the reading? If he didn’t do the reading, then my answer is simply, “If you haven’t done enough reading to notice that Stross isn’t in our league, then of course you don’t trust SIAI”. That looks to me like the real issue. For substantive arguments, pick a single point and point out where the existing argument fails on it—don’t throw a huge handful of small “huh?”s at me.
Castles in the air. Your claims are based on long chains of reasoning that you do not write down in a formal style. Is the probability of correctness of each link in that chain of reasoning so close to 1, that their product is also close to 1?
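To make the arithmetic behind that question concrete, here is a minimal sketch; the link count and the per-link probability are purely illustrative assumptions, not estimates of any actual argument, and the links are treated as independent:

```python
# How quickly a chain of individually reliable inferences degrades:
# if each link holds with probability p, an n-link chain holds with p**n.
p_link = 0.95
for n_links in (5, 10, 20):
    p_chain = p_link ** n_links
    print(f"{n_links} links at p={p_link}: whole chain holds with p ~ {p_chain:.2f}")
# prints roughly 0.77, 0.60, 0.36
```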
I can think of a couple of ways you could respond:
Yes, you are that confident in your reasoning. In that case you could explain why XiXiDu should be similarly confident, or why it’s not of interest to you whether he is similarly confident.
It’s not a chain of reasoning, it’s a web of reasoning, and robust against certain arguments being off. If that’s the case, then we lay readers might benefit if you would make more specific and relevant references to your writings depending on context, instead of encouraging people to read the whole thing before bringing criticisms.
Most of the long arguments are concerned with refuting fallacies and defeating counterarguments, which flawed reasoning will always be able to supply in infinite quantity. The key predictions, when you look at them, generally turn out to be antipredictions, and the long arguments just defeat the flawed priors that concentrate probability into anthropomorphic areas. The positive arguments are simple; only defeating complicated counterarguments is complicated.
“Fast AI” is simply “Most possible artificial minds are unlikely to run at human speed, the slow ones that never speed up will drop out of consideration, and the fast ones are what we’re worried about.”
“UnFriendly AI” is simply “Most possible artificial minds are unFriendly, most intuitive methods you can think of for constructing one run into flaws in your intuitions and fail.”
MWI is simply “Schrodinger’s equation is the simplest fit to the evidence”; there are people who think that you should do something with this equation other than taking it at face value, like arguing that gravity can’t be real and so needs to be interpreted differently, and the long arguments are just there to defeat them.
The only argument I can think of that actually approaches complication is about recursive self-improvement, and even there you can say “we’ve got a complex web of recursive effects and they’re unlikely to turn out exactly exponential with a human-sized exponent”, the long arguments being devoted mainly to defeating the likes of Robin Hanson’s argument for why it should be exponential with an exponent that smoothly couples to the global economy.
One problem I have with your argument here is that you appear to be saying that if XiXiDu doesn’t agree with you, he must be stupid (the stuff about low g etc.). Do you think Robin Hanson is stupid too, since he wasn’t convinced?
I haven’t found the text during a two minute search or so, but I think I remember Robin assigning a substantial probability, say, 30% or so, to the possibility that MWI is false, even if he thinks most likely (i.e. the remaining 70%) that it’s true.
Much as you argued in the post about Einstein’s arrogance, there seems to be a small enough difference between a 30% chance of being false and a 90% chance of being false that if the latter would imply that Robin was stupid, the former would imply it too.
Right: in fact he would act as though MWI is certainly false… or at least as though Quantum Immortality is certainly false, which has a good chance of being true given MWI.
Quantum Immortality is certainly false, which has a good chance of being true given MWI.
No! He will act as if Quantum Immortality is a bad choice, which is true even if QI works exactly as described. ‘True’ isn’t the right kind of word to use unless you include a normative conclusion in the description of QI.
Suppose that being shot with the gun cannot possibly have intermediate results: either the gun fails, or he is killed instantly and painlessly.
Also suppose that given that there are possible worlds where he exists, each copy of him only cares about its anticipated experiences, not about the other copies, and that this is morally the right thing to do… in other words, if he expects to continue to exist, he doesn’t care about other copies that cease to exist. This is certainly the attitude some people would have, and we could suppose (for the LCPW) that it is the correct attitude.
Even so, given these two suppositions, I suspect it would not affect his behavior in the slightest, showing that he would be acting as though QI is certainly false, and therefore as though there is a good chance that MWI is false.
each copy of him only cares about its anticipated experiences, not about the other copies, and that this is morally the right thing to do… in other words, if he expects to continue to exist, he doesn’t care about other copies that cease to exist.
But that is crazy and false, and uses ‘copies’ in a misleading way. Why would I assume that?
Even so, given these two suppositions, I suspect it would not affect his behavior in the slightest, showing that he would be acting as though QI is certainly false,
This ‘least convenient possible world’ is one in which Robin’s values are changed according to your prescription but his behaviour is not, ensuring that your conclusion is true. That isn’t the purpose of inconvenient worlds (kind of the opposite...)
and therefore as though there is a good chance that MWI is false.
Not at all. You are conflating “MWI is false” with a whole different set of propositions. MWI != QS.
Many people in fact have those values and opinions, and nonetheless act in the way I mention (and there is no one who does not so act) so it is quite reasonable to suppose that even if Robin’s values were so changed, his behavior would remain unchanged.
The very reason Robin was brought up (by you, I might add) was to serve as a reductio ad absurdum with respect to intellectual disrespect.
One problem I have with your argument here is that you appear to be saying that if XiXiDu doesn’t agree with you, he must be stupid (the stuff about low g etc.). Do you think Robin Hanson is stupid too, since he wasn’t convinced?
In the Convenient World where Robin is, in fact, too stupid to correctly tackle the concept of QS, understand the difference between MWI and QI, or form a sophisticated understanding of his moral intuitions with respect to quantum uncertainty, this Counterfactual-Stupid-Robin is a completely useless example.
I can imagine two different meanings for “not convinced about MWI”:
1. It refers to someone who is not convinced that MWI is as good as any other model of reality, and better than most.
2. It refers to someone who is not convinced that MWI describes the structure of reality.
If we are meant to understand the meaning as #1, then it may well indicate that someone is stupid. Though, more charitably, it might more likely indicate that he is ignorant.
If we are meant to understand the meaning as #2, then I think that it indicates someone who is not entrapped by the Mind Projection Fallacy.
What do you mean by belief in MWI? What sort of experiment could settle whether MWI is true or not?
I suspect that a lot of people object to the stuff tacitly built on MWI (copies of humans, other worlds we should care about, hypotheses about consciousness) rather than to MWI itself.
First, the links say that MWI needs a linear quantum theory, and therefore list linearity among its predictions. However, linearity is part of quantum theory and its mathematical formalism, and nothing specific to MWI. Also, weak non-linearity would be explicable in the language of MWI by saying that the different worlds interact a little. I don’t see how testing the superposition principle establishes MWI. Very weak evidence at best.
Second, there is a very confused paragraph about quantum gravity, which, apart from linking to itself, states only that MWI requires gravity to be quantised (without supporting argument) and that therefore, if gravity is successfully quantised, it forms evidence for MWI. However, nobody doubts that gravity has to be quantised somehow, not even hardcore Copenhageners.
The most interesting part is the one about the reversible measurement done by an artificial intelligence. As I understand it, it supposes that we construct a machine which could perform measurements in the reversed direction of time, for which it has to be immune to quantum decoherence. It sounds interesting, but it is also suspicious. I see no way we could get the information into our brains without decoherence. The argument apparently tries to circumvent this objection by postulating an AI which is reversible and decoherence-immune, but the AI will still face the same problem when trying to tell us the results. In fact, postulating the need for an AI here seems to be only a tool to make the proposed experiment more obscure and difficult to analyse. We will have a “reversible AI”, therefore miraculously we will detect differences between Copenhagen and MWI.
However, at least there is a link to Deutsch’s article which hopefully explains the experiment in greater detail, so I will read it and edit the comment later.
“Many-worlds is often referred to as a theory, rather than just an interpretation, by those who propose that many-worlds can make testable predictions (such as David Deutsch) or is falsifiable (such as Everett) or by those who propose that all the other, non-MW interpretations, are inconsistent, illogical or unscientific in their handling of measurements”
None of the tests in that FAQ look to me like they could distinguish MWI from MWI+worldeater. The closest thing to an experimental test I’ve come up with is the following:
Flip a quantum coin. If heads, copy yourself once, advance both copies enough to observe the result, then kill one of the copies. If tails, do nothing.
In a many-worlds interpretation of QM, from the perspective of the experimenter, the coin will be heads with probability 2⁄3, since there are two observers in that case and only one if the coin was tails. In the single-world case, the coin will be heads with probability 1⁄2. So each time you repeat the experiment, you get 0.4 bits of evidence for or against MWI. Unfortunately, this evidence is also non-transferrable; someone else can’t use your observation as evidence the same way you can. And getting enough evidence for a firm conclusion involves a very high chance of subjective death (though it is guaranteed that exactly one copy will be left behind). And various quantum immortality hypotheses screw up the experiment, too.
So it is testable in principle, but the experiment involved is more odious than one would imagine possible.
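As a rough check of the figures in the comment above, taking its anthropic-counting assumptions at face value (probability 2⁄3 of observing heads under many-worlds, 1⁄2 under a single world), here is the evidence calculation in bits:

```python
# Evidence in bits is the log2 of the likelihood ratio between the two hypotheses.
from math import log2

p_heads_mwi = 2 / 3      # two surviving observers see heads, one sees tails
p_heads_single = 1 / 2   # ordinary fair quantum coin in a single world

bits_if_heads = log2(p_heads_mwi / p_heads_single)              # ~ +0.415 bits toward MWI
bits_if_tails = log2((1 - p_heads_mwi) / (1 - p_heads_single))  # ~ -0.585 bits away from it

print(f"heads: {bits_if_heads:+.3f} bits, tails: {bits_if_tails:+.3f} bits")
```

So the quoted "0.4 bits" matches the heads outcome; under the same assumptions, a tails outcome would be somewhat stronger evidence in the other direction.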
The math works the same in all interpretations, but some experiments are difficult to understand intuitively without the MWI. I usually give people the example of the Elitzur-Vaidman bomb tester where the easy MWI explanation says “we know the bomb works because it exploded in another world”, but other interpretations must resort to clever intellectual gymnastics.
If all interpretations are equivalent with respect to testable outcomes, what makes belief in any particular interpretation so important? Ease of intuitive understanding is a dangerous criterion to rely on, and a relative thing too. Some people are more ready to accept mental gymnastics than the existence of other worlds.
Well, that depends. Have you actually tried to do the mental gymnastics and explain the linked experiment using the Copenhagen interpretation? I suspect that going through with that may influence your final opinion.
Have you actually tried to do the mental gymnastics and explain the linked experiment [the Elitzur-Vaidman bomb tester] using the Copenhagen interpretation?
Maybe I’m missing something, but how exactly does this experiment challenge the Copenhagen interpretation more than the standard double-slit stuff? Copenhagen treats “measurement” as a fundamental and irreducible process and measurement devices as special components in each experiment—and in this case it simply says that a dud bomb doesn’t represent a measurement device, whereas a functioning one does, so that they interact with the photon wavefunction differently. The former leaves it unchanged, while the latter collapses it to one arm of the interferometer—either its own, in which case it explodes, or the other one, in which case it reveals itself as a measurement device just by the act of collapsing.
As far as I understand, this would be similar to the standard variations on the double-slit experiment where one destroys the interference pattern by placing a particle detector at the exit from one of the holes. One could presumably do a similar experiment with a detector that might be faulty, and conclude that an interference-destroying detector works even if it doesn’t flash when several particles are let through (in cases where they all happen to go through the other hole). Unless I’m misunderstanding something, this would be a close equivalent of the bomb test.
The final conclusion in the bomb test is surely more spectacular, but I don’t see how it produces any extra confusion for Copenhageners compared to the most basic QM experiments.
Frankly, I don’t know what you consider an explanation here. I am quite comfortable with the prediction which the theory gives, and accept that as an explanation. So I never needed mental gymnastics here. The experiment is weird, but it doesn’t seem any less weird to me to say that the information about the bomb’s functionality came from its explosion in the other world.
Your claims are only anti-predictions relative to science-fiction notions of robots as metal men.
Most possible artificial minds are neither Friendly nor unFriendly (unless you adopt such a stringent definition of mind that artificial minds are not going to exist in my lifetime or yours).
Fast AI (along with most of the other wild claims about what future technology will do, really) falls afoul of the general version of Amdahl’s law. (On which topic, did you ever update your world model when you found out you were mistaken about the role of computers in chip design?)
About MWI, I agree with you completely, though I am more hesitant to berate early quantum physicists for not having found it obvious. For a possible analogy: what do you think of my resolution of the Anthropic Trilemma?
This is quite helpful, and suggests that what I wanted is not a lay-reader summary, but an executive summary.
I brought this up elsewhere in this thread, but the fact that quantum mechanics and gravity are not reconciled suggests that even Schrodinger’s equation does not fit the evidence. The “low-energy” disclaimer one has to add is very weird, maybe weirder than any counterintuitive consequences of quantum mechanics.
I brought this up elsewhere in this thread, but the fact that quantum mechanics and gravity are not reconciled suggests that even Schrodinger’s equation does not fit the evidence. The “low-energy” disclaimer one has to add is very weird, maybe weirder than any counterintuitive consequences of quantum mechanics.
It’s not the Schrödinger equation alone that gives rise to decoherence and thus many-worlds. (Read Good and Real for another toy model, the “quantish” system.) The EPR experiment and Bell’s inequality can be made to work on macroscopic scales, so we know that whatever mathematical object the universe will turn out to be, it’s not going to go un-quantum on us again: it has the same relevant behavior as the Schrödinger equation, and accordingly MWI will be the best interpretation there as well.
“There is no intangible stuff of goodness that you can divorce from life and love and happiness in order to ask why things like that are good. They are simply what you are talking about in the first place when you talk about goodness.”
And then the long arguments are about why your brain makes you think anything different.
This is less startling than your more scientific pronouncements. Are there any atheists reading this that find this (or at first found this) very counterintuitive or objectionable?
I would go further, and had the impression from somewhere that you did not go that far. Is that accurate?
I’m a cognitivist. Sentences about goodness have truth values after you translate them into being about life and happiness etc. As a general strategy, I make the queerness go away, rather than taking the queerness as a property of a thing and using it to deduce that thing does not exist; it’s a confusion to resolve, not an existence to argue over.
No, nothing, and because while religion does contain some confusion, after you eliminate the confusion you are left with claims that are coherent but false.
Morality is a specific set of values (or, more precisely, a specific algorithm/dynamic for judging values). Humans happen to be (for various reasons) the sort of beings that value morality as opposed to valuing, say, maximizing paperclip production. It is indeed objectively better (by which we really mean “more moral”/“the sort of thing we should do”) to be moral than to be paperclipish. And indeed we should be moral, where by “should” we mean “more moral”.
(And moral, when we actually cash out what we mean by it, seems to translate to a complicated blob of values like happiness, love, creativity, novelty, self determination, fairness, life (as in protecting thereof), etc...)
It may appear that paperclip beings and moral beings disagree about something, but not really. The paperclippers, once they’ve analyzed what humans actually mean by “moral”, would agree: “yep, humans are more moral than us. But who cares about this morality stuff, it doesn’t maximize paperclips!”
Of course, screw the desires of the paperclippers, after all, they’re not actually moral. We really are objectively better (once we think carefully by what we mean by “better”) than them.
(Note: “does something or does something not actually do a good job of fulfilling a certain value?” is an objective question. I.e., “does a particular action tend to increase the expected number of paperclips?” (on the paperclipper side) or, on our side, stuff like “does a particular action tend to save more lives, increase happiness, increase fairness, add novelty...?” is an objective question, in that we can extract specific meaning from it and can objectively (in a way the paperclippers would agree with) judge it. It simply happens to be that we’re the sorts of beings that actually care about the answer (as we should be), while the screwy hypothetical paperclippers are immoral and only care about paperclips.)
How’s that, does that make sense? Or, to summarize the summary: “Morality is objective, and we humans happen to be the sorts of beings that value morality, as opposed to valuing something else instead.”
1. a specific algorithm/dynamic for judging values, or
2. a complicated blob of values like happiness, love, creativity, novelty, self determination, fairness, life (as in protecting thereof), etc.?
If it’s 1, can we say something interesting and non-trivial about the algorithm, besides the fact that it’s an algorithm? In other words, everything can be viewed as an algorithm, but what’s the point of viewing morality as an algorithm?
If it’s 2, why do we think that two people on opposite sides of the Earth are referring to the same complicated blob of values when they say “morality”? I know the argument about the psychological unity of humankind (not enough time for significant genetic divergence), but what about cultural/memetic evolution?
I’m guessing the answer to my first question is something like, morality is an algorithm whose current “state” is a complicated blob of values like happiness, love, … so both of my other questions ought to apply.
If it’s 2, why do we think that two people on opposite sides of the Earth are referring to the same complicated blob of values when they say “morality”? I know the argument about the psychological unity of humankind (not enough time for significant genetic divergence), but what about cultural/memetic evolution?
You don’t even have to do any cross-cultural comparisons to make such an argument. Considering the insights from modern behavioral genetics, individual differences within any single culture will suffice.
There is no reason to be at all tentative about this. There’s tons of cog sci data about what people mean when they talk about morality. It varies hugely (but predictably) across cultures.
Why are you using algorithm/dynamic here instead of function or partial function? (On what space, I will ignore that issue, just as you have...) Is it supposed to be stateful? I’m not even clear what that would mean. Or is function what you mean by #2? I’m not even really clear on how these differ.
You might have gotten confused because I quoted Psy-Kosh’s phrase “specific algorithm/dynamic for judging values” whereas Eliezer’s original idea I think was more like an algorithm for changing one’s values in response to moral arguments. Here are Eliezer’s own words:
I would say, by the way, that the huge blob of a computation is not just my present terminal values (which I don’t really have - I am not a consistent expected utility maximizer); the huge blob of a computation includes the specification of those moral arguments, those justifications, that would sway me if I heard them.
Others have pointed out that this definition is actually quite unlikely to be coherent: people would be likely to be ultimately persuaded by different moral arguments and justifications if they had different experiences and heard arguments in different orders etc.
Others have pointed out that this definition is actually quite unlikely to be coherent
Yes, see here for an argument to that effect by Marcello and subsequent discussion about it between Eliezer and myself.
I think the metaethics sequence is probably the weakest of Eliezer’s sequences on LW. I wonder if he agrees with that, and if so, what he plans to do about this subject for his rationality book.
I think the metaethics sequence is probably the weakest of Eliezer’s sequences on LW. I wonder if he agrees with that, and if so, what he plans to do about this subject for his rationality book.
This is somewhat of a concern given Eliezer’s interest in Friendliness!
As far as I can understand, Eliezer has promoted two separate ideas about ethics: defining personal morality as a computation in the person’s brain rather than something mysterious and external, and extrapolating that computation into smarter creatures. The former idea is self-evident, but the latter (and, by extension, CEV) has received a number of very serious blows recently. IMO it’s time to go back to the drawing board. We must find some attack on the problem of preference, latch onto some small corner, that will allow us to make precise statements. Then build from there.
defining personal morality as a computation in the person’s brain rather than something mysterious and external
But I don’t see how that, by itself, is a significant advance. Suppose I tell you, “mathematics is a computation in a person’s brain rather than something mysterious and external”, or “philosophy is a computation in a person’s brain rather than something mysterious and external”, or “decision making is a computation in a person’s brain rather than something mysterious and external” how much have I actually told you about the nature of math, or philosophy, or decision making?
This makes sense in that it is coherent, but it is not obvious to me what arguments would be marshaled in its favor. (Yudkowsky’s short formulations do point in the direction of their justifications.) Moreover, the very first line, “morality is a specific set of values,” and even its parenthetical expansion (algorithm for judging values), seems utterly preposterous to me. The controversies between human beings about which specific sets of values are moral, at every scale large and small, are legendary beyond cliche.
The controversies between human beings about which specific sets of values are moral, at every scale large and small, are legendary beyond cliche.
It is a common thesis here that most humans would ultimately have the same moral judgments if they were in full agreement about all factual questions and were better at reasoning. In other words, human brains have a common moral architecture, and disagreements are at the level of instrumental, rather than terminal, values and result from mistaken factual beliefs and reasoning errors.
You may or may not find that convincing (you’ll get to the arguments regarding that if you’re reading the sequences), but assuming that is true, then “morality is a specific set of values” is correct, though vague: more precisely, it is a very complicated set of terminal values, which, in this world, happens to be embedded solely in a species of minds who are not naturally very good at rationality, leading to massive disagreement about instrumental values (though most people do not notice that it’s about instrumental values).
It is a common thesis here that most humans would ultimately have the same moral judgments if they were in full agreement about all factual questions and were better at reasoning. In other words, human brains have a common moral architecture, and disagreements are at the level of instrumental, rather than terminal, values and result from mistaken factual beliefs and reasoning errors.
It is? That’s a worry. Consider this a +1 for “That thesis is totally false and only serves signalling purposes!”
I… think it is. Maybe I’ve gotten something terribly wrong, but I got the impression that this is one of the points of the complexity of value and metaethics sequences, and I seem to recall that it’s the basis for expecting humanity’s extrapolated volition to actually cohere.
I seem to recall that it’s the basis for expecting humanity’s extrapolated volition to actually cohere.
This whole area isn’t covered all that well (as Wei noted). I assumed that CEV would rely on solving an implicit cooperation problem between conflicting moral systems. It doesn’t appear at all unlikely to me that some people are intrinsically selfish to some degree and their extrapolated volitions would be quite different.
Note that I’m not denying that some people present (or usually just assume) the thesis you present. I’m just glad that there are usually others who argue against it!
It is a common thesis here that most humans would ultimately have the same moral judgments if they were in full agreement about all factual questions and were better at reasoning.
Maybe it’s true if you also specify “if they were fully capable of modifying their own moral intuitions.” I have an intuition (an unexamined belief? a hope? a sci-fi trope?) that humanity as a whole will continue to evolve morally and roughly converge on a morality that resembles current first-world liberal values more than, say, Old Testament values. That is, it would converge, in the limit of global prosperity and peace and dialogue, and assuming no singularity occurs and the average lifespan stays constant. You can call this naive if you want to; I don’t know whether it’s true. It’s what I imagine Eliezer means when he talks about “humanity growing up together”.
This growing-up process currently involves raising children, which can be viewed as a crude way of rewriting your personality from scratch, and excising vestiges of values you no longer endorse. It’s been an integral part of every culture’s moral evolution, and something like it needs to be part of CEV if it’s going to actually converge.
It is a common thesis here that most humans would ultimately have the same moral judgments if they were in full agreement about all factual questions and were better at reasoning.
That’s not plausible. That would be some sort of objective morality, and there is no such thing. Humans have brains, and brains are complicated. You can’t have them imply exactly the same preference.
Now, the non-crazy version of what you suggest is that preferences of most people are roughly similar, that they won’t differ substantially in major aspects. But when you focus on detail, everyone is bound to want their own thing.
It makes sense in its own terms, but it leaves the unpleasant implication that morality differs greatly between humans, at both individual and group level—and if this leads to a conflict, asking who is right is meaningless (except insofar as everyone can reach an answer that’s valid only for himself, in terms of his own morality).
So if I live in the same society with people whose morality differs from mine, and the good-fences-make-good-neighbors solution is not an option, as it often isn’t, then who gets to decide whose morality gets imposed on the other side? As far as I see, the position espoused in the above comment leaves no other answer than “might is right.” (Where “might” also includes more subtle ways of exercising power than sheer physical coercion, of course.)
...and if this leads to a conflict, asking who is right is meaningless (except insofar as everyone can reach an answer that’s valid only for himself, in terms of his own morality).
So if I live in the same society with people whose morality differs from mine, and the good-fences-make-good-neighbors solution is not an option, as it often isn’t, then who gets to decide whose morality gets imposed on the other side?
That two people mean different things by the same word doesn’t make all questions asked using that word meaningless, or even hard to answer.
If by “castle” you mean “a fortified structure”, while I mean “a fortified structure surrounded by a moat”, who will be right if we’re asked if the Chateau de Gisors is a castle? Any confusion here is purely semantic in nature. If you answer yes and I answer no, we won’t have given two answers to the same question, we’ll have given two answers to two different questions. If Psy-Kosh says that the Chateau de Gisors is a fortified structure but it is not surrounded by a moat, he’ll have answered both our questions.
Now, once this has been clarified, what would it mean to ask who gets to decide whose definition of ‘castle’ gets imposed on the other side? Do we need a kind of meta-definition of castle to somehow figure out what the one true definition is? If I could settle this issue by exercising power over you, would it change the fact that the Chateau de Gisors is not surrounded by a moat? If I killed everyone who doesn’t mean the same thing by the word ‘castle’ than I do, would the sentence “a fortified structure” become logically equivalent to the sentence “a fortified structure surrounded by a moat”?
In short, substituting the meaning of a word for the word tends to make lots of seemingly difficult problems become laughably easy to solve. Try it.
*blinks* how did I imply that morality varies? I thought (was trying to imply) that morality is an absolute standard and that humans simply happen to be the sort of beings that care about the particular standard we call “morality”. (Well, with various caveats like not being sufficiently reflective to be able to fully explicitly state our “morality algorithm”, nor do we fully know all its consequences)
However, when humans and paperclippers interact, well, there will probably be some sort of fight if one doesn’t end up with some sort of PD cooperation or whatever. It’s not that paperclippers and humans disagree on anything; it’s simply, well, they value paperclips a whole lot more than lives. We’re sort of stuck with having to act in a way to prevent the hypothetical them from acting on that.
(of course, the notion that most humans seem to have the same underlying core “morality algorithm”, just disagreeing on the implications or such, is something to discuss, but that gets us out of executive summary territory, no?)
(of course, the notion that most humans seem to have the same underlying core “morality algorithm”, just disagreeing on the implications or such, is something to discuss, but that gets us out of executive summary territory, no?)
I would say that it’s a crucial assumption, which should be emphasized clearly even in the briefest summary of this viewpoint. It is certainly not obvious, to say the least. (And, for full disclosure, I don’t believe that it’s a sufficiently close approximation of reality to avoid the problem I emphasized above.)
Hrm, fair enough. I thought I’d effectively implied it, but apparently not sufficiently.
(Incidentally… you don’t think it’s a close approximation to reality? Most humans seem to value (to various extents) happiness, love, (at least some) lives, etc… right?)
Different people (and cultures) seem to put very different weights on these things.
Here’s an example:
You’re a government minister who has to decide who to hire to do a specific task. There are two applicants. One is your brother, who is marginally competent at the task. The other is a stranger with better qualifications who will probably be much better at the task.
The answer is “obvious.”
In some places, “obviously” you hire your brother. What kind of heartless bastard won’t help out his own brother by giving him a job?
In others, “obviously” you should hire the stranger. What kind of corrupt scoundrel abuses his position by hiring his good-for-nothing brother instead of the obviously superior candidate?
Okay, I can see how XiXiDu’s post might come across that way. I think I can clarify what I think that XiXiDu is trying to get at by asking some better questions of my own.
1. What evidence has SIAI presented that the Singularity is near?
2. If the Singularity is near then why has the scientific community missed this fact?
3. What evidence has SIAI presented for the existence of grey goo technology?
4. If grey goo technology is feasible then why has the scientific community missed this fact?
5. Assuming that the Singularity is near, what evidence is there that SIAI has a chance to lower global catastrophic risk in a nontrivial way?
6. What evidence is there that SIAI has room for more funding?
“Near”? Where’d we say that? What’s “near”? XiXiDu thinks we’re Kurzweil?
What kind of evidence would you want aside from a demonstrated Singularity?
Grey goo? Huh? What’s that got to do with us? Read Nanosystems by Eric Drexler or Freitas on “global ecophagy”. XiXiDu thinks we’re Foresight?
If this business about “evidence” isn’t a demand for particular proof, then what are you looking for besides not-further-confirmed straight-line extrapolations from inductive generalizations supported by evidence?
“Near”? Where’d we say that? What’s “near”? XiXiDu thinks we’re Kurzweil?
You’ve claimed, in your Bloggingheads diavlog with Scott Aaronson, that you think it’s pretty obvious that there will be an AGI within the next century. As far as I know you have not offered a detailed description of the reasoning that led you to this conclusion that can be checked by others.
I see this as significant for the reasons given in my comment here.
Grey goo? Huh? What’s that got to do with us? Read Nanosystems by Eric Drexler or Freitas on “global ecophagy”. XiXiDu thinks we’re Foresight?
I don’t know what the situation is with SIAI’s position on grey goo—I’ve heard people say the SIAI staff believe in nanotechnology having capabilities out of line with the beliefs of the scientific community, but they may have been misinformed. So let’s forget about questions 3 and 4.
You’ve claimed, in your Bloggingheads diavlog with Scott Aaronson, that you think it’s pretty obvious that there will be an AGI within the next century.
You’ve shifted the question from “is SIAI on balance worth donating to” to “should I believe everything Eliezer has ever said”.
I don’t know what the situation is with SIAI’s position on grey goo—I’ve heard people say the SIAI staff believe in nanotechnology having capabilities out of line with the beliefs of the scientific community, but they may have been misinformed.
The point is that grey goo is not relevant to SIAI’s mission (apart from being yet another background existential risk that FAI can dissolve). “Scientific community” doesn’t normally professionally study (far) future technological capabilities.
My whole point about grey goo has been, as stated, that a possible superhuman AI could use it to do really bad things. That is, I do not see how an encapsulated AI, even a superhuman AI, could pose the stated risks without the use of advanced nanotechnology. Is it going to use nukes, like Skynet? Another question related to SIAI, regarding advanced nanotechnology, is whether superhuman AI is at all possible without advanced nanotechnology.
I’m shocked how you people misinterpreted my intentions there.
Grey goo is only a potential danger in its own right because it’s a way dumb machinery can grow in destructive power (you don’t need to assume AI controlling it for it to be dangerous, at least so goes the story). AGI is not dumb, so it can use something more fitting to precise control than grey goo (and correspondingly more destructive and feasible).
The grey goo example was named to exemplify the speed and sophistication of nanotechnology that would have to be around to either allow an AI to be built in the first place or be of considerable danger.
I consider your comment an expression of personal disgust. No way you could possibly misinterpret my original point and subsequent explanation to this extent.
The grey goo example was named to exemplify the speed and sophistication of nanotechnology that would have to be around to either allow an AI to be built in the first place or be of considerable danger.
As katydee pointed out, if for some strange reason grey goo is what AI would want, AI will invent grey goo. If you used “grey goo” to refer to the rough level of technological development necessary to produce grey goo, then my comments missed that point.
I consider your comment an expression of personal disgust. No way you could possibly misinterpret my original point and subsequent explanation to this extent.
Illusion of transparency. Since the general point about nanotech seems equally wrong to me, I couldn’t distinguish between the error of making it and making a similarly wrong point about the relevance of grey goo in particular. In general, I don’t plot, so take my words literally. If I don’t like something, I just say so, or keep silent.
If it seems equally wrong, why haven’t you pointed me to some further reasoning on the topic regarding the feasibility of AGI without advanced (grey goo level) nanotechnology? Why haven’t you argued about the dangers of AGI which is unable to make use of advanced nanotechnology? I was inquiring about these issues in my original post and not trying to argue against the scenarios in question.
Yes, I’ve seen the comment regarding the possible invention of advanced nanotechnology by AGI. If AGI needs something that isn’t there it will just pull it out of its hat. Well, I have my doubts that even a superhuman AGI can steer the development of advanced nanotechnology so that it can gain control of it. Sure, it might solve the problems associated with it and send the solutions to some researcher. Then it could buy the stocks of the subsequent company involved with the new technology and somehow gain control...well, at this point we are already deep into subsequent reasoning about something shaky that at the same time is used as evidence of the very reasoning involving it.
To the point: if AGI can’t pose a danger, because its hands are tied, that’s wonderful! Then we have more time to work on FAI. FAI is not about superpowerful robots; it’s about technically understanding what we want, and using that understanding to automate the manufacturing of goodness. The power is expected to come from unbounded automatic goal-directed behavior, something that happens without humans in the system to ever stop the process if it goes wrong.
Overall I’d feel a lot more comfortable if you just said “there’s a huge amount of uncertainty as to when existential risks will strike and which ones will strike, I don’t know whether or not I’m on the right track in focusing on Friendly AI or whether I’m right about when the Singularity will occur, I’m just doing the best that I can.”
This is largely because of the issue that I raise here.
I should emphasize that I don’t think that you’d ever knowingly do something that raised existential risk, I think that you’re a kind and noble spirit. But I do think I’m raising a serious issue which you’ve missed.
If this business about “evidence” isn’t a demand for particular proof, then what are you looking for besides not-further-confirmed straight-line extrapolations from inductive generalizations supported by evidence?
I am looking for the evidence in “supported by evidence”. I am further trying to figure how you anticipate your beliefs to pay rent, what you anticipate to see if explosive recursive self-improvement is possible, and how that belief could be surprised by data.
If you just say, “I predict we will likely be wiped out by badly done AI.”, how do you expect to update on evidence? What would constitute such evidence?
To put my own spin on XiXiDu’s questions: What quality or position does Charles Stross possess that should cause us to leave him out of this conversation (other than the quality ‘Eliezer doesn’t think he should be mentioned’)?
What stronger points are you referring to? It seems to me XiXiDu’s post has only 2 points, both of which Eliezer addressed:
“Given my current educational background and knowledge I cannot differentiate LW between a consistent internal logic, i.e. imagination or fiction and something which is sufficiently based on empirical criticism to provide a firm substantiation of the strong arguments for action that are proclaimed on this site.”
His smart friends/favorite SF writers/other AI researchers/other Bayesians don’t support SIAI.
My point is that your evidence has to stand up to whatever estimations you come up with. My point is the missing transparency in your decision making regarding the possibility of danger posed by superhuman AI. My point is that any form of external peer review is missing and that therefore I either have to believe you or learn enough to judge all of your claims myself after reading hundreds of posts and thousands of documents to find some pieces of evidence hidden beneath. My point is that competition is necessary, that not just the SIAI should work on the relevant problems. There are many other points you seem to be missing entirely.
That one’s easy: We’re doing complex multi-step extrapolations argued to be from inductive generalizations themselves supported by the evidence, which can’t be expected to come with experimental confirmation of the “Yes, we built an unFriendly AI and it went foom and destroyed the world” sort. This sort of thing is dangerous, but a lot of our predictions are really antipredictions and so the negations of the claims are even more questionable once you examine them.
If you have nothing valuable to say, why don’t you stay away from commenting at all? Otherwise you could simply ask me what I meant to say, if something isn’t clear. But those empty statements coming from you recently make me question whether you are the person I thought you were. You cannot even guess what I am trying to ask here? Oh come on...
I was inquiring about the supportive evidence at the origin of your complex multi-step extrapolations argued to be from inductive generalizations. If there isn’t any, what difference is there between writing fiction and complex multi-step extrapolations argued to be from inductive generalizations?
Agreed, and I think there’s a pattern here. XiXiDu is asking the right questions about why SIAI doesn’t have wider support. It is because there are genuine holes in its reasoning about the singularity, and SIAI chooses not to engage with serious criticism that gets at those holes. Example (one of many): I recall Shane Legg commenting that it’s not practical to formalize friendliness before anyone builds any form of AGI (or something to that effect). I haven’t seen SIAI give a good argument to the contrary.
Example (one of many): I recall Shane Legg commenting that it’s not practical to formalize friendliness before anyone builds any form of AGI (or something to that effect). I haven’t seen SIAI give a good argument to the contrary.
Gahhh! The horde of arguments against that idea that instantly sprang to my mind (with warning bells screeching) perhaps hints at why a good argument hasn’t been given to the contrary (if, in fact, it hasn’t). It just seems so obvious. And I don’t mean that as a criticism of you or Shane at all. Most things that we already understand well seem like they should be obvious to others. I agree that there should be a post making the arguments on that topic either here on LessWrong or on the SIAI website somewhere. (Are you sure there isn’t?)
Edit: And you demonstrate here just why Eliezer (or others) should bother to answer XiXiDu’s questions even if there are some weaknesses in his reasoning.
My point is that Shane’s conclusion strikes me as the obvious one, and I believe many smart, rational, informed people would agree. It may be the case that, for the majority of smart, rational, informed people, there exists an issue X for which they think “obviously X” and SIAI thinks “obviously not X.” To be taken seriously, SIAI needs to engage with the X’s.
I understand your point, and agree that your conclusion is one that many smart, rational people with good general knowledge would share. Once again I concur that engaging with those X’s is important, including that ‘X’ we’re discussing here.
Sounds like we mostly agree. However, I don’t think it’s a question of general knowledge. I’m talking about smart, rational people who have studied AI enough to have strongly-held opinions about it. Those are the people who need to be convinced; their opinions propagate to smart, rational people who haven’t personally investigated AI in depth.
I’d love to hear your take on X here. What are your reasons for believing that friendliness can be formalized practically, and an AGI based on that formalization built before any other sort of AGI?
If I were SIAI, my reasoning would be the following. First, stop with the believes/believes-not dichotomy and move to probabilities.
So what is the probability of a good outcome if you can’t formalize friendliness before AGI? Some of them would argue infinitesimal. This is based on fast take-off winner take all type scenarios (I have a problem with this stage, but I would like it to be properly argued and that is hard).
So looking at the decision tree (under these assumptions) the only chance of a good outcome is to try to formalise FAI before AGI becomes well known. All the other options lead to extinction.
So to attack the “formalise Friendliness before AGI” position you would need to argue that the first AGIs are very unlikely to kill us all. That is the major battleground as far as I am concerned.
Agreed about what the “battleground” is, modulo one important nit: not the first AGI, but the first AGI that recursively self-improves at a high speed. (I’m pretty sure that’s what you meant, but it’s important to keep in mind that, e.g., a roughly human-level AGI as such is not what we need to worry about—the point is not that intelligent computers are magically superpowerful, but that it seems dangerously likely that quickly self-improving intelligences, if they arrive, will be non-magically superpowerful.)
I don’t think formalize-don’t formalize should be a simple dichotomy either; friendliness can be formalized in various levels of detail, and the more details are formalized, the fewer unconstrained details there are which could be wrong in a way that kills us all.
I’d look at it the other way: I’d take it as practically certain that any superintelligence built without explicit regard to Friendliness will be unFriendly, and ask what the probability is that through sufficiently slow growth in intelligence and other mere safeguards, we manage to survive building it.
My best hope currently rests on the AGI problem being hard enough that we get uploads first.
(This is essentially the Open Thread about everything Eliezer or SIAI have ever said now, right?)
Uploading would have quite a few benefits, but I get the impression it would make us more vulnerable to whatever tools a hostile AI may possess, not less.
“So what is the probability of a good outcome if you can’t formalize friendliness before AGI? Some of them would argue infinitesimal.”
One problem here is the use of a circular definition of “friendliness”—one that defines the concept in terms of whether it leads to a favourable outcome. If you think “friendly” is defined in terms of whether or not the machine destroys humanity, then clearly you will think that an “unfriendly” machine would destroy the world. However, this is just a word game—which doesn’t tell us anything about the actual chances of such destruction happening.
Let’s say “we” are the good guys in the race for AI. Define
W = we win the race to create an AI powerful enough to protect humanity from any subsequent AIs
G = our AI can be used to achieve a good outcome
F = we go the “formalize friendliness” route
O = we go a promising route other than formalizing friendliness
At issue is which of the following is higher:
P(G|WF)P(W|F) or P(G|WO)P(W|O)
From what I know of SIAI’s approach to F, I estimate P(W|F) to be many orders of magnitude smaller than P(W|O). I estimate P(G|WO) to be more than 1% for a good choice of O (this is a lower bound; my actual estimate of P(G|WO) is much higher, but you needn’t agree with that to agree with my conclusion). Therefore the right side wins.
There are two points here that one could conceivably dispute, but it sounds like the “SIAI logic” is to dispute my estimate of P(G|WO) and say that P(G|WO) is in fact tiny. I haven’t seen SIAI give a convincing argument for that.
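For what it’s worth, here is the comparison above with purely illustrative numbers plugged in; none of these values are endorsed anywhere in the thread, and they only show how a claimed multi-order-of-magnitude gap between P(W|F) and P(W|O) interacts with the 1% lower bound on P(G|WO):

```python
# Compare P(G|WF)*P(W|F) against P(G|WO)*P(W|O) with made-up values, for illustration only.
p_W_given_F = 1e-6    # win the race while insisting on formalizing friendliness first
p_W_given_O = 1e-3    # win the race via some other promising route
p_G_given_WF = 1.0    # assume a win on the formalized route yields a good outcome
p_G_given_WO = 0.01   # the ">1%" lower bound claimed in the parent comment

formalize_first = p_G_given_WF * p_W_given_F
other_route = p_G_given_WO * p_W_given_O
print(f"formalize first: {formalize_first:.1e}  other route: {other_route:.1e}")
# With these inputs the non-formalized route comes out ahead; disputing P(G|WO),
# which is what the parent calls the "SIAI logic", is what could flip the comparison.
```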
My summary would be: there are huge numbers of types of minds and motivations, so if we pick one at random from the space of minds then it is likely to be contrary to our values, because it will have a different sense of what is good or worthwhile. This moderately relies on the speed/singleton issue, because evolutionary pressure between AIs might force them in the same direction as us. We would likely be out-competed before this happens, though, if we rely on competition between AIs.
I think various people associated with SIAI mean different things by formalizing friendliness. I remember Vladimir Nesov means getting better than 50% probability for providing a good outcome.
It doesn’t matter what happens when we sample a mind at random. We only care about the sorts of minds we might build, whether by designing them or evolving them. Either way, they’ll be far from random.
Consider my “at random” short hand for “at random from the space of possible minds built by humans”.
The Eliezer approved example of humans not getting a simple system to do what they want is the classic Machine Learning example where a Neural Net was trained on two different sorts of tanks. It had happened that the photographs of the different types of tanks had been taken at different times of day. So the classifier just worked on that rather than actually looking at the types of tank. So we didn’t build a tank classifier but a day/night classifier. More here.
While I may not agree with Eliezer on everything, I do agree with him that it is damn hard to get a computer to do what you want once you stop programming it explicitly.
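As a toy illustration of the failure mode in the tank story (a synthetic sketch, not the original study): if a spurious feature such as image brightness happens to track the label perfectly in the training data, a standard classifier will learn the shortcut instead of the intended distinction, and then fail once the correlation breaks.

```python
# Synthetic version of the "day/night instead of tank type" failure.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, spurious_tracks_label):
    label = rng.integers(0, 2, n)                    # the tank type we actually care about
    real_feature = label + rng.normal(0, 2.0, n)     # genuine but weak, noisy signal
    if spurious_tracks_label:
        brightness = label + rng.normal(0, 0.1, n)   # "time of day" tracks the label in training
    else:
        brightness = rng.integers(0, 2, n) + rng.normal(0, 0.1, n)  # correlation broken at test time
    return np.column_stack([real_feature, brightness]), label

X_train, y_train = make_data(500, spurious_tracks_label=True)
X_test, y_test = make_data(500, spurious_tracks_label=False)

clf = LogisticRegression().fit(X_train, y_train)
print("train accuracy:", clf.score(X_train, y_train))  # near 1.0, driven by brightness
print("test accuracy:", clf.score(X_test, y_test))     # drops sharply once the shortcut breaks
```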
Obviously AI is hard, and obviously software has bugs.
To counter my argument, you need to make a case that the bugs will be so fundamental and severe, and go undetected for so long, that despite any safeguards we take, they will lead to catastrophic results with probability greater than 99%.
Things like AI boxing or “emergency stop buttons” would be instances of safeguards. Basically any form of human supervision that can keep the AI in check even if it’s not safe to let it roam free.
Are you really suggesting a trial and error approach where we stick evolved and human created AIs in boxes and then eyeball them to see what they are like? Then pick the nicest looking one, on a hunch, to have control over our light cone?
This is why we need to create friendliness before AGI: a lot of people who are only loosely familiar with the subject think those options will work!
A goal directed intelligence will work around any obstacles in front of it. It’ll make damn sure that it prevents anyone from pressing emergency stop buttons.
The first AI will be determined by the first programmer, sure. But I wasn't talking about that level; the biases and concern for the ethics of the AI of that programmer will be random from the space of humans. Or at least I can't see any reason why I should expect people who care about ethics to be more likely to make AI than those who think economics will constrain AI to be nice.
That is now a completely different argument to the original “there are huge numbers of types of minds and motivations, so if we pick one at random from the space of minds”.
Re: “the biases and concern for the ethics of the AI of that programmer will be random from the space of humans”
Those concerned probably have to be expert programmers, able to build a company or research group and attract talented assistance, as well as (probably) customers. They will probably be far from what you would get if you chose at "random".
Do we pick a side of a coin “at random” from the two possibilities when we flip it?
Epistemically, yes: we don't have sufficient information to predict it.* However, if we do the same thing twice it has the same outcome, so it is not physically random.
So while the process that decides what the first AI is like is not physically random, it is epistemically random until we have a good idea of what AIs produce good outcomes and get humans to follow those theories. For this we need something that looks like a theory of friendliness, to some degree.
Considering we might use evolutionary methods for part of the AI creation process, randomness doesn’t look like too bad a model.
*With a few caveats. I think it is biased to land the same way up as it was when flipped, due to the chance of making it spin and not flip.
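A minimal toy sketch of that distinction (invented dynamics, not real coin physics): the outcome is a deterministic function of the initial conditions, so the same throw always lands the same way, yet to someone ignorant of those conditions it is still roughly a 50/50 guess.

import random

def flip(upward_speed, spin_rate):
    """Toy deterministic 'coin': the outcome is fully fixed by the initial conditions."""
    flight_time = 2 * upward_speed           # arbitrary toy dynamics
    half_turns = int(flight_time * spin_rate)
    return "heads" if half_turns % 2 == 0 else "tails"

# Physically: the same throw gives the same result, every time.
print(flip(1.37, 42.0), flip(1.37, 42.0), flip(1.37, 42.0))

# Epistemically: not knowing the initial conditions, the best we can do is ~50/50.
rng = random.Random(0)
outcomes = [flip(rng.uniform(1, 2), rng.uniform(30, 60)) for _ in range(10000)]
print(outcomes.count("heads") / len(outcomes))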
We do have an extensive body of knowledge about how to write computer programs that do useful things. The word “random” seems like a terrible mis-summary of that body of information to me.
As for equating "evolution" with "randomness": isn't that one of the points that creationists make all the time? Evolution has two motors, variation and selection. The first of these may have some random elements, but it is only one part of the overall process.
I think we have a disconnect on how much we believe proper scary AIs will be like previous computer programs.
My conception of current computer programs is that they are crystallised thoughts plucked from our own minds, easily controllable and unchangeable. When we get interesting AI, the programs will be morphing and far less controllable unless we have a good theory of how to control that change.
I shudder every time people say "the AI's source code" as if it were some unchangeable thing that remains informative about the AI's behaviour after the first few days of the AI's existence.
You have correctly identified the area in which we do not agree.
The most relevant knowledge needed in this case is knowledge of game theory and human behaviour. They also need to know ‘friendliness is a very hard problem’. They then need to ask themselves the following question:
What is likely to happen if people have the ability to create an AGI but do not have a proven mechanism for implementing friendliness? Is it:
1. Shelve the AGI, don't share the research and set to work on creating a framework for friendliness. Don't rush the research; act as if the groundbreaking AGI work that you just created was a mere toy problem and the only real challenge is the friendliness. Spend an even longer period of time verifying the friendliness design and never let on that you have AGI capabilities.
2. Something else.
What are your reasons for believing that friendliness can be formalized practically, and an AGI based on that formalization built before any other sort of AGI?
I don't (with that phrasing). I actually suspect that the problem is too difficult to get right and far too easy to get wrong. We're probably all going to die. However, I think we're even more likely to die if some fool goes and invents an AGI before they have a proven theory of friendliness.
Those are the people, indeed. But where do the donations come from? EY seems to be using this argument against me as well. I’m just not educated, well-read or intelligent enough for any criticism. Maybe so, I acknowledged that in my post. But have I seen any pointers to how people arrive at their estimations yet? No, just the demand to read all of LW, which according to EY doesn’t even deal with what I’m trying to figure out, but rather the dissolving of biases. A contradiction?
I'm inquiring about the strong claims made by the SIAI, which includes EY and LW. Why? Because they ask for my money and resources. Because they gather fanatic followers who believe in the possibility of literally going to hell. If you follow the discussion surrounding Roko's posts you'll see what I mean. And because I'm simply curious and like to discuss, besides becoming less wrong.
But if EY or someone else is going to tell me that I'm just too dumb and it doesn't matter what I do, think or donate, I can accept that. I don't expect Richard Dawkins to enlighten me about evolution either. But don't expect me to stay quiet about my insignificant personal opinion and epistemic state (as you like to call it) either! Although since I'm conveniently not neurotypical (I guess), you won't have to worry about me turning into an antagonist simply because EY is being impolite.
The SIAI position does not require "obviously X" from a decision perspective; the opposite one does. To be so sure of something as complicated as the timeline of FAI math vs. AGI development seems seriously foolish to me.
It is not a matter of being sure of it, but of weighing it against what is asked for in return, against other possible events of equal probability, and against the utility payoff from spending the resources on something else entirely.
I'm not asking the SIAI to prove "obviously X", but rather to support the probability of X that they claim it has within the larger context of possibilities.
Capa: It’s the problem right there. Between the boosters and the gravity of the sun the velocity of the payload will get so great that space and time will become smeared together and everything will distort. Everything will be unquantifiable.
Kaneda: You have to come down on one side or the other. I need a decision.
Capa: It’s not a decision, it’s a guess. It’s like flipping a coin and asking me to decide whether it will be heads or tails.
Kaneda: And?
Capa: Heads… We harvested all Earth’s resources to make this payload. This is humanity’s last chance… our last, best chance… Searle’s argument is sound. Two last chances are better than one.
Not being able to calculate chances does not excuse one from using their best de-biased neural machinery to make a guess at a range. IMO 50 years is reasonable (I happen to know something about the state of AI research outside of the FAI framework). I would not roll over in surprise if it's 5 years, given the state of certain technologies.
(I happen to know something about the state of AI research outside of the FAI framework). I would not roll over in surprise if it's 5 years, given the state of certain technologies.
I’m curious, because I like to collect this sort of data: what is your median estimate?
(If you don’t want to say because you don’t want to defend a specific number or list off a thousand disclaimers I completely understand.)
Well it’s clear to me now that formalizing Friendliness with pen and paper is as naively impossible as it would have been for the people of ancient Babylon to actually build a tower that reached the heavens; so if resources are to be spent attempting it, then it’s something that does need to be explicitly argued for.
“By focusing on excessively challenging engineering projects it seems possible that those interested in creating a positive future might actually create future problems – by delaying their projects to the point where less scrupulous rivals beat them to the prize”
Castles in the air. Your claims are based on long chains of reasoning that you do not write down in a formal style. Is the probability of correctness of each link in that chain of reasoning so close to 1, that their product is also close to 1?
I can think of a couple of ways you could respond:
Yes, you are that confident in your reasoning. In that case you could explain why XiXiDu should be similarly confident, or why it’s not of interest to you whether he is similarly confident.
It’s not a chain of reasoning, it’s a web of reasoning, and robust against certain arguments being off. If that’s the case, then we lay readers might benefit if you would make more specific and relevant references to your writings depending on context, instead of encouraging people to read the whole thing before bringing criticisms.
Most of the long arguments are concerned with refuting fallacies and defeating counterarguments, which flawed reasoning will always be able to supply in infinite quantity. The key predictions, when you look at them, generally turn out to be antipredictions, and the long arguments just defeat the flawed priors that concentrate probability into anthropomorphic areas. The positive arguments are simple, only defeating complicated counterarguments is complicated.
“Fast AI” is simply “Most possible artificial minds are unlikely to run at human speed, the slow ones that never speed up will drop out of consideration, and the fast ones are what we’re worried about.”
“UnFriendly AI” is simply “Most possible artificial minds are unFriendly, most intuitive methods you can think of for constructing one run into flaws in your intuitions and fail.”
MWI is simply “Schrodinger’s equation is the simplest fit to the evidence”; there are people who think that you should do something with this equation other than taking it at face value, like arguing that gravity can’t be real and so needs to be interpreted differently, and the long arguments are just there to defeat them.
The only argument I can think of that actually approaches complication is about recursive self-improvement, and even there you can say “we’ve got a complex web of recursive effects and they’re unlikely to turn out exactly exponential with a human-sized exponent”, the long arguments being devoted mainly to defeating the likes of Robin Hanson’s argument for why it should be exponential with an exponent that smoothly couples to the global economy.
One problem I have with your argument here is that you appear to be saying that if XiXiDu doesn’t agree with you, he must be stupid (the stuff about low g etc.). Do you think Robin Hanson is stupid too, since he wasn’t convinced?
If he wasn’t convinced about MWI it would start to become a serious possibility.
I haven’t found the text during a two minute search or so, but I think I remember Robin assigning a substantial probability, say, 30% or so, to the possibility that MWI is false, even if he thinks most likely (i.e. the remaining 70%) that it’s true.
Much as you argued in the post about Einstein's arrogance, there seems to be a small enough difference between a 30% chance of being false and a 90% chance of being false that if the latter would imply that Robin was stupid, the former would imply it too.
I suspect that Robin would not actually act-as-if those odds with a gun to his head, and he is being conveniently modest.
Right: in fact he would act as though MWI is certainly false… or at least as though Quantum Immortality is certainly false, which has a good chance of being true given MWI.
No! He will act as if Quantum Immortality is a bad choice, which is true even if QI works exactly as described. 'True' isn't the right kind of word to use unless you include a normative conclusion in the description of QI.
Consider the Least Convenient Possible World...
Suppose that being shot with the gun cannot possibly have intermediate results: either the gun fails, or he is killed instantly and painlessly.
Also suppose that given that there are possible worlds where he exists, each copy of him only cares about its anticipated experiences, not about the other copies, and that this is morally the right thing to do… in other words, if he expects to continue to exist, he doesn’t care about other copies that cease to exist. This is certainly the attitude some people would have, and we could suppose (for the LCPW) that it is the correct attitude.
Even so, given these two suppositions, I suspect it would not affect his behavior in the slightest, showing that he would be acting as though QI is certainly false, and therefore as though there is a good chance that MWI is false.
But that is crazy and false, and uses 'copies' in a misleading way. Why would I assume that?
This ‘least convenient possible world’ is one in which Robin’s values are changed according to your prescription but his behaviour is not, ensuring that your conclusion is true. That isn’t the purpose of inconvenient worlds (kind of the opposite...)
Not at all. You are conflating “MWI is false” with a whole different set of propositions. MWI != QS.
Many people in fact have those values and opinions, and nonetheless act in the way I mention (and there is no one who does not so act) so it is quite reasonable to suppose that even if Robin’s values were so changed, his behavior would remain unchanged.
The very reason Robin was brought up (by you I might add) was to serve as an ad absurdum with respect to intellectual disrespect.
In the Convenient World where Robin is, in fact, too stupid to correctly tackle the concept of QS, understand the difference between MWI and QI or form a sophisticated understanding of his moral intuitions with respect to quantum uncertainty this Counterfactual-Stupid-Robin is a completely useless example.
I can imagine two different meanings for “not convinced about MWI”
1. It refers to someone who is not convinced that MWI is as good as any other model of reality, and better than most.
2. It refers to someone who is not convinced that MWI describes the structure of reality.
If we are meant to understand the meaning as #1, then it may well indicate that someone is stupid. Though, more charitably, it might more likely indicate that he is ignorant.
If we are meant to understand the meaning as #2, then I think that it indicates someone who is not entrapped by the Mind Projection Fallacy.
What do you mean by belief in MWI? What sort of experiment could settle whether MWI is true or not?
I suspect that a lot of people object to the stuff tacitly built on top of MWI (copies of humans, other worlds we should care about, hypotheses about consciousness) rather than to MWI itself.
From THE EVERETT FAQ:
“Is many-worlds (just) an interpretation?”
http://www.hedweb.com/manworld.htm#interpretation
“What unique predictions does many-worlds make?”
http://www.hedweb.com/manworld.htm#unique
“Could we detect other Everett-worlds?”
http://www.hedweb.com/manworld.htm#detect
I'm not (yet) convinced.
First, the links say that MWI needs a linear quantum theory, and therefore list linearity among its predictions. However, linearity is part of quantum theory and its mathematical formalism, and nothing specific to MWI. Also, weak non-linearity would be explicable in the language of MWI by saying that the different worlds interact a little. I don't see how testing the superposition principle establishes MWI. Very weak evidence at best.
Second, there is a very confused paragraph about quantum gravity, which, apart from linking to itself, states only that MWI requires gravity to be quantised (without supporting argument) and therefore if gravity is successfully quantised, it forms evidence for MWI. However, nobody doubts that gravity has to be quantised somehow, even hardcore Copenhageners.
The most interesting part is the one about the reversible measurement done by an artificial intelligence. As I understand it, it supposes that we construct a machine which could perform measurements in the reversed direction of time, for which it has to be immune to quantum decoherence. It sounds interesting, but is also suspicious. I see no way we can get the information into our brains without decoherence. The argument apparently tries to circumvent this objection by postulating an AI which is reversible and decoherence-immune, but the AI will still face the same problem when trying to tell us the results. In fact, postulating the need for an AI here seems to be only a tool to make the proposed experiment more obscure and difficult to analyse. We will have a "reversible AI", therefore miraculously we will detect differences between Copenhagen and MWI.
However, at least there is a link to Deutsch’s article which hopefully explains the experiment in greater detail, so I will read it and edit the comment later.
“Many-worlds is often referred to as a theory, rather than just an interpretation, by those who propose that many-worlds can make testable predictions (such as David Deutsch) or is falsifiable (such as Everett) or by those who propose that all the other, non-MW interpretations, are inconsistent, illogical or unscientific in their handling of measurements”
http://en.wikipedia.org/wiki/Many-worlds_interpretation
None of the tests in that FAQ look to me like they could distinguish MWI from MWI+worldeater. The closest thing to an experimental test I’ve come up with is the following:
Flip a quantum coin. If heads, copy yourself once, advance both copies enough to observe the result, then kill one of the copies. If tails, do nothing.
In a many-worlds interpretation of QM, from the perspective of the experimenter, the coin will be heads with probability 2⁄3, since there are two observers in that case and only one if the coin was tails. In the single-world case, the coin will be heads with probability 1⁄2. So each time you repeat the experiment, you get 0.4 bits of evidence for or against MWI. Unfortunately, this evidence is also non-transferrable; someone else can’t use your observation as evidence the same way you can. And getting enough evidence for a firm conclusion involves a very high chance of subjective death (though it is guaranteed that exactly one copy will be left behind). And various quantum immortality hypotheses screw up the experiment, too.
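A quick numerical check of those figures, under the stated assumption that the experimenter weights outcomes by the number of surviving observer-copies:

from math import log2

# Probability the experimenter sees "heads", under each hypothesis.
p_heads_many_worlds = 2 / 3   # two observer-copies on heads, one on tails
p_heads_single_world = 1 / 2  # an ordinary fair quantum coin

# Evidence (in bits) contributed by a single observation.
bits_if_heads = log2(p_heads_many_worlds / p_heads_single_world)
bits_if_tails = log2((1 - p_heads_many_worlds) / (1 - p_heads_single_world))

print(f"observing heads: {bits_if_heads:+.3f} bits toward MWI")   # about +0.415
print(f"observing tails: {bits_if_tails:+.3f} bits toward MWI")   # about -0.585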
So it is testable in principle, but the experiment involved is more odious than one would imagine possible.
The math works the same in all interpretations, but some experiments are difficult to understand intuitively without the MWI. I usually give people the example of the Elitzur-Vaidman bomb tester where the easy MWI explanation says “we know the bomb works because it exploded in another world”, but other interpretations must resort to clever intellectual gymnastics.
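For concreteness, here is a minimal numerical sketch of the standard Mach-Zehnder arithmetic behind that example (the amplitudes are textbook and interpretation-neutral; the code itself is just an illustration):

import numpy as np

# Photon state over the two interferometer arms; start in arm 0.
bs = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # 50/50 beam splitter
psi = np.array([1.0, 0.0])

# Dud bomb: nothing measures the photon, so the two arms recombine and interfere.
out = bs @ (bs @ psi)
print("dud bomb, detector probabilities:", np.round(out**2, 3))  # [1, 0]: dark port never fires

# Live bomb in arm 1: it measures which arm the photon took.
psi_after_bs1 = bs @ psi
p_explode = psi_after_bs1[1] ** 2               # photon found in the bomb arm
psi_survived = np.array([1.0, 0.0])             # otherwise it was in arm 0
out = bs @ psi_survived
p_bright = (1 - p_explode) * out[0] ** 2
p_dark = (1 - p_explode) * out[1] ** 2
print("live bomb:", {"explodes": round(p_explode, 3),
                     "bright": round(p_bright, 3),
                     "dark": round(p_dark, 3)})
# Dark-port clicks (probability 1/4) certify a working bomb that never went off.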
If all interpretations are equivalent with respect to testable outcomes, what makes the belief in any particular interpretation so important? Ease of intuitive understanding is a dangerous criterion to rely on, and a relative thing too. Some people are more ready to accept mental gymnastic than existence of another worlds.
Well, that depends. Have you actually tried to do the mental gymnastics and explain the linked experiment using the Copenhagen interpretation? I suspect that going through with that may influence your final opinion.
cousin_it:
Maybe I'm missing something, but how exactly does this experiment challenge the Copenhagen interpretation more than the standard double-slit stuff? Copenhagen treats "measurement" as a fundamental and irreducible process and measurement devices as special components in each experiment, and in this case it simply says that a dud bomb doesn't represent a measurement device, whereas a functioning one does, so that they interact with the photon wavefunction differently. The former leaves it unchanged, while the latter collapses it to one arm of the interferometer: either its own, in which case it explodes, or the other one, in which case it reveals itself as a measurement device just by the act of collapsing.
As far as I understand, this would be similar to the standard variations on the double-slit experiment where one destroys the interference pattern by placing a particle detector at the exit from one of the holes. One could presumably do a similar experiment with a detector that might be faulty, and conclude that an interference-destroying detector works even if it doesn’t flash when several particles are let through (in cases where they all happen to go through the other hole). Unless I’m misunderstanding something, this would be a close equivalent of the bomb test.
The final conclusion in the bomb test is surely more spectacular, but I don’t see how it produces any extra confusion for Copenhageners compared to the most basic QM experiments.
Frankly, I don't know what you consider an explanation here. I am quite comfortable with the prediction the theory gives, and accept that as an explanation. So I never needed mental gymnastics here. The experiment is weird, but saying that the information about the bomb's functionality came from its explosion in the other world doesn't make it seem any less weird to me.
Fair enough.
This should be revamped into a document introducing the sequences.
Your claims are only anti-predictions relative to science-fiction notions of robots as metal men.
Most possible artificial minds are neither Friendly nor unFriendly (unless you adopt such a stringent definition of mind that artificial minds are not going to exist in my lifetime or yours).
Fast AI (along with most of the other wild claims about what future technology will do, really) falls afoul of the general version of Amdahl’s law. (On which topic, did you ever update your world model when you found out you were mistaken about the role of computers in chip design?)
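For readers unfamiliar with the reference: Amdahl's law caps overall speedup by the fraction of a task that cannot be accelerated. A minimal sketch with illustrative numbers (the 90% figure is just an example, not a claim about AI):

def amdahl_speedup(parallel_fraction, factor):
    """Overall speedup when only `parallel_fraction` of the work is sped up by `factor`."""
    return 1 / ((1 - parallel_fraction) + parallel_fraction / factor)

# Even an enormous speedup of 90% of a process leaves the whole thing capped near 10x.
for factor in (10, 100, 1e9):
    print(factor, round(amdahl_speedup(0.9, factor), 2))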
About MWI, I agree with you completely, though I am more hesitant to berate early quantum physicists for not having found it obvious. For a possible analogy: what do you think of my resolution of the Anthropic Trilemma?
This is quite helpful, and suggests that what I wanted is not a lay-reader summary, but an executive summary.
I brought this up elsewhere in this thread, but the fact that quantum mechanics and gravity are not reconciled suggests that even Schrodinger’s equation does not fit the evidence. The “low-energy” disclaimer one has to add is very weird, maybe weirder than any counterintuitive consequences of quantum mechanics.
It’s not the Schrödinger equation alone that gives rise to decoherence and thus many-worlds. (Read Good and Real for another toy model, the “quantish” system.) The EPR experiment and Bell’s inequality can be made to work on macroscopic scales, so we know that whatever mathematical object the universe will turn out to be, it’s not going to go un-quantum on us again: it has the same relevant behavior as the Schrödinger equation, and accordingly MWI will be the best interpretation there as well.
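For readers who want a concrete handle on the Bell-inequality point, here is a small sketch (a standard textbook CHSH calculation, nothing specific to this thread) showing the quantum prediction for a spin singlet exceeding the bound of 2 that any local hidden-variable account must satisfy:

import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def spin(theta):
    """Spin measurement operator along angle theta in the x-z plane."""
    return np.cos(theta) * sz + np.sin(theta) * sx

# Singlet state (|01> - |10>)/sqrt(2)
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def E(a, b):
    """Quantum correlation <psi| spin(a) x spin(b) |psi>."""
    op = np.kron(spin(a), spin(b))
    return np.real(psi.conj() @ op @ psi)

# Standard CHSH angle choices
a, a2, b, b2 = 0, np.pi / 2, np.pi / 4, 3 * np.pi / 4
S = E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)
print(abs(S))  # ~2.828 = 2*sqrt(2), above the local-realist bound of 2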
Speaking of executive summaries, will you offer one for your metaethics?
“There is no intangible stuff of goodness that you can divorce from life and love and happiness in order to ask why things like that are good. They are simply what you are talking about in the first place when you talk about goodness.”
And then the long arguments are about why your brain makes you think anything different.
This is less startling than your more scientific pronouncements. Are there any atheists reading this that find this (or at first found this) very counterintuitive or objectionable?
I would go further, and had the impression from somewhere that you did not go that far. Is that accurate?
I’m a cognitivist. Sentences about goodness have truth values after you translate them into being about life and happiness etc. As a general strategy, I make the queerness go away, rather than taking the queerness as a property of a thing and using it to deduce that thing does not exist; it’s a confusion to resolve, not an existence to argue over.
To be clear, if sentence X about goodness is translated into sentence Y about life and happiness etc., does sentence Y contain the word “good”?
Edit: What’s left of religion after you make the queerness go away? Why does there seem to be more left of morality?
No, nothing, and because while religion does contain some confusion, after you eliminate the confusion you are left with claims that are coherent but false.
I can do that:
Morality is a specific set of values (Or, more precisely, a specific algorithm/dynamic for judging values). Humans happen to be (for various reasons) the sort of beings that value morality as opposed to valuing, say, maximizing paperclip production. It is indeed objectively better (by which we really mean “more moral”/”the sort of thing we should do”) to be moral than to be paperclipish. And indeed we should be moral, where by “should” we mean, “more moral”.
(And moral, when we actually cash out what we actually mean by it, seems to translate to a complicated blob of values like happiness, love, creativity, novelty, self-determination, fairness, life (as in protection thereof), etc...)
It may appear that paperclip beings and moral beings disagree about something, but not really. The paperclippers, once they've analyzed what humans actually mean by "moral", would agree: "yep, humans are more moral than us. But who cares about this morality stuff, it doesn't maximize paperclips!"
Of course, screw the desires of the paperclippers, after all, they’re not actually moral. We really are objectively better (once we think carefully by what we mean by “better”) than them.
(Note: "does something or does something not actually do a good job of fulfilling a certain value?" is an objective question. I.e., "does a particular action tend to increase the expected number of paperclips?" (on the paperclipper side) or, on our side, stuff like "does a particular action tend to save more lives, increase happiness, increase fairness, add novelty..." etc. is an objective question, in that we can extract specific meaning from it and can objectively (in a way the paperclippers would agree with) judge the answer. It simply happens to be that we're the sorts of beings that actually care about the answer (as we should be), while the screwy hypothetical paperclippers are immoral and only care about paperclips.)
How's that, does that make sense? Or, to summarize the summary: "Morality is objective, and we humans happen to be the sorts of beings that value morality, as opposed to valuing something else instead."
Is morality actually:
1. a specific algorithm/dynamic for judging values, or
2. a complicated blob of values like happiness, love, creativity, novelty, self-determination, fairness, life (as in protection thereof), etc.?
If it’s 1, can we say something interesting and non-trivial about the algorithm, besides the fact that it’s an algorithm? In other words, everything can be viewed as an algorithm, but what’s the point of viewing morality as an algorithm?
If it’s 2, why do we think that two people on opposite sides of the Earth are referring to the same complicated blob of values when they say “morality”? I know the argument about the psychological unity of humankind (not enough time for significant genetic divergence), but what about cultural/memetic evolution?
I’m guessing the answer to my first question is something like, morality is an algorithm whose current “state” is a complicated blob of values like happiness, love, … so both of my other questions ought to apply.
Wei_Dai:
You don’t even have to do any cross-cultural comparisons to make such an argument. Considering the insights from modern behavioral genetics, individual differences within any single culture will suffice.
There is no reason to be at all tentative about this. There’s tons of cog sci data about what people mean when they talk about morality. It varies hugely (but predictably) across cultures.
Why are you using algorithm/dynamic here instead of function or partial function? (On what space, I will ignore that issue, just as you have...) Is it supposed to be stateful? I’m not even clear what that would mean. Or is function what you mean by #2? I’m not even really clear on how these differ.
You might have gotten confused because I quoted Psy-Kosh’s phrase “specific algorithm/dynamic for judging values” whereas Eliezer’s original idea I think was more like an algorithm for changing one’s values in response to moral arguments. Here are Eliezer’s own words:
Others have pointed out that this definition is actually quite unlikely to be coherent: people would be likely to be ultimately persuaded by different moral arguments and justifications if they had different experiences and heard arguments in different orders etc.
Yes, see here for an argument to that effect by Marcello and subsequent discussion about it between Eliezer and myself.
I think the metaethics sequence is probably the weakest of Eliezer’s sequences on LW. I wonder if he agrees with that, and if so, what he plans to do about this subject for his rationality book.
This is somewhat of a concern given Eliezer’s interest in Friendliness!
As far as I can understand, Eliezer has promoted two separate ideas about ethics: defining personal morality as a computation in the person’s brain rather than something mysterious and external, and extrapolating that computation into smarter creatures. The former idea is self-evident, but the latter (and, by extension, CEV) has received a number of very serious blows recently. IMO it’s time to go back to the drawing board. We must find some attack on the problem of preference, latch onto some small corner, that will allow us to make precise statements. Then build from there.
But I don’t see how that, by itself, is a significant advance. Suppose I tell you, “mathematics is a computation in a person’s brain rather than something mysterious and external”, or “philosophy is a computation in a person’s brain rather than something mysterious and external”, or “decision making is a computation in a person’s brain rather than something mysterious and external” how much have I actually told you about the nature of math, or philosophy, or decision making?
The linked discussion is very nice.
This is currently at +1. Is that from Yudkowsky?
(Edit: +2 after I vote it up.)
This makes sense in that it is coherent, but it is not obvious to me what arguments would be marshaled in its favor. (Yudkowsky’s short formulations do point in the direction of their justifications.) Moreover, the very first line, “morality is a specific set of values,” and even its parenthetical expansion (algorithm for judging values), seems utterly preposterous to me. The controversies between human beings about which specific sets of values are moral, at every scale large and small, are legendary beyond cliche.
It is a common thesis here that most humans would ultimately have the same moral judgments if they were in full agreement about all factual questions and were better at reasoning. In other words, human brains have a common moral architecture, and disagreements are at the level of instrumental, rather than terminal, values and result from mistaken factual beliefs and reasoning errors.
You may or may not find that convincing (you’ll get to the arguments regarding that if you’re reading the sequences), but assuming that is true, then “morality is a specific set of values” is correct, though vague: more precisely, it is a very complicated set of terminal values, which, in this world, happens to be embedded solely in a species of minds who are not naturally very good at rationality, leading to massive disagreement about instrumental values (though most people do not notice that it’s about instrumental values).
It is? That’s a worry. Consider this a +1 for “That thesis is totally false and only serves signalling purposes!”
I… think it is. Maybe I’ve gotten something terribly wrong, but I got the impression that this is one of the points of the complexity of value and metaethics sequences, and I seem to recall that it’s the basis for expecting humanity’s extrapolated volition to actually cohere.
This whole area isn’t covered all that well (as Wei noted). I assumed that CEV would rely on solving an implicit cooperation problem between conflicting moral systems. It doesn’t appear at all unlikely to me that some people are intrinsically selfish to some degree and their extrapolated volitions would be quite different.
Note that I’m not denying that some people present (or usually just assume) the thesis you present. I’m just glad that there are usually others who argue against it!
That’s exactly what I took CEV to entail.
Now this is a startling claim.
Be more specific!
Maybe it’s true if you also specify “if they were fully capable of modifying their own moral intuitions.” I have an intuition (an unexamined belief? a hope? a sci-fi trope?) that humanity as a whole will continue to evolve morally and roughly converge on a morality that resembles current first-world liberal values more than, say, Old Testament values. That is, it would converge, in the limit of global prosperity and peace and dialogue, and assuming no singularity occurs and the average lifespan stays constant. You can call this naive if you want to; I don’t know whether it’s true. It’s what I imagine Eliezer means when he talks about “humanity growing up together”.
This growing-up process currently involves raising children, which can be viewed as a crude way of rewriting your personality from scratch, and excising vestiges of values you no longer endorse. It’s been an integral part of every culture’s moral evolution, and something like it needs to be part of CEV if it’s going to actually converge.
That’s not plausible. That would be some sort of objective morality, and there is no such thing. Humans have brains, and brains are complicated. You can’t have them imply exactly the same preference.
Now, the non-crazy version of what you suggest is that preferences of most people are roughly similar, that they won’t differ substantially in major aspects. But when you focus on detail, everyone is bound to want their own thing.
Psy-Kosh:
It makes sense in its own terms, but it leaves the unpleasant implication that morality differs greatly between humans, at both individual and group level—and if this leads to a conflict, asking who is right is meaningless (except insofar as everyone can reach an answer that’s valid only for himself, in terms of his own morality).
So if I live in the same society with people whose morality differs from mine, and the good-fences-make-good-neighbors solution is not an option, as it often isn’t, then who gets to decide whose morality gets imposed on the other side? As far as I see, the position espoused in the above comment leaves no other answer than “might is right.” (Where “might” also includes more subtle ways of exercising power than sheer physical coercion, of course.)
That two people mean different things by the same word doesn’t make all questions asked using that word meaningless, or even hard to answer.
If by “castle” you mean “a fortified structure”, while I mean “a fortified structure surrounded by a moat”, who will be right if we’re asked if the Chateau de Gisors is a castle? Any confusion here is purely semantic in nature. If you answer yes and I answer no, we won’t have given two answers to the same question, we’ll have given two answers to two different questions. If Psy-Kosh says that the Chateau de Gisors is a fortified structure but it is not surrounded by a moat, he’ll have answered both our questions.
Now, once this has been clarified, what would it mean to ask who gets to decide whose definition of ‘castle’ gets imposed on the other side? Do we need a kind of meta-definition of castle to somehow figure out what the one true definition is? If I could settle this issue by exercising power over you, would it change the fact that the Chateau de Gisors is not surrounded by a moat? If I killed everyone who doesn’t mean the same thing by the word ‘castle’ than I do, would the sentence “a fortified structure” become logically equivalent to the sentence “a fortified structure surrounded by a moat”?
In short, substituting the meaning of a word for the word tends to make lots of seemingly difficult problems become laughably easy to solve. Try it.
*blinks* how did I imply that morality varies? I thought (was trying to imply) that morality is an absolute standard and that humans simply happen to be the sort of beings that care about the particular standard we call “morality”. (Well, with various caveats like not being sufficiently reflective to be able to fully explicitly state our “morality algorithm”, nor do we fully know all its consequences)
However, when humans and paperclippers interact, well, there will probably be some sort of fight if one doesn't end up with some sort of PD cooperation or whatever. It's not that paperclippers and humans disagree on anything; it's simply, well, they value paperclips a whole lot more than lives. We're sort of stuck with having to act in a way that prevents the hypothetical them from acting on that.
(of course, the notion that most humans seem to have the same underlying core “morality algorithm”, just disagreeing on the implications or such, is something to discuss, but that gets us out of executive summary territory, no?)
Psy-Kosh:
I would say that it’s a crucial assumption, which should be emphasized clearly even in the briefest summary of this viewpoint. It is certainly not obvious, to say the least. (And, for full disclosure, I don’t believe that it’s a sufficiently close approximation of reality to avoid the problem I emphasized above.)
Hrm, fair enough. I thought I’d effectively implied it, but apparently not sufficiently.
(Incidentally… you don’t think it’s a close approximation to reality? Most humans seem to value (to various extents) happiness, love, (at least some) lives, etc… right?)
Different people (and cultures) seem to put very different weights on these things.
Here’s an example:
You’re a government minister who has to decide who to hire to do a specific task. There are two applicants. One is your brother, who is marginally competent at the task. The other is a stranger with better qualifications who will probably be much better at the task.
The answer is “obvious.”
In some places, “obviously” you hire your brother. What kind of heartless bastard won’t help out his own brother by giving him a job?
In others, “obviously” you should hire the stranger. What kind of corrupt scoundrel abuses his position by hiring his good-for-nothing brother instead of the obviously superior candidate?
Okay, I can see how XiXiDu’s post might come across that way. I think I can clarify what I think that XiXiDu is trying to get at by asking some better questions of my own.
1. What evidence has SIAI presented that the Singularity is near?
2. If the Singularity is near then why has the scientific community missed this fact?
3. What evidence has SIAI presented for the existence of grey goo technology?
4. If grey goo technology is feasible then why has the scientific community missed this fact?
5. Assuming that the Singularity is near, what evidence is there that SIAI has a chance to lower global catastrophic risk in a nontrivial way?
6. What evidence is there that SIAI has room for more funding?
“Near”? Where’d we say that? What’s “near”? XiXiDu thinks we’re Kurzweil?
What kind of evidence would you want aside from a demonstrated Singularity?
Grey goo? Huh? What’s that got to do with us? Read Nanosystems by Eric Drexler or Freitas on “global ecophagy”. XiXiDu thinks we’re Foresight?
If this business about “evidence” isn’t a demand for particular proof, then what are you looking for besides not-further-confirmed straight-line extrapolations from inductive generalizations supported by evidence?
You've claimed in your Bloggingheads diavlog with Scott Aaronson that you think it's pretty obvious that there will be an AGI within the next century. As far as I know you have not offered a detailed description of the reasoning that led you to this conclusion that can be checked by others.
I see this as significant for the reasons given in my comment here.
I don't know what the situation is with SIAI's position on grey goo. I've heard people say the SIAI staff believe nanotechnology will have capabilities out of line with the beliefs of the scientific community, but they may have been misinformed. So let's forget about questions 3 and 4.
Questions 1, 2, 5 and 6 remain.
You’ve shifted the question from “is SIAI on balance worth donating to” to “should I believe everything Eliezer has ever said”.
The point is that grey goo is not relevant to SIAI’s mission (apart from being yet another background existential risk that FAI can dissolve). “Scientific community” doesn’t normally professionally study (far) future technological capabilities.
My whole point about grey goo has been, as stated, that a possible superhuman AI could use it to do really bad things. That is, I do not see how an encapsulated AI, even a superhuman AI, could pose the stated risks without the use of advanced nanotechnology. Is it going to use nukes, like Skynet? Another question related to the SIAI, regarding advanced nanotechnology, is whether superhuman AI is at all possible without advanced nanotechnology.
I'm shocked at how you people misinterpreted my intentions there.
If a superhuman AI is possible without advanced nanotechnology, a superhuman AI could just invent advanced nanotechnology and implement it.
Grey goo is only a potential danger in its own right because it’s a way dumb machinery can grow in destructive power (you don’t need to assume AI controlling it for it to be dangerous, at least so goes the story). AGI is not dumb, so it can use something more fitting to precise control than grey goo (and correspondingly more destructive and feasible).
The grey goo example was chosen to exemplify the speed and sophistication of nanotechnology that would have to be around either to allow an AI to be built in the first place or to be of considerable danger.
I consider your comment an expression of personal disgust. There is no way you could possibly misinterpret my original point and subsequent explanation to this extent.
As katydee pointed out, if for some strange reason grey goo is what AI would want, AI will invent grey goo. If you used “grey goo” to refer to the rough level of technological development necessary to produce grey goo, then my comments missed that point.
Illusion of transparency. Since the general point about nanotech seems equally wrong to me, I couldn’t distinguish between the error of making it and making a similarly wrong point about the relevance of grey goo in particular. In general, I don’t plot, so take my words literally. If I don’t like something, I just say so, or keep silent.
If it seems equally wrong, why haven’t you pointed me to some further reasoning on the topic regarding the feasibility of AGI without advanced (grey goo level) nanotechnology? Why haven’t you argued about the dangers of AGI which is unable to make use of advanced nanotechnology? I was inquiring about these issues in my original post and not trying to argue against the scenarios in question.
Yes, I’ve seen the comment regarding the possible invention of advanced nanotechnology by AGI. If AGI needs something that isn’t there it will just pull it out of its hat. Well, I have my doubts that even a superhuman AGI can steer the development of advanced nanotechnology so that it can gain control of it. Sure, it might solve the problems associated with it and send the solutions to some researcher. Then it could buy the stocks of the subsequent company involved with the new technology and somehow gain control...well, at this point we are already deep into subsequent reasoning about something shaky that at the same time is used as evidence of the very reasoning involving it.
To the point: if AGI can't pose a danger, because its hands are tied, that's wonderful! Then we have more time to work on FAI. FAI is not about superpowerful robots; it's about technically understanding what we want, and using that understanding to automate the manufacturing of goodness. The power is expected to come from unbounded automatic goal-directed behavior, something that happens without humans in the system to ever stop the process if it goes wrong.
To the point: if AI can't pose a danger, because its hands are tied, that's wonderful! Then we have more time to work on FAI.
Overall I’d feel a lot more comfortable if you just said “there’s a huge amount of uncertainty as to when existential risks will strike and which ones will strike, I don’t know whether or not I’m on the right track in focusing on Friendly AI or whether I’m right about when the Singularity will occur, I’m just doing the best that I can.”
This is largely because of the issue that I raise here
I should emphasize that I don’t think that you’d ever knowingly do something that raised existential risk, I think that you’re a kind and noble spirit. But I do think I’m raising a serious issue which you’ve missed.
Edit: See also these comments
I am looking for the evidence in "supported by evidence". I am further trying to figure out how you anticipate your beliefs to pay rent, what you anticipate to see if explosive recursive self-improvement is possible, and how that belief could be surprised by data.
If you just say, “I predict we will likely be wiped out by badly done AI.”, how do you expect to update on evidence? What would constitute such evidence?
I haven’t done the reading. For further explanation read this comment.
Why do you always and exclusively mention Charles Stross? I need to know if you actually read all of my post.
Because the fact that you’re mentioning Charles Stross means that you need to do basic reading, not complicated reading.
To put my own spin on XiXiDu’s questions: What quality or position does Charles Stross possess that should cause us to leave him out of this conversation (other than the quality ‘Eliezer doesn’t think he should be mentioned’)?
Another vacuous statement. I expected more.
What stronger points are you referring to? It seems to me XiXiDu’s post has only 2 points, both of which Eliezer addressed:
“Given my current educational background and knowledge I cannot differentiate LW between a consistent internal logic, i.e. imagination or fiction and something which is sufficiently based on empirical criticism to provide a firm substantiation of the strong arguments for action that are proclaimed on this site.”
His smart friends/favorite SF writers/other AI researchers/other Bayesians don’t support SIAI.
My point is that your evidence has to stand up to whatever estimations you come up with. My point is the missing transparency in your decision making regarding the possibility of danger posed by superhuman AI. My point is that any form of external peer review is missing and that therefore I either have to believe you or learn enough to judge all of your claims myself after reading hundreds of posts and thousands of documents to find some pieces of evidence hidden beneath. My point is that competition is necessary, that not just the SIAI should work on the relevant problems. There are many other points you seem to be missing entirely.
“Is the SIAI evidence-based, or merely following a certain philosophy?”
Oh, is that the substantive point? How the heck was I supposed to know you were singling that out?
That one’s easy: We’re doing complex multi-step extrapolations argued to be from inductive generalizations themselves supported by the evidence, which can’t be expected to come with experimental confirmation of the “Yes, we built an unFriendly AI and it went foom and destroyed the world” sort. This sort of thing is dangerous, but a lot of our predictions are really antipredictions and so the negations of the claims are even more questionable once you examine them.
If you have nothing valuable to say, why don’t you stay away from commenting at all? Otherwise you could simply ask me what I meant to say, if something isn’t clear. But those empty statements coming from you recently make me question if you’ve been the person that I thought you are. You cannot even guess what I am trying to ask here? Oh come on...
I was inquiring about the supportive evidence at the origin of your complex multi-step extrapolations argued to be from inductive generalizations. If there isn’t any, what difference is there between writing fiction and complex multi-step extrapolations argued to be from inductive generalizations?
What you say here makes sense, sorry for not being more clear earlier. See my list of questions in my response to another one of your comments.
How was Eliezer supposed to answer that, given that XiXiDu stated that he didn’t have enough background knowledge to evaluate what’s already on LW?
Agreed, and I think there’s a pattern here. XiXiDu is asking the right questions about why SIAI doesn’t have wider support. It is because there are genuine holes in its reasoning about the singularity, and SIAI chooses not to engage with serious criticism that gets at those holes. Example (one of many): I recall Shane Legg commenting that it’s not practical to formalize friendliness before anyone builds any form of AGI (or something to that effect). I haven’t seen SIAI give a good argument to the contrary.
Gahhh! The horde of arguments against that idea that instantly sprang to my mind (with warning bells screeching) perhaps hints at why a good argument hasn't been given to the contrary (if, in fact, it hasn't). It just seems so obvious. And I don't mean that as a criticism of you or Shane at all. Most things that we already understand well seem like they should be obvious to others. I agree that there should be a post making the arguments on that topic either here on LessWrong or on the SIAI website somewhere. (Are you sure there isn't?)
Edit: And you demonstrate here just why Eliezer (or others) should bother to answer XiXiDu’s questions even if there are some weaknesses in his reasoning.
My point is that Shane’s conclusion strikes me as the obvious one, and I believe many smart, rational, informed people would agree. It may be the case that, for the majority of smart, rational, informed people, there exists an issue X for which they think “obviously X” and SIAI thinks “obviously not X.” To be taken seriously, SIAI needs to engage with the X’s.
I understand your point, and agree that your conclusion is one that many smart, rational people with good general knowledge would share. Once again I concur that engaging with those X’s is important, including that ‘X’ we’re discussing here.
Sounds like we mostly agree. However, I don’t think it’s a question of general knowledge. I’m talking about smart, rational people who have studied AI enough to have strongly-held opinions about it. Those are the people who need to be convinced; their opinions propagate to smart, rational people who haven’t personally investigated AI in depth.
I’d love to hear your take on X here. What are your reasons for believing that friendliness can be formalized practically, and an AGI based on that formalization built before any other sort of AGI?
If I were SIAI, my reasoning would be the following. First, stop with the believes/believes-not dichotomy and move to probabilities.
So what is the probability of a good outcome if you can’t formalize friendliness before AGI? Some of them would argue infinitesimal. This is based on fast take-off winner take all type scenarios (I have a problem with this stage, but I would like it to be properly argued and that is hard).
So looking at the decision tree (under these assumptions) the only chance of a good outcome is to try to formalise FAI before AGI becomes well known. All the other options lead to extinction.
So to attack the “formalise Friendliness before AGI” position you would need to argue that the first AGIs are very unlikely to kill us all. That is the major battleground as far as I am concerned.
Agreed about what the “battleground” is, modulo one important nit: not the first AGI, but the first AGI that recursively self-improves at a high speed. (I’m pretty sure that’s what you meant, but it’s important to keep in mind that, e.g., a roughly human-level AGI as such is not what we need to worry about—the point is not that intelligent computers are magically superpowerful, but that it seems dangerously likely that quickly self-improving intelligences, if they arrive, will be non-magically superpowerful.)
I don’t think formalize/don’t-formalize should be a simple dichotomy either; friendliness can be formalized at various levels of detail, and the more details are formalized, the fewer unconstrained details there are that could be wrong in a way that kills us all.
I’d look at it the other way: I’d take it as practically certain that any superintelligence built without explicit regard to Friendliness will be unFriendly, and ask what the probability is that through sufficiently slow growth in intelligence and other mere safeguards, we manage to survive building it.
My best hope currently rests on the AGI problem being hard enough that we get uploads first.
(This is essentially the Open Thread about everything Eliezer or SIAI have ever said now, right?)
Uploading would have quite a few benefits, but I get the impression it would make us more vulnerable to whatever tools a hostile AI may possess, not less.
Re: “My best hope currently rests on the AGI problem being hard enough that we get uploads first.”
Surely a minuscule chance. It would be like Boeing booting up a scanned bird.
“So what is the probability of a good outcome if you can’t formalize friendliness before AGI? Some of them would argue infinitesimal.”
One problem here is the use of a circular definition of “friendliness”, one that defines the concept in terms of whether it leads to a favourable outcome. If you think “friendly” is defined by whether or not the machine destroys humanity, then of course you will think that an “unfriendly” machine would destroy the world. But that is just a word game; it tells us nothing about the actual chances of such destruction happening.
Let’s say “we” are the good guys in the race for AI. Define
W = we win the race to create an AI powerful enough to protect humanity from any subsequent AIs
G = our AI can be used to achieve a good outcome
F = we go the “formalize friendliness” route
O = we go a promising route other than formalizing friendliness
At issue is which of the following is higher:
P(G|WF)P(W|F) or P(G|WO)P(W|O)
From what I know of SIAI’s approach to F, I estimate P(W|F) to be many orders of magnitude smaller than P(W|O). I estimate P(G|WO) to be more than 1% for a good choice of O (this is a lower bound; my actual estimate of P(G|WO) is much higher, but you needn’t agree with that to agree with my conclusion). Therefore the right side wins.
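To make the comparison concrete, here is a minimal sketch with placeholder figures; only the 1% lower bound for P(G|WO) is taken from the estimate above, the rest are illustrative assumptions:

    # Which is higher: P(G|WF) * P(W|F) or P(G|WO) * P(W|O)?
    # All figures are placeholders for illustration, except p_G_given_WO,
    # which uses the 1% lower bound stated above.
    p_W_given_F = 1e-4    # chance of winning the race via the "formalize friendliness" route
    p_W_given_O = 1e-1    # chance of winning the race via a promising other route
    p_G_given_WF = 0.9    # chance of a good outcome, given we win having formalized friendliness
    p_G_given_WO = 0.01   # lower bound: "more than 1%"

    formalize_route = p_G_given_WF * p_W_given_F   # 9e-5 with these numbers
    other_route = p_G_given_WO * p_W_given_O       # 1e-3 with these numbers
    print("formalize route:", formalize_route)
    print("other route:", other_route)
    print("winner:", "other route" if other_route > formalize_route else "formalize route")

Whether the conclusion holds clearly turns on how many orders of magnitude really separate P(W|F) from P(W|O), which is exactly the estimate in dispute.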
There are two points here that one could conceivably dispute, but it sounds like the “SIAI logic” is to dispute my estimate of P(G|WO) and say that P(G|WO) is in fact tiny. I haven’t seen SIAI give a convincing argument for that.
I’d start here to get an overview.
My summary would be: there are huge numbers of types of minds and motivations, so if we pick one at random from the space of minds then it is likely to be contrary to our values, because it will have a different sense of what is good or worthwhile. This relies somewhat on the speed/singleton issue, because evolutionary pressure between AIs might force them in the same direction as us; we would likely be out-competed before that happens, though, if we rely on competition between AIs.
I think various people associated with SIAI mean different things by formalizing friendliness. I recall Vladimir Nesov taking it to mean getting a better than 50% probability of a good outcome.
Edited to add my own overview.
It doesn’t matter what happens when we sample a mind at random. We only care about the sorts of minds we might build, whether by designing them or evolving them. Either way, they’ll be far from random.
Consider my “at random” shorthand for “at random from the space of possible minds built by humans”.
The Eliezer-approved example of humans not getting a simple system to do what they want is the classic machine-learning story in which a neural net was trained to distinguish two different sorts of tanks. It happened that the photographs of the two types of tank had been taken at different times of day, so the classifier keyed on lighting rather than on the tanks themselves. We ended up with not a tank classifier but a day/night classifier. More here.
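For what it’s worth, this failure mode is easy to reproduce on toy data. The sketch below is my own illustration (not the original study): a logistic regression is given a weak “tank” cue plus a brightness feature that correlates perfectly with the label in the training photos, and it learns the lighting rather than the tanks:

    # Toy illustration of the "tank vs. time of day" failure mode.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 500

    def make_data(brightness_matches_label):
        y = rng.integers(0, 2, n)                  # tank type: 0 or 1
        tank_cue = y + rng.normal(0, 2.0, n)       # weak, noisy "real" feature
        if brightness_matches_label:
            brightness = y.astype(float)           # one type photographed by day, the other by night
        else:
            brightness = rng.integers(0, 2, n).astype(float)  # correlation broken
        return np.column_stack([brightness, tank_cue]), y

    X_train, y_train = make_data(True)   # training photos: brightness tracks the label
    X_test, y_test = make_data(False)    # new photos: it does not

    clf = LogisticRegression().fit(X_train, y_train)
    print("train accuracy:", clf.score(X_train, y_train))  # close to 1.0
    print("test accuracy:", clf.score(X_test, y_test))     # typically near chance

The classifier did exactly what it was told to do with the data it was given; it just wasn’t what we wanted.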
While I may not agree with Eliezer on everything, I do agree with him that it is damn hard to get a computer to do what you want once you stop programming it explicitly.
Obviously AI is hard, and obviously software has bugs.
To counter my argument, you need to make a case that the bugs will be so fundamental and severe, and go undetected for so long, that despite any safeguards we take, they will lead to catastrophic results with probability greater than 99%.
How do you consider “formalizing friendliness” to be different from “building safeguards”?
Things like AI boxing or “emergency stop buttons” would be instances of safeguards. Basically any form of human supervision that can keep the AI in check even if it’s not safe to let it roam free.
Are you really suggesting a trial and error approach where we stick evolved and human created AIs in boxes and then eyeball them to see what they are like? Then pick the nicest looking one, on a hunch, to have control over our light cone?
I’ve never seen the appeal of AI boxing.
This is why we need to create friendliness before AGI: a lot of people who are loosely familiar with the subject think those options will work!
A goal directed intelligence will work around any obstacles in front of it. It’ll make damn sure that it prevents anyone from pressing emergency stop buttons.
Better than chance? What chance?
Sorry, “better than chance” is an English phrase that tends to mean more than 50%.
It assumes an even chance of each outcome, i.e., doing better than selecting randomly.
Not appropriate in this context, my brain didn’t think of the wider implications as it wrote it.
It’s easy to do better than random. *Pours himself a cup of tea.*
Programmers do not operate by “picking programs at random”, though.
The idea that “picking programs at random” has anything to do with the issue seems just confused to me.
The first AI will be determined by the first programmer, sure. But I wasn’t talking about that level; the biases and concern for the ethics of the AI of that programmer will be random from the space of humans. Or at least I can’t see any reason to expect people who care about ethics to be more likely to make AI than those who think economics will constrain AI to be nice.
That is now a completely different argument to the original “there are huge numbers of types of minds and motivations, so if we pick one at random from the space of minds”.
Re: “the biases and concern for the ethics of the AI of that programmer will be random from the space of humans”
Those concerned probably have to be expert programmers, able to build a company or research group and to attract talented assistance, and probably customers as well. They will probably be far from what you would get if you chose at “random”.
Do we pick a side of a coin “at random” from the two possibilities when we flip it?
Epistemically, yes: we don’t have sufficient information to predict it.* However, if we do exactly the same thing twice it has the same outcome, so it is not physically random.
So while the process that decides what the first AI is like is not physically random, it is epistemically random until we have a good idea of which AIs produce good outcomes and get humans to follow those theories. For this we need something that looks, to some degree, like a theory of friendliness.
Considering we might use evolutionary methods for part of the AI creation process, randomness doesn’t look like too bad a model.
*With a few caveats. I think it is biased to land the same way up as it was when flipped, due to the chance of making it spin and not flip.
Edit: Oh and no open source AI then?
We do have an extensive body of knowledge about how to write computer programs that do useful things. The word “random” seems like a terrible mis-summary of that body of information to me.
As for “evolution” being equated with “randomness”: isn’t that one of the points that creationists make all the time? Evolution has two motors, variation and selection. The first of these may have some random elements, but it is only one part of the overall process.
I think we have a disconnect on how much we believe proper scary AIs will be like previous computer programs.
My conception of current computer programs is that they are crystallised thoughts plucked from our own minds: easily controllable and static. When we get interesting AI, the programs will be continually morphing and far less controllable without a good theory of how to control that change.
I shudder every time people say “the AI’s source code” as if it were some unchanging thing that remains informative about the AI’s behaviour after the first few days of the AI’s existence.
I’m not sure how to resolve that difference.
You have correctly identified the area in which we do not agree.
The most relevant knowledge needed in this case is knowledge of game theory and human behaviour. They also need to know ‘friendliness is a very hard problem’. They then need to ask themselves the following question:
What is likely to happen if people have the ability to create an AGI but do not have a proven mechanism for implementing friendliness? Is it:
1. Shelve the AGI, don’t share the research, and set to work on creating a framework for friendliness. Don’t rush the research: act as if the groundbreaking AGI work you just created were a mere toy problem and the only real challenge is the friendliness. Spend an even longer period of time verifying the friendliness design, and never let on that you have AGI capabilities.
2. Something else?
I don’t (with that phrasing). I actually suspect that the problem is too difficult to get right and far too easy to get wrong. We’re probably all going to die. However, I think we’re even more likely to die if some fool goes and invents an AGI before they have a proven theory of friendliness.
Those are the people, indeed. But where do the donations come from? EY seems to be using this argument against me as well: I’m just not educated, well-read, or intelligent enough to offer any criticism. Maybe so; I acknowledged that in my post. But have I seen any pointers to how people arrive at their estimates yet? No, just the demand to read all of LW, which according to EY doesn’t even deal with what I’m trying to figure out but rather with dissolving biases. A contradiction?
I’m inquiring about the strong claims made by the SIAI, which includes EY and LW. Why? Because they ask for my money and resources. Because they gather fanatic followers who believe in the possibility of literally going to hell. If you follow the discussion surrounding Roko’s posts you’ll see what I mean. And because I’m simply curious and like to discuss, besides becoming less wrong.
But if EY or someone else is going to tell me that I’m just too dumb and it doesn’t matter what I do, think or donate, I can accept that. I don’t expect Richard Dawkins to enlighten me about evolution either. But don’t expect me to stay quiet about my insignificant personal opinion and epistemic state (as you like to call it) either! Although since I’m conveniently not neurotypical (I guess), you won’t have to worry about me turning into an antagonist simply because EY is being impolite.
The SIAI position does not require “obviously X” from a decision perspective; the opposite one does. To be so sure of something as complicated as the timeline of FAI math vs. AGI development seems seriously foolish to me.
It is not a matter of being sure of it but of weighing it against what is asked for in return, against other possible events of equal probability, and against the utility payoff from spending the resources on something else entirely.
I’m not asking the SIAI to prove “obviously X” but rather to support the probability of X that they claim, within the larger context of possibilities.
No such proof is possible with our machinery.
=======================================================
Capa: It’s the problem right there. Between the boosters and the gravity of the sun the velocity of the payload will get so great that space and time will become smeared together and everything will distort. Everything will be unquantifiable.
Kaneda: You have to come down on one side or the other. I need a decision.
Capa: It’s not a decision, it’s a guess. It’s like flipping a coin and asking me to decide whether it will be heads or tails.
Kaneda: And?
Capa: Heads… We harvested all Earth’s resources to make this payload. This is humanity’s last chance… our last, best chance… Searle’s argument is sound. Two last chances are better than one.
=====================================================
(Sunshine 2007)
Not being able to calculate chances does not excuse one from using one’s best de-biased neural machinery to make a guess at a range. IMO 50 years is reasonable (I happen to know something about the state of AI research outside of the FAI framework). I would not fall over in surprise if it’s 5 years, given the state of certain technologies.
I’m curious, because I like to collect this sort of data: what is your median estimate?
(If you don’t want to say because you don’t want to defend a specific number or list off a thousand disclaimers I completely understand.)
Median 15-20 years. I’m not really an expert, but certain technologies are coming really close to modeling cognition as I understand it.
Thanks!
Well it’s clear to me now that formalizing Friendliness with pen and paper is as naively impossible as it would have been for the people of ancient Babylon to actually build a tower that reached the heavens; so if resources are to be spent attempting it, then it’s something that does need to be explicitly argued for.
“By focusing on excessively challenging engineering projects it seems possible that those interested in creating a positive future might actually create future problems – by delaying their projects to the point where less scrupulous rivals beat them to the prize”
http://www.acceleratingfuture.com/michael/blog/2009/12/a-short-introduction-to-coherent-extrapolated-volition-cev/