Eliezer Yudkowsky Is Frequently, Confidently, Egregiously Wrong
Introduction
Edit 8/27: I think the tone of this post was not ideal, changing much of it!
“After many years, I came to the conclusion that everything he says is false. . . . Every one of his arguments was tinged and coded with falseness and pretense. It was like playing chess with extra pieces. It was all fake.”
—Paul Postal (talking about Chomsky). (Note: this is not exactly how I feel about Yudkowsky; I don’t think he’s knowingly dishonest, but I thought it was a good quote and it partially captures my attitude towards him.)
Crosspost of this on my blog.
In the days of my youth, about two years ago, I was a big fan of Eliezer Yudkowsky. I read his many, many writings religiously, and thought that he was right about most things. In my final year of high school debate, I read a case that relied crucially on the many worlds interpretation of quantum physics—and that was largely a consequence of reading through Eliezer’s quantum physics sequence. In fact, Eliezer’s memorable phrasing that the many worlds interpretation “wins outright given the current state of evidence,” was responsible for the title of my 44-part series arguing for utilitarianism titled “Utilitarianism Wins Outright.” If you read my early articles, you can find my occasional blathering about reductionism and other features that make it clear that my worldview was at least somewhat influenced by Eliezer.
But as I grew older and learned more, I came to conclude that much of what he said was deeply implausible.
Eliezer sounds good whenever he’s talking about a topic that I don’t know anything about. I know nothing about quantum physics, and he sounds persuasive when talking about quantum physics. But almost every single time he talks about a topic that I do know something about (with perhaps one or two exceptions, and setting aside his general advice about how to reason better), what he says is completely unreasonable. It is not just that I always end up disagreeing with him; it is that he asserts falsehood after falsehood, frequently making it clear that he is out of his depth. With few exceptions, whenever I know anything about a topic he talks about, it becomes clear that his view is confidently held but very implausible.
Why am I writing a hit piece on Yudkowsky? I certainly don’t hate him. In fact, I’d guess that I agree with him much more than almost all people on earth. Most people believe lots of outrageous falsehoods. And I think that he has probably done more good than harm for the world by sounding the alarm about AI, which is a genuine risk. And I quite enjoy his scrappy, willing-to-be-contrarian personality. So why him?
Part of this is caused by personal irritation. Each time I hear some rationalist blurt out “consciousness is just what an algorithm feels like from the inside,” I lose a year of my life and my blood pressure doubles (some have hypothesized that the explanation for the year of lost life involves the doubling of my blood pressure). And I spend much more time listening to Yudkowsky’s followers say things that I think are false than I do listening to most other people.
But a lot of it is that Yudkowsky has the ear of many influential people. He is one of the most influential AI ethicists around. Many people, my younger self included, have had their formative years hugely shaped by Yudkowsky’s views—on tons of topics. As Eliezer says:
In spite of how large my mistakes were, those two years of blog posting appeared to help a surprising number of people a surprising amount.
Quadratic Rationality expresses a common sentiment: that the sequences, written by Eliezer, have significantly shaped their worldview and that of many others. Eliezer is a hugely influential thinker, especially among effective altruists, who punch above their weight in terms of influence.
And Eliezer does often offer good advice. He is right that people often reason poorly, and there are ways people can improve their thinking. Humans are riddled with biases, and it’s worth reflecting on how that distorts our beliefs. I thus feel about him much like I do about Jordan Peterson—he provides helpful advice, but the more you listen, the more he sells you on a variety of deeply implausible, controversial views that have nothing to do with the self-help advice.
And the negative effects of Eliezer’s false beliefs have been significant. I’ve heard lots of people say that they’re not vegan because of Eliezer’s views on animal consciousness—views that are utterly nutty, as we’ll see. It is bad that many more people torture sentient beings on account of utterly loony beliefs about consciousness. Many people think that they won’t live to be 40 because they’re almost certain that AI will kill everyone, on account of Eliezer’s reasoning, and of deference to Eliezer more broadly. Thinking that we all die soon can’t be good for mental health.
Eliezer’s influence is responsible for a narrow, insular way of speaking among effective altruists. It’s common to hear, at EA Globals, peculiar LessWrong-speak; something that is utterly antithetical to the goal of bringing new, normal non-nerds into the effective altruism movement. This is a point that I will assert without argument, just based on my own sense of things—LessWrong-speak masks confusion more than it enables understanding. People feel as though they’ve dissolved the hard problem simply by declaring that consciousness is what an algorithm feels like from the inside.
In addition, Eliezer’s views have undermined widespread trust in experts. They result in people thinking that they know better than David Chalmers about non-physicalism—that clever philosophers of mind are just morons who aren’t smart enough to understand Eliezer’s anti-zombie argument. Eliezer’s confident table-pounding about quantum physics leads people to think that physicists are morons, incapable of understanding basic arguments. This undermining of trust in genuine authority results in lots of rationalists holding genuinely wacky views—if you think you are smarter than the experts, you are likely to believe crazy things.
Eliezer has swindled many of the smartest people into believing a whole host of wildly implausible things. Some of my favorite writers—e.g. Scott Alexander—seem to revere Eliezer. It’s about time someone pointed out his many false beliefs, the evaluation of which is outside the normal competence of most people, who do not know much about these niche topics. If one of the world’s most influential thinkers is just demonstrably wrong about lots of topics, often in ways so egregious that they demonstrate very basic misunderstandings, then that’s quite newsworthy, just as it would be if a presidential candidate supported a slate of terrible policies.
The aim of this article is not to show that Eliezer is some idiot who is never right about anything. Instead, it is to show that, on many topics, including ones where he describes agreement with his position as a litmus test for sanity, Eliezer is both immensely overconfident and demonstrably wrong. I think people, when they hear Eliezer express some view about a topic with which they’re unfamiliar, have roughly the following thought process:
Oh jeez, Eliezer thinks that most of the experts who think X are mistaken. I guess I should take seriously the hypothesis that X is wrong and that Eliezer has correctly identified an error in their reasoning. This is especially so given that he sounds convincing when he talks about X.
I think that instead they should have the following thought process:
I’m not an expert about X, but it seems like most of the experts about X think X or are unsure about it. The fact that Eliezer, who often veers sharply off-the-rails, thinks X gives me virtually no evidence about X. Eliezer, while being quite smart, is not rational enough to be worthy of significant deference on any subject, especially those subjects outside his area of expertise. Still though, he has some interesting things to say about AI and consequentialism that are sort of convincing. So it’s not like he’s wrong about everything or is a total crank. But he’s wrong enough, in sufficiently egregious ways, that I don’t really care what he thinks.
Eliezer is ridiculously overconfident and has a mediocre track record
Even the people who like Eliezer think that he’s wildly overconfident about lots of things. This is not without justification. Ben Garfinkel has a nice post on the EA forum running through Eliezer’s many, many mistaken beliefs that he held with very high confidence. Garfinkel suggests:
I think these examples suggest that (a) his track record is at best fairly mixed and (b) he has some tendency toward expressing dramatic views with excessive confidence.
Garfinkel runs through a series of incorrect predictions Eliezer has made. He predicted that nanotech would kill us all by 2010. Now, this was up until about 1999, when he was only about 20. So it’s not as probative as it would be if he made that prediction in 2005, for instance. But . . . still. If a guy has already incorrectly predicted that some technology would probably kill us soon, backed up by a rich array of arguments, and now he is predicting that some technology will kill us soon, backed up by a rich array of arguments, a reasonable inference is that, just like financial speculators who constantly predict recessions, this guy just has a bad habit of overpredicting doom.
I will not spend very much time talking about Eliezer’s views about AI, because they’re outside my area of expertise. But it’s worth noting that lots of people who know a lot about AI seem to think that Eliezer is ridiculously overconfident about AI. Jacob Cannell writes, in a detailed post arguing against Eliezer’s model:
My skill points instead have gone near exclusively towards extensive study of neuroscience, deep learning, and graphics/GPU programming. More than most, I actually have the depth and breadth of technical knowledge necessary to evaluate these claims in detail.
I have evaluated this model in detail and found it substantially incorrect and in fact brazenly naively overconfident.
. . .
Every one of his key assumptions is mostly wrong, as I and others predicted well in advance.
. . .
EY is just completely out of his depth here: he doesn’t seem to understand how the Landauer limit actually works, doesn’t seem to understand that synapses are analog MACs which minimally require OOMs more energy than simple binary switches, doesn’t seem to have a good model of the interconnect requirements, etc.
I am also completely out of my depth here. Not only do I not understand how the Landauer limit works, I don’t even know what it is. But it’s worth noting that a guy who seems to know what he’s talking about thinks that many parts of Eliezer’s model are systematically overconfident, based on relatively egregious error.
Eliezer made many, many more incorrect predictions—let me just run through the list.
In 2001, and possibly later, Eliezer predicted that his team would build superintelligence probably between 2008-2010.
“In the first half of the 2000s, he produced a fair amount of technical and conceptual work related to this goal. It hasn’t ultimately had much clear usefulness for AI development, and, partly on this basis, my impression is that it has not held up well—but that he was very confident in the value of this work at the time.”
Eliezer predicted that AI would quickly go from 0 to 100—that potentially over the course of a day, a single team would develop superintelligence. We don’t yet definitively know that that’s false but it almost certainly is.
There are other issues that are more debatable that Garfinkel highlights, that are probably instances of Eliezer’s errors. For most of those though, I don’t know enough to confidently evaluate them. But the worst part is that he has never acknowledged his mixed forecasting track record, and in fact, frequently acts as though he has a very good forecasting track record. This despite the fact that he often makes relatively nebulous predictions without giving credences, and then just gestures in the direction of having been mostly right about things when pressed about this. For example, he’ll claim that he came out better than Robin Hanson in the AI risk debate they had. Claiming that you were more right than someone, when you had wildly diverging models on a range of topics, is not a precise forecast (and in Eliezer’s case, is quite debatable). As Jotto999 notes:
In other domains, where we have more practice detecting punditry tactics, we would dismiss such an uninformative “track record”. We’re used to hearing Tetlock talk about ambiguity in political statements. We’re used to hearing about a financial pundit like Jim Cramer underperforming the market. But the domain is novel in AI timelines.
Even defenders of Eliezer agree that he’s wildly overconfident. Brian Tomasik, for example, says:
Really smart guy. His writings are “an acquired taste” as one of my friends put it, but I love his writing style, both for fiction and nonfiction. He’s one of the clearest and most enjoyable writers I’ve ever encountered.
My main high-level complaint is that Eliezer is overconfident about many of his beliefs and doesn’t give enough credence to other smart people. But as long as you take him with some salt, it’s fine.
Eliezer is in the top 10 list for people who have changed the way I see the universe.
Scott Alexander, in a piece defending Eliezer, says:
This is not to say that Eliezer – or anyone on Less Wrong – or anyone in the world – is never wrong or never overconfident. I happen to find Eliezer overconfident as heck a lot of the time.
The First Critical Error: Zombies
The zombie argument is an argument for non-physicalism. It’s hard to give a precise definition of non-physicalism, but the basic idea is that consciousness is non-physical in the sense that it is not reducible to the behavior of fundamental particles. Once you know the way atoms work, you can predict all the facts about chairs, tables, iron, sofas, and plants. Non-physicalists claim that consciousness is not explainable in that traditional way. The consciousness facts are fundamental—just as there are fundamental laws about the ways that particles behave, so too are there fundamental laws governing how subjective experience arises in response to certain physical arrangements.
Let’s illustrate how a physicalist model of reality would work. Note, this is going to be a very simplistic and deeply implausible physicalist model; the idea is just to communicate the basic concept. Suppose that there are a bunch of blocks that move right every second. Assume these blocks are constantly conscious and consciously think “we want to move right.” A physicalist about this reality would think that to fully specify its goings-on, one would have to say the following:
Every second, every block moves right.
A non-physicalist, in contrast, might think one of the following two sets of rules specifies reality (the bolded thing is the name of the view); a toy sketch of all three rule sets follows below:
Epiphenomenalism
Every second, every block moves right.
Every second, every block thinks “I’d like to move right.”
Interactionism
Every second, every block thinks “I’d like to move right.”
Every time a block thinks “I’d like to move right,” it moves right.
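To make the contrast concrete, here is a minimal toy sketch (my own illustration, not anything from Eliezer or the philosophers discussed). The point is just that the physical trajectory of the blocks is identical under all three rule sets; they differ only in whether the thought is listed as a separate fact and whether it does any causal work.

```python
# Toy sketch of the block world. In all three specifications the physical
# trajectory is identical; they differ only in whether the thought appears
# as a separate rule and whether it does any causal work.

def physicalist_step(block):
    block["x"] += 1                                   # the only rule: blocks move right

def epiphenomenalist_step(block):
    block["x"] += 1                                   # physics is causally closed...
    block["thought"] = "I'd like to move right."      # ...the thought just happens alongside

def interactionist_step(block):
    block["thought"] = "I'd like to move right."
    if block["thought"] == "I'd like to move right.":
        block["x"] += 1                               # here the thought is what causes the motion

for step in (physicalist_step, epiphenomenalist_step, interactionist_step):
    block = {"x": 0, "thought": None}
    for _ in range(3):
        step(block)
    print(step.__name__, block["x"])                  # every world ends with x == 3
```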
The physical facts are facts about the way that matter behaves. Physicalists think once you’ve specified the way that matter behaves, that is sufficient to explain consciousness. Consciousness, just like tables and chairs, can be fully explained in terms of the behavior of physical things.
Non-physicalists think that the physicalists are wrong about this. Consciousness is its own separate thing that is not explainable just in terms of the way matter behaves. There are more niche views, like idealism and panpsychism, which say respectively that consciousness is the only thing that exists or that it is fundamental to all particles; we don’t need to go into them, so let’s set them aside. The main non-physicalist view is called dualism, according to which consciousness is non-physical and there are psychophysical laws that result in consciousness when there are particular physical arrangements.
There are broadly two kinds of dualism: epiphenomenalism and interactionism. Interactionism says that consciousness is causally efficacious: the psychophysical laws say that particular physical arrangements give rise to particular mental states, and that those mental states in turn cause physical things. This can be seen in the block case—the psychophysical laws mean that the blocks give rise to particular conscious states, and those conscious states cause physical things. Epiphenomenalism says the opposite—consciousness causes nothing. It’s an acausal epiphenomenon—the psychophysical laws go only one way. When there is a certain physical state, consciousness arises, but consciousness doesn’t cause anything further.
The zombie argument is an argument for non-physicalism about consciousness. It doesn’t argue for either an epiphenomenalist or interactionist account. Instead, it just argues against physicalism. The basic idea is as follows: imagine any physical arrangement that contains consciousness, for example, the actual world. Surely, we could imagine a world that is physically identical—where all the atoms, quarks, gluons, and such move the same way—that doesn’t have consciousness. You could imagine an alternative version of me that is the same down to the atom but has no conscious experience.
Why think such beings are possible? They sure seem possible. I can quite vividly imagine a version of me that continues through its daily goings-on but that lacks consciousness. It’s very plausible that if something is impossible, there should be some reason that it is impossible—there shouldn’t just be brute impossibilities. The reason that married bachelors are impossible is that they require a contradiction—you can’t be both married and unmarried at the same time. But spelling out a contradiction in the zombie scenario has proved elusive.
I find the zombie argument quite convincing. But there are many smart people who disagree with it who are not off their rocker. Eliezer, however, has views on the zombie argument that demonstrate a basic misunderstanding of it—the type that would be cleared up in an elementary philosophy of mind class. In fact, Eliezer’s position on zombies is utterly bizarre; when describing the motivation for zombies, he writes what amounts to amusing fiction, demonstrating that he has no idea what actually motivates belief in zombies. It would be like a Christian writer spending a thousand words purporting to steelman the problem of evil, only to summarize it as “atheists are angry at god because he creates things that they don’t like.”
What Eliezer thinks the zombie argument is (and what it is not)
Eliezer seems to think the zombie argument is roughly the following:
It seems like if you got rid of the world’s consciousness nothing would change because consciousness doesn’t do anything.
Therefore, consciousness doesn’t do anything.
Therefore it’s non-physical.
Eliezer then goes on an extended attack against premise 1. He argues that if it were true that consciousness does something, then you can’t just drain consciousness from the world and not change anything. So the argument for zombies hinges crucially on the assumption that consciousness doesn’t do anything. But he goes on to argue that consciousness does do something. If it didn’t do anything, what are the odds that when we talked about consciousness, our descriptions would match up with our conscious states? This would be a monumental coincidence, like it being the case that there are space aliens who work exactly the way you describe them to work, but your talk is causally unrelated to them—you’re just guessing and they happen to be exactly what you guess. It would be like saying “I believe there is a bridge in San Francisco with such and such dimensions, but the bridge existing has nothing to do with my talk about the bridge.” Eliezer says:
Your “zombie”, in the philosophical usage of the term, is putatively a being that is exactly like you in every respect—identical behavior, identical speech, identical brain; every atom and quark in exactly the same position, moving according to the same causal laws of motion—except that your zombie is not conscious.
It is furthermore claimed that if zombies are “possible” (a term over which battles are still being fought), then, purely from our knowledge of this “possibility”, we can deduce a priori that consciousness is extra-physical, in a sense to be described below; the standard term for this position is “epiphenomenalism”.
(For those unfamiliar with zombies, I emphasize that this is not a strawman. See, for example, the SEP entry on Zombies. The “possibility” of zombies is accepted by a substantial fraction, possibly a majority, of academic philosophers of consciousness.)
Eliezer goes out of his way to emphasize that this is not a strawman. Unfortunately, it is a strawman. Not only that, the very source Eliezer links to in order to show how unstrawmanny it is shows that it is a strawman. Eliezer claims that the believers in zombies think consciousness is causally inefficacious and are called epiphenomenalists. But the SEP page he links to says:
True, the friends of zombies do not seem compelled to be epiphenomenalists or parallelists about the actual world. They may be interactionists, holding that our world is not physically closed, and that as a matter of actual fact nonphysical properties do have physical effects.
In fact, David Chalmers, perhaps the world’s leading philosopher of mind, says the same thing when leaving a comment below Eliezer’s post:
Someone e-mailed me a pointer to these discussions. I’m in the middle of four weeks on the road at conferences, so just a quick comment. It seems to me that although you present your arguments as arguments against the thesis (Z) that zombies are logically possible, they’re really arguments against the thesis (E) that consciousness plays no causal role. Of course thesis E, epiphenomenalism, is a much easier target. This would be a legitimate strategy if thesis Z entails thesis E, as you appear to assume, but this is incorrect. I endorse Z, but I don’t endorse E: see my discussion in “Consciousness and its Place in Nature”, especially the discussion of interactionism (type-D dualism) and Russellian monism (type-F monism). I think that the correct conclusion of zombie-style arguments is the disjunction of the type-D, type-E, and type-F views, and I certainly don’t favor the type-E view (epiphenomenalism) over the others. Unlike you, I don’t think there are any watertight arguments against it, but if you’re right that there are, then that just means that the conclusion of the argument should be narrowed to the other two views. Of course there’s a lot more to be said about these issues, and the project of finding good arguments against Z is a worthwhile one, but I think that such an argument requires more than you’ve given us here.
The zombie argument is an argument for any kind of non-physicalism. Eliezer’s response is to argue that one particular kind of non-physicalism is false. That’s not an adequate response, or a response at all. If I argue “argument P means we have to accept view D, E, F, or I,” and the response is “but view E has some problems,” that just means we should adopt view D, F, or I.
But okay, what’s the error here? How does Eliezer’s version of the zombie argument differ from the real version? The crucial error is in his construction of premise 1. Eliezer assumes that, when talking about zombies, we are imagining just subtracting consciousness. He points out (rightly) that if consciousness is causally efficacious then if you only subtract consciousness, you wouldn’t have a physically identical world.
But the zombie argument isn’t about what would actually happen in our world if you just eliminated the consciousness. It’s about a physically identical world to ours lacking consciousness. Imagine you think that consciousness causes atoms 1, 2, and 3 to each move. Well then the zombie world would also involve them moving in the same physical way as they do when consciousness moves them. So it eliminates the experience, but it keeps a world that is physically identical.
This might sound pretty abstract. Let’s make it clearer. Imagine there’s a spirit called Casper. Casper does not have a physical body, does not emit light, and is physically undetectable. However, Casper does have conscious experience and has the ability to affect the world. Every thousand years, Casper can think “I really wish this planet would disappear,” and the planet disappears. Crucially, we could imagine a world physically identical to the world with Casper that just lacks Casper. This wouldn’t be what you would get if you just eliminated Casper—you’d also need to do something else to copy the physical effects that Casper has. So when writing the laws of nature for the world that copies Casper’s world, you’d also need to specify:
Oh, and also make one planet disappear every thousand years, specifically, the same ones Casper would have made disappear.
So the idea is that even if consciousness causes things, we could still imagine a physically identical world to the world where consciousness causes the things. Instead, the things would be caused the same physical way as they are with consciousness, but there would be no consciousness.
Thus, Eliezer’s argument fails completely. It is an argument against epiphenomenalism rather than an argument against zombieism. Eliezer thinks those are the same thing, but that is an error that no publishing academic philosopher could make. It’s really a basic error.
And when this is pointed out, Eliezer begins to squirm. For example, when responding to Chalmers’ comment, he says:
It seems to me that there is a direct, two-way logical entailment between “consciousness is epiphenomenal” and “zombies are logically possible”.
If and only if consciousness is an effect that does not cause further third-party detectable effects, it is possible to describe a “zombie world” that is closed under the causes of third-party detectable effects, but lacks consciousness.
Type-D dualism, or interactionism, or what I’ve called “substance dualism”, makes it impossible—by definition, though I hate to say it—that a zombie world can contain all the causes of a neuron’s firing, but not contain consciousness.
You could, I suppose, separate causes into (arbitrary-seeming) classes of “physical causes” and “extraphysical causes”, but then a world-description that contains only “physical causes” is incompletely specified, which generally is not what people mean by “ideally conceivable”; i.e., the zombies would be writing papers on consciousness for literally no reason, which sounds more like an incomplete imagination than a coherent state of affairs. If you want to give an experimental account of the observed motion of atoms, on Type-D dualism, you must account for all causes whether labeled “physical” or “extraphysical”.
. . .
I understand that you have argued that epiphenomenalism is not equivalent to zombieism, enabling them to be argued separately; but I think this fails. Consciousness can be subtracted from the world without changing anything third-party-observable, if and only if consciousness doesn’t cause any third-party-observable differences. Even if philosophers argue these ideas separately, that does not make them ideally separable; it represents (on my view) a failure to see logical implications.
Think back to the Casper example. Some physical effects in that universe are caused by physical things. Other effects in the universe are caused by non-physical things (just one thing actually, Casper). This is not an arbitrary classification—if you believe that some things are physical and others are non-physical, then the division isn’t arbitrary. On type-D dualism, consciousness causes things, and so the mirror world would just fill in those causal effects. A world description that contains only physical causes would be completely specified physically—it specifies all the behavior of the world, all the physical things, and just leaves out the consciousness.
This is also just such cope! Eliezer spends an entire article saying, without argument, that zombieism = epiphenomenalism, assuming most people will believe him, and then when pressed on it, gives a barely coherent paragraph’s worth of justification for this false claim. It would be like if I argued against deontology by saying it was necessarily Kantian and arguing that Kant was wrong, and then, when called out on that by a leading non-Kantian deontologist, concocted some half-hearted justification for why they’re actually equivalent. That’s not being rational.
Even if we pretend, per impossibile, that Eliezer’s extra paragraph refutes interactionist zombieism, it is not responsible to go through an entire article claiming that the only view that accepts zombies is epiphenomenalism, when that’s totally false, and then only later, when pressed, mention that there’s an argument for why believers in the other views can’t accept zombies.
In which Eliezer, after getting the basic philosophy of mind wrong, calls others stupid for believing in zombies
I think that the last section conclusively establishes that, at the very least, Eliezer’s views on the zombie argument both fail and evince a fundamental misunderstanding of the argument. But the most infuriating thing about this is Eliezer’s repeated insistence that disagreeing with him about zombies is indicative of fundamental stupidity. When explaining why he ignores philosophers because they don’t come to the right conclusions quickly enough, he says:
And if the debate about zombies is still considered open, then I’m sorry, but as Jeffreyssai says: Too slow! It would be one matter if I could just look up the standard answer and find that, lo and behold, it is correct. But philosophy, which hasn’t come to conclusions and moved on from cognitive reductions that I regard as relatively simple, doesn’t seem very likely to build complex correct structures of conclusions.
Sorry—but philosophy, even the better grade of modern analytic philosophy, doesn’t seem to end up commensurate with what I need, except by accident or by extraordinary competence. Parfit comes to mind; and I haven’t read much Dennett, but Dennett does seem to be trying to do the same sort of thing that I try to do; and of course there’s Gary Drescher. If there was a repository of philosophical work along those lines—not concerned with defending basic ideas like anti-zombieism, but with accepting those basic ideas and moving on to challenge more difficult quests of naturalism and cognitive reductionism—then that, I might well be interested in reading.
(Eliezer wouldn’t like Parfit if he read more of him and realized he was a zombie-believing, non-physicalist, non-naturalist moral realist.)
There’s something infuriating about this. Making basic errors that show you don’t have the faintest grasp of what people are arguing about, and then acting like the people who take the time to get PhDs and don’t end up agreeing with your half-baked arguments are just too stupid to be worth listening to, is outrageous. And Eliezer repeatedly harps on the alleged cognitive deficiency of us zombieists—for example:
I also want to emphasize that the “why so confident?” is a straw misquestion from people who can’t otherwise understand why I could be unconfident of many details yet still not take into account the conflicting opinion of people who eg endorse P-zombies.
It also seems to me that this is not all that inaccessible to a reasonable third party, though the sort of person who maintains some doubt about physicalism, or the sort of philosophers who think it’s still respectable academic debate rather than sheer foolishness to argue about the A-Theory vs. B-Theory of time, or the sort of person who can’t follow the argument for why all our remaining uncertainty should be within different many-worlds interpretations rather than slopping over outside, will not be able to access it.
We zombieists are apparently not reasonable third parties, because we can’t grasp Eliezer’s demonstrably fallacious reply to zombies. Being this confident and wrong is a significant mark against one’s reasoning abilities. If you believe something for terrible reasons, don’t update in response to criticisms over the course of decades, and then act like others who don’t agree with you are too stupid to get it, and in fact use that as one of your go-to examples of “things people stupider than I believe that I shouldn’t update on,” that seriously damages your credibility as a thinker. That evinces dramatic overconfidence, sloppiness, and arrogance.
The Second Critical Error: Decision Theory
Eliezer Yudkowsky has a decision theory called functional decision theory. I will preface this by noting that I know much less about decision theory than I do about non-physicalism and zombies. Nevertheless, I know enough to understand why Eliezer’s decision theory fails. In addition, most of this section involves quoting people who are much more informed about decision theory than I am.
There are two dominant decision theories, both of which Eliezer rejects. The first is called causal decision theory. It says that when you have multiple actions that you can take, you should take the action that causes the best things. So, for example, if you have three actions, one of which would cause you to get ten dollars, another of which would cause you to get five dollars, and the last of which would cause you to get nothing, you should take the first action because it causes you to be richest at the end.
The next popular decision theory is called evidential decision theory. It says you should take the action where after you take that action you’ll expect to have the highest payouts. So in the earlier case, it would also suggest taking the first action because after you take that action, you’ll expect to be five dollars richer than if you take the second action, and ten dollars richer than if you take the third action.
These sound similar, so you might wonder where they come apart. Let me preface this by saying that I lean towards causal decision theory. Here are some cases where they give diverging suggestions:
Newcomb’s problem: there is a very good predictor who guessed whether you’d take two boxes or one box. If you take only one box, you’d take box A. If the guesser predicted that you’d take box A, they put a million dollars in box A. If they predicted you’d take both boxes, they put nothing into box A. In either case, they put a thousand dollars into box B.
Evidential decision theory would say that you should take only one box. Why? Those who take one box almost always get a million dollars, while those who take two boxes almost always get a thousand dollars. Causal decision theory would say you should take two boxes. On causal decision theory, it doesn’t matter whether people who make decisions like you usually end up worse off—what matters is that, no matter whether there is a million dollars in box A, two-boxing will cause you to have a free thousand dollars, and that is good! The causal decision theorist would note that if you had a benevolent friend who could peek into the boxes and then give you advice about what to do, they’d be guaranteed to suggest that you take both boxes. I used to have the intuition that you should one-box, but when I considered this upcoming case, I abandoned that intuition.
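To see why the two theories come apart here, here is a minimal sketch of the expected-value arithmetic (my own illustration; the 0.99 predictor accuracy is an assumed number, not taken from any of the sources discussed):

```python
# Minimal sketch of why EDT one-boxes and CDT two-boxes in Newcomb's problem.
# The 0.99 predictor accuracy is an assumed illustrative number.

ACCURACY = 0.99
MILLION, THOUSAND = 1_000_000, 1_000

# EDT: treat your own choice as evidence about what the predictor foresaw.
edt_one_box = ACCURACY * MILLION                    # box A is very likely full
edt_two_box = (1 - ACCURACY) * MILLION + THOUSAND   # box A is very likely empty
print(f"EDT: one-box {edt_one_box:,.0f} vs two-box {edt_two_box:,.0f}")

# CDT: the boxes are already filled; your choice can't change their contents.
# Whatever probability p you assign to box A being full, two-boxing adds $1,000.
for p in (0.0, 0.5, 1.0):
    cdt_one_box = p * MILLION
    cdt_two_box = p * MILLION + THOUSAND
    assert cdt_two_box == cdt_one_box + THOUSAND    # two-boxing dominates
```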
Smoker’s lesion: suppose that smoking doesn’t actually cause adverse health outcomes. However, smokers do have much higher rates of cancer than non-smokers. The reason for that is that many people have a lesion on their lung that both causes them to be much more likely to smoke and more likely to get cancer. So if you know that someone smokes, you should think it much more likely that they’ll get cancer even though smoking doesn’t cause cancer. Suppose that smoking is fun and doesn’t cause any harm. Evidential decision theory would say that you shouldn’t smoke because smoking gives you evidence that you’ll have a shorter life. You should, after smoking, expect your life to be shorter because it gives you evidence that you had a lesion on your lung. In contrast, causal decision theory would instruct you to smoke because it benefits you and doesn’t cause any harm.
Eliezer’s preferred view is called functional decision theory. Here’s my summary (phrased in a maximally Eliezer-like way):
Your brain is a cognitive algorithm that outputs decisions in response to external data. Thus, when you take an action like
take one box
that entails that your mental algorithm outputs
take one box
in Newcomb’s problem. You should take actions such that the algorithm that outputs that decision generates higher expected utility than any other cognitive algorithm.
On Eliezer’s view, you should one-box, but it’s fine to smoke, because whether your brain outputs “smoke” doesn’t affect whether there is a lesion on your lung. Or, as the impressively named Wolfgang Schwarz summarizes:
In FDT, the agent should not consider what would happen if she were to choose A or B. Instead, she ought to consider what would happen if the right choice according to FDT were A or B.
You should one-box in this case because if FDT told agents to one-box, they would get more utility on average than if FDT told agents to two-box. Schwarz argues that the first problem with the view is that it gives various totally insane recommendations. One example is a blackmail case. Suppose that a blackmailer will, every year, blackmail one person. There’s a 1 in a googol chance that he’ll blackmail someone who wouldn’t give in to the blackmail, and a (googol − 1)/googol chance that he’ll blackmail someone who would give in. He has blackmailed you. He threatens that if you don’t give him a dollar, he will share all of your most embarrassing secrets with everyone in the world. Should you give in?
FDT would say no. After all, agents who won’t give in are almost guaranteed to never be blackmailed. But this is totally crazy. You should give up one dollar to prevent all of your worst secrets from being spread to the world. As Schwarz says:
FDT says you should not pay because, if you were the kind of person who doesn’t pay, you likely wouldn’t have been blackmailed. How is that even relevant? You are being blackmailed. Not being blackmailed isn’t on the table. It’s not something you can choose.
Schwarz has another even more convincing counterexample:
Moreover, FDT does not in fact consider only consequences of the agent’s own dispositions. The supposition that is used to evaluate acts is that FDT in general recommends that act, not just that the agent herself is disposed to choose the act. This leads to even stranger results.
Procreation. I wonder whether to procreate. I know for sure that doing so would make my life miserable. But I also have reason to believe that my father faced the exact same choice, and that he followed FDT. If FDT were to recommend not procreating, there’s a significant probability that I wouldn’t exist. I highly value existing (even miserably existing). So it would be better if FDT were to recommend procreating. So FDT says I should procreate. (Note that this (incrementally) confirms the hypothesis that my father used FDT in the same choice situation, for I know that he reached the decision to procreate.)
Schwarz’s entire piece is very worth reading. It exposes various parts of Soares and Yudkowsky’s paper that rest on demonstrable errors. Another good piece that takes down FDT is MacAskill’s post on LessWrong. He starts by laying out the following plausible principle:
Guaranteed Payoffs: In conditions of certainty — that is, when the decision-maker has no uncertainty about what state of nature she is in, and no uncertainty about what the utility payoff of each action is — the decision-maker should choose the action that maximises utility.
This is intuitively very obvious. If you know all the relevant facts about how the world is, and one act gives you more rewards than another act, you should take the first action. But MacAskill shows that FDT violates that constraint over and over again.
Bomb.
You face two open boxes, Left and Right, and you must take one of them. In the Left box, there is a live bomb; taking this box will set off the bomb, setting you ablaze, and you certainly will burn slowly to death. The Right box is empty, but you have to pay $100 in order to be able to take it.
A long-dead predictor predicted whether you would choose Left or Right, by running a simulation of you and seeing what that simulation did. If the predictor predicted that you would choose Right, then she put a bomb in Left. If the predictor predicted that you would choose Left, then she did not put a bomb in Left, and the box is empty.
The predictor has a failure rate of only 1 in a trillion trillion. Helpfully, she left a note, explaining that she predicted that you would take Right, and therefore she put the bomb in Left.
You are the only person left in the universe. You have a happy life, but you know that you will never meet another agent again, nor face another situation where any of your actions will have been predicted by another agent. What box should you choose?
The right action, according to FDT, is to take Left, in the full knowledge that as a result you will slowly burn to death. Why? Because, using Y&S’s counterfactuals, if your algorithm were to output ‘Left’, then it would also have outputted ‘Left’ when the predictor made the simulation of you, and there would be no bomb in the box, and you could save yourself $100 by taking Left. In contrast, the right action on CDT or EDT is to take Right.
The recommendation is implausible enough. But if we stipulate that in this decision-situation the decision-maker is certain in the outcome that her actions would bring about, we see that FDT violates Guaranteed Payoffs.
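Here is a minimal sketch of the arithmetic behind that verdict (my own illustration; assigning a finite dollar disutility to burning to death is an assumption made purely to show the comparison):

```python
# Minimal sketch of the Bomb case. FDT evaluates the policy as if it also settled
# what the long-dead predictor's simulation did; Guaranteed Payoffs evaluates the
# situation you actually know you are in. The dollar value on death is assumed.

ERROR_RATE = 1e-24        # predictor fails 1 in a trillion trillion times
DEATH_COST = 1e12         # assumed finite disutility of burning to death, in dollars
RIGHT_FEE = 100

# FDT-style policy evaluation: if your algorithm outputs Left, the simulation
# output Left too, so the bomb is in Left only if the predictor erred.
fdt_left = ERROR_RATE * DEATH_COST     # roughly $1e-12 of expected loss
fdt_right = RIGHT_FEE                  # $100 for sure
print("FDT prefers Left:", fdt_left < fdt_right)

# Guaranteed Payoffs: you have read the note, so you know the bomb is in Left.
known_left = DEATH_COST                # certain, slow, fiery death
known_right = RIGHT_FEE                # lose $100
print("Given what you know, Right is better:", known_right < known_left)
```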
You can read MacAskill’s full post to find even more objections. He shows that Yudkowsky’s view is wildly indeterminate, incapable of telling you what to do, and also involves a broad kind of hypersensitivity, where how one defines “running the same algorithm” becomes hugely relevant and determines very significant choices in seemingly arbitrary ways. The basic point is that Yudkowsky’s decision theory is totally bankrupt and implausible, in ways that are evident to those who know about decision theory. It is much worse than either evidential or causal decision theory.
The Third Critical Error: Animal Consciousness
(This was already covered here—if you’ve read that article, skip this section and Ctrl+F “conclusion.”)
Perhaps the most extreme example of an egregious error backed up by wild overconfidence occurred in this Facebook debate about animal consciousness. Eliezer Yudkowsky expressed his view that pigs and almost all other animals are almost certainly not conscious. Why is this? Well, as he says:
However, my theory of mind also says that the naive theory of mind is very wrong, and suggests that a pig does not have a more-simplified form of tangible experiences. My model says that certain types of reflectivity are critical to being something it is like something to be. The model of a pig as having pain that is like yours, but simpler, is wrong. The pig does have cognitive algorithms similar to the ones that impinge upon your own self-awareness as emotions, but without the reflective self-awareness that creates someone to listen to it.
Okay, so on this view, one needs to have reflective processes in order to be conscious. One’s brain has to model itself to be conscious. This doesn’t sound plausible to me, but perhaps if there’s overwhelming neuroscientific evidence, it’s worth accepting the view. And this view implies that pigs aren’t conscious, so Yudkowsky infers that they are not conscious.
This seems to me to be the wrong approach. It’s actually incredibly difficult to adjudicate between the different theories of consciousness. It makes sense to gather evidence for and against the consciousness of particular creatures, rather than starting with a general theory and using that to solve the problems. If your model says that pigs aren’t conscious, then that seems to be a problem with your model.
Mammals feel pain
I won’t go too in-depth here, but let’s just briefly review the evidence that mammals, at the very least, feel pain. This evidence is sufficiently strong that, as the SEP page on animal consciousness notes, “the position that all mammals are conscious is widely agreed upon among scientists who express views on the distribution of consciousness.” The SEP page references two papers, one by Jaak Panksepp (awesome name!) and the other by Seth, Baars, and Edelman.
Let’s start with the Panksepp paper. They lay out the basic methodology, which involves looking at the parts of the brain that are necessary and sufficient for consciousness. They identify particular brain regions that are active when we’re conscious—and that correlate with particular mental states—and that aren’t active when we’re not conscious. They then look at the brains of other mammals and notice that these features are ubiquitous in mammals, such that all mammals have, in their brains, the things that we know make us conscious. In addition, mammals act physically like we do when we’re in pain—they scream, they cry, their heart rate increases in response to a stressful stimulus, and they make cost-benefit analyses in which they’re willing to risk negative stimuli for greater reward. Sure looks like they’re conscious.
Specifically, they endorse a “psycho-neuro-ethological ‘triangulation’” approach. The paper is filled with big phrases like that. What that means is that they look at various things that happen in the brain when we feel certain emotions. They observe that in humans, those emotions cause certain things—for example, being happy makes us more playful. They then look at mammal brains and see that they have the same basic brain structure, and that this produces the same physical reactions—using the happiness example, this would also make the animals more playful. If they see that animals have the same basic neural structures as we do when we have certain experiences, and that those are associated with the same physical states that occur when humans have those conscious states, they infer that the animals are having similar conscious states. If our brain looks like a duck’s brain when we have some experience, and we act like ducks do when they are in a comparable brain state, we should guess that ducks are having a similar experience. (I know we’re talking about mammals here, but I couldn’t resist the “looks like a duck, talks like a duck” joke.)
If a pig has a brain state that resembles ours when we are happy, tries to get things that make it happy, and produces the same neurological responses that we do when we’re happy, we should infer that pigs are not mindless automatons, but are, in fact, happy.
They then note that animals like drugs. Animals, like us, get addicted to opioids and have similar brain responses when they’re on opioids. As the authors note “Indeed, one can predict drugs that will be addictive in humans quite effectively from animal studies of desire.” If animals like the drugs that make us happy and react in similar ways to us, that gives us good reason to think that they are, in fact, happy.
They then note that the parts of the brain responsible for various human emotions are quite ancient—predating humans—and that mammals have them too. So, if the things that cause emotions are also present in animals, we should guess they’re conscious, especially when their behavior is perfectly consistent with being conscious. In fact, by running electricity through certain brain regions that animals share, we can induce conscious states in people—that shows that it is those brain states that are causing the various mental states.
The authors then run through various other mental states and show that those mental states are similar between humans and animals—animals have similar brain regions which provoke similar physical responses, and we know that in humans, those brain regions cause specific mental states.
Now, maybe there’s some magic of the human brain, such that in animal brains, the brain regions that cause qualia instead cause causally identical stuff but no consciousness. But there’s no good evidence for that, and plenty against. You should not posit special features of certain physical systems, for no reason.
Moving on to the Seth, Baars, and Edelman paper, they note that there are various features of consciousness that differentiate conscious states from other things happening in the brain that don’t involve consciousness. They note:
Consciousness involves widespread, relatively fast, low-amplitude interactions in the thalamocortical core of the brain, driven by current tasks and conditions. Unconscious states are markedly different and much less responsive to sensory input or motor plans.
In other words, there are common patterns among conscious states. We can look at a human brain and see that the things that are associated with consciousness produce different neurological markers from the things that aren’t associated with consciousness. Features associated with consciousness include:
Irregular, low-amplitude brain activity: When we’re awake we have irregular low-amplitude brain activity. When we’re not conscious—e.g. in deep comas or anesthesia-induced unconsciousness—irregular, low-amplitude brain activity isn’t present. Mammal brains possess irregular, low-amplitude brain activity.
Involvement of the thalamocortical system: When you damage the thalamocortical system, that deletes part of one’s consciousness, unlike other systems. Mammals also have a thalamocortical system—just like us.
Widespread brain activity: Consciousness induces widespread brain activity. We don’t have that when things induce us not to be conscious, like being in a coma. Mammals do.
The authors note, from these three facts:
Together, these first three properties indicate that consciousness involves widespread, relatively fast, low-amplitude interactions in the thalamocortical core of the brain, driven by current tasks and conditions. Unconscious states are markedly different and much less responsive to sensory input or endogenous activity. These properties are directly testable and constitute necessary criteria for consciousness in humans. It is striking that these basic features are conserved among mammals, at least for sensory processes. The developed thalamocortical system that underlies human consciousness first arose with early mammals or mammal-like reptiles, more than 100 million years ago.
More evidence from neuroscience for animal consciousness:
Something else about metastability that I don’t really understand is also present in humans and animals.
Consciousness involves binding—bringing lots of different inputs together. In your consciousness, you can see the entire world at once, while thinking about things at the same time. Lots of different types of information are processed simultaneously, in the same way. Some explanations involving neural synchronicity have received some empirical support—and animals also have neural synchronicity, so they would also have the same kind of binding.
We attribute conscious experiences as happening to us. But mammals have a similar sense of self. Mammals, like us, process information relative to themselves—so they see a wall and process it relative to them in space.
Consciousness facilitates learning. Humans learn from conscious experiences. In contrast, we do not learn from things that do not impinge on our consciousness. If someone slaps me whenever I scratch my nose (someone does actually—crazy story), I learn not to scratch my nose. In contrast, if someone does a thing that I don’t consciously perceive when I scratch my nose, I won’t learn from it. But animals seem to learn too, and update in response to stimuli, just like humans do—and humans only learn from stimuli that affect their consciousness. In fact, even fish learn.
So there’s a veritable wealth of evidence that at least mammals are conscious. The evidence is less strong for organisms that are less intelligent and more distant from us evolutionarily, but it remains relatively strong for at least many fish. Overturning this abundance of evidence, which has been enough to convince the substantial majority of consciousness researchers, requires a lot of evidence. Does Yudkowsky have it?
Yudkowsky’s view is crazy, and is decisively refuted over and over again
No. No he does not. In fact, as far as I can tell, throughout the entire protracted Facebook exchange, he never adduced a single piece of evidence for his conclusion. The closest that he provides to an argument is the following:
I consider myself a specialist on reflectivity and on the dissolution of certain types of confusion. I have no compunction about disagreeing with other alleged specialists on authority; any reasonable disagreement on the details will be evaluated as an object-level argument. From my perspective, I’m not seeing any, “No, this is a non-mysterious theory of qualia that says pigs are sentient…” and a lot of “How do you know it doesn’t…?” to which the only answer I can give is, “I may not be certain, but I’m not going to update my remaining ignorance on your claim to be even more ignorant, because you haven’t yet named a new possibility I haven’t considered, nor pointed out what I consider to be a new problem with the best interim theory, so you’re not giving me a new reason to further spread probability density.”
What??? The suggestion seems to be that there is no other good theory of consciousness that implies that animals are conscious. To which I’d reply:
We don’t have any good theory about consciousness yet—the data is just too underdetermined. Just as you can know that apples fall when you drop them before you have a comprehensive theory of gravity, so too can you know some things about consciousness, even absent a comprehensive theory.
There are various theories that predict that animals are conscious. For example, integrated information theory, McFadden’s CEMI field theory, various higher-order theories, and the global workspace model all probably imply that animals are conscious. Eliezer has no argument to prefer his view to these others.
Take the integrated information theory, for example. I don’t think it’s a great view. But at least it has something going for it. It has made a series of accurate predictions about the neural correlates of consciousness. Same with McFadden’s theory. It seems Yudkowsky’s theory has literally nothing going for it, beyond it sounding to Eliezer like a good solution. There is no empirical evidence for it, and, as we’ll see, it produces crazy, implausible implications. David Pearce has a nice comment about some of those implications:
Some errors are potentially ethically catastrophic. This is one of them. Many of our most intensely conscious experiences occur when meta-cognition or reflective self-awareness fails. Thus in orgasm, for instance, much of the neocortex effectively shuts down. Or compare a mounting sense of panic. As an intense feeling of panic becomes uncontrollable, are we to theorise that the experience somehow ceases to be unpleasant as the capacity for reflective self-awareness is lost? “Blind” panic induced by e.g. a sense of suffocation, or fleeing a fire in a crowded cinema (etc), is one of the most unpleasant experiences anyone can undergo, regardless of race or species. Also, compare microelectrode neural studies of awake subjects probing different brain regions; stimulating various regions of the “primitive” limbic system elicits the most intense experiences. And compare dreams – not least, nightmares – many of which are emotionally intense and characterised precisely by the lack of reflectivity or critical meta-cognitive capacity that we enjoy in waking life.
Yudkowsky’s theory of consciousness would predict that during especially intense experiences, where we’re not reflecting, we’re either not conscious or less conscious. So when people orgasm, they’re not conscious. That’s very implausible. Or, when a person is in unbelievable panic, on this view, they become non-conscious or less conscious. Pearce further notes:
Children with autism have profound deficits of self-modelling as well as social cognition compared to neurotypical folk. So are profoundly autistic humans less intensely conscious than hyper-social people? In extreme cases, do the severely autistic lack consciousness altogether, as Eliezer’s conjecture would suggest? Perhaps compare the accumulating evidence for Henry Markram’s “Intense World” theory of autism.
Francisco Boni Neto adds:
many of our most intensely conscious experiences occur when meta-cognition or reflective self-awareness fails. Super vivid, hyper conscious experiences, phenomenic rich and deep experiences like lucid dreaming and ‘out-of-body’ experiences happens when higher structures responsible for top-bottom processing are suppressed. They lack a realistic conviction, specially when you wake up, but they do feel intense and raw along the pain-pleasure axis.
Eliezer just bites the bullet:
I’m not totally sure people in sufficiently unreflective flow-like states are conscious, and I give serious consideration to the proposition that I am reflective enough for consciousness only during the moments I happen to wonder whether I am conscious. This is not where most of my probability mass lies, but it’s on the table.
So when confronted with tons of neurological evidence that shutting down higher processing results in more intense conscious experiences, Eliezer just says that when we think that we have more intense experiences, we’re actually zombies or something? That’s totally crazy. It’s sufficiently crazy that I think I might be misunderstanding him. When you find out that your view says that people are barely conscious or non-conscious when they orgasm or that some very autistic people aren’t conscious, it makes sense to give up the damn theory!
And this isn’t the only bullet Eliezer bites. He admits, “It would not surprise me very much to learn that average children develop inner listeners at age six.” I have memories from before age 6—these memories would have to have been before I was conscious, on this view.
Rob Wiblin makes a good point:
[Eliezer], it’s possible that what you are referring to as an ‘inner listener’ is necessary for subjective experience, and that this happened to be added by evolution just before the human line. It’s also possible that consciousness is primitive and everything is conscious to some extent. But why have the prior that almost all non-human animals are not conscious and lack those parts until someone brings you evidence to the contrary (i.e. “What I need to hear to be persuaded is,”)? That just cannot be rational.
You should simply say that you are a) uncertain what causes consciousness, because really nobody knows yet, and b) you don’t know if e.g. pigs have the things that are proposed as being necessary for consciousness, because you haven’t really looked into it.
I agree with Rob. We should be pretty uncertain. My credences are maybe the following:
92% that at least almost all mammals are conscious.
80% that almost all reptiles are conscious.
60% that fish are mostly conscious.
30% that insects are conscious.
On these numbers, it’s roughly as likely that reptiles aren’t conscious as that insects are. Because consciousness is private—you only ever know your own—we shouldn’t be very confident about any of its features.
Based on these considerations, I conclude that Eliezer’s view is legitimately crazy. There is, quite literally, no good reason to believe it, and lots of evidence against it. Eliezer just dismisses that evidence, for no good reason, bites a million bullets, and acts like that’s the obvious solution.
Absurd overconfidence
The thing that was most infuriating about this exchange was Eliezer’s insistence that those who disagreed with him were stupid, combined with his demonstration that he had no idea what he was talking about. Condescension and error make an unfortunate combination. He says of the position that pigs, for instance, aren’t conscious:
It also seems to me that this is not all that inaccessible to a reasonable third party, though the sort of person who maintains some doubt about physicalism, or the sort of philosophers who think it’s still respectable academic debate rather than sheer foolishness to argue about the A-Theory vs. B-Theory of time, or the sort of person who can’t follow the argument for why all our remaining uncertainty should be within different many-worlds interpretations rather than slopping over outside, will not be able to access it.
Count me in as a person who can’t follow any arguments about quantum physics, much less the arguments for why we should be almost certain of many worlds. But seriously, physicalism? We should have no doubt about physicalism? As I’ve argued before, the case against physicalism is formidable. Eliezer thinks it’s an open-and-shut case, but that’s because he is demonstrably mistaken about the zombie argument against physicalism and the implications of non-physicalism.
And that’s not the only thing Eliezer expresses insane overconfidence about. In response to his position that most animals other than humans aren’t conscious, David Pearce points out that you shouldn’t be very confident in positions that almost all experts disagree with you about, especially when you have a strong personal interest in their view being false. Eliezer replies:
What do they think they know and how do they think they know it? If they’re saying “Here is how we think an inner listener functions, here is how we identified the associated brain functions, and here is how we found it in animals and that showed that it carries out the same functions” I would be quite impressed. What I expect to see is, “We found this area lights up when humans are sad. Look, pigs have it too.” Emotions are just plain simpler than inner listeners. I’d expect to see analogous brain areas in birds.
When I read this, I almost fell out of my chair. Eliezer admits that he has not so much as read the arguments people give for widespread animal consciousness. He is basing his view on a guess of what they say, combined with an implausible physical theory for which he has no evidence. This would be like coming to the conclusion that the earth is 6,000 years old, despite near-ubiquitous expert disagreement, providing no evidence for the view, and then admitting that you haven’t even read the arguments that experts give in the field against your position. This is the gravest of epistemic sins.
Conclusion
This has not been anywhere near exhaustive. I haven’t even started talking about Eliezer’s very implausible views about morality (though I might write about that too—stay tuned), reductionism, modality, or many other topics. Eliezer usually has a lot to say about topics, and it often takes many thousands of words to refute what he’s saying.
I hope this article has shown that Eliezer frequently expresses near certainty on topics that he has a basic ignorance about, an ignorance so profound that he should suspend judgment. Then, infuriatingly, he acts like those who disagree with his errors are morons. He acts like he is a better decision theorist than the professional decision theorists, a better physicist than the physicists, a better animal consciousness researcher than the animal consciousness researchers, and a much better philosopher of mind than the leading philosophers of mind.
My goal in this is not to cause people to stop reading Eliezer. It’s instead to encourage people to refrain from forming views on things he says just from reading him. It’s to encourage people to take his views with many grains of salt. If you’re reading something by Eliezer and it seems too obvious, on a controversial issue, there’s a decent chance you are being duped.
I feel like there are two types of thinkers, the first we might call innovators and the second systematizers. Innovators are the kinds of people who think of wacky, out-of-the-box ideas, but are less likely to be right. They enrich the state of discourse by being clever, creative, and coming up with new ideas, rather than being right about everything. A paradigm example is Robin Hanson—no one feels comfortable just deferring to Robin Hanson across the board, but Robin Hanson has some of the most ingenious ideas.
Systematizers, in contrast, are the kinds of people who reliably generate true beliefs on lots of topics. A good example is Scott Alexander. I didn’t research Ivermectin, but I feel confident that Scott’s post on Ivermectin is at least mostly right.
I think people think of Eliezer as a systematizer. And this is a mistake, because he just makes too many errors. He’s too confident about things he’s totally ignorant about. But he’s still a great innovator. He has lots of interesting, clever ideas that are worth hearing out. In general, however, the fact that Eliezer believes something is not especially probative. Eliezer’s skill lies in good writing and ingenious argumentation, not forming true beliefs.
I appreciate the object-level responses this post made and think it’s good to poke at various things Eliezer has said (and also think Eliezer is wrong about a bunch of stuff, including the animal consciousness example in the post). In contrast, I find the repeated assertions of “gross overconfidence” and associated snarkiness annoying, and in many parts of the post the majority of the text seems to be dedicated to repeated statements of outrage with relatively little substance (Eliezer also does this sometimes, and I also find it somewhat annoying in his case, though I haven’t seen any case where he does it this much).
I spent quite a lot of time thinking about all three of these questions, and I currently think the arguments this post makes seem to misunderstand Eliezer’s arguments for the first two, and also get the wrong conclusions on both of them.
For the third one, I disagree with Eliezer, but also, it’s a random thing that Eliezer has said once on Facebook and Twitter, that he hasn’t argued for. Maybe he has good arguments for it, I don’t know. He never claimed anyone else should be convinced by the things he has written up, and I personally don’t understand consciousness or human values well enough to have much of any confident take here. My current best guess is that Eliezer is wrong here, and I would be interested in seeing him write up his takes, but most of the relevant section seems to boil down to repeatedly asserting that Eliezer has made no arguments for his position, when like, yeah, that’s fine, I don’t see that as a problem. I form most of my beliefs without making my arguments legible to random people on the internet.
Yeah, I can see how that could be annoying. In my defense, however, I am seriously irritated by this and I think there’s nothing wrong with being a bit snarky sometimes. Eliezer seemed to think in this Facebook exchange that his view just falls naturally out of understanding consciousness. But that is a very specific and implausible model.
I would be interested in your actual defense of the first two sections. It seems the OP went to great lengths to explain exactly where Eliezer went wrong, and contrasted Eliezer’s beliefs with citations to actual, respected domain-level experts.
I also do not understand your objection to the term “gross overconfidence”. I think the evidence provided by the OP is completely sufficient to substantiate this claim. In all three cases (and many more I can think of that are not mentioned here), Eliezer has stated things that are probably incorrect, and then dismissively attacked, in an incredibly uncharitable manner, people who believe the opposite claims. “Eliezer is often grossly overconfident” is, in my opinion, a true claim that has been supported with evidence. I do not think charitability requires one to self-censor such a statement.
For the first one, I found Eliezer’s own response reasonably comprehensive.
For the second one, I feel like this topic has been very extensively discussed on the site, and I don’t really want to reiterate all of that discussion. See the FDT tag.
Eliezer’s response is not comprehensive. He responds to two points (a reasonable choice), but he responds badly, first with a strawman, second with an argument that is probably wrong.
The first point he argues is about brain efficiency, and is not even a point made by the OP. The OP was simply citing someone else, to show that “Eliezer is overconfident about my area of expertise” is an extremely common opinion. It feels very weird to attack the OP over citing somebody else’s opinion.
Regardless, Eliezer handles this badly anyway. Eliezer gives a one-paragraph explanation of why brain efficiency is not close to the Landauer limit. Except that if we look at the actual claim that is quoted, Jacob is not saying that the brain is at the limit, only that it’s not six orders of magnitude away from the limit, which was Eliezer’s original claim. So essentially he debunks a strawman position and declares victory. (I do not put any trust in Eliezer’s opinions on neuroscience.)
When it comes to the zombies, I’ll admit to finding his argument fairly hard to follow. The accusation levelled against him, both by the OP and Chalmers, is that he falsely equates debunking epiphenomenalism with debunking the zombie argument as a whole.
Eliezer unambiguously does equate the two things, as proven by the following quote highlighted by the OP:
The following sentence, from the comment, seems (to me) to be a contradiction of his earlier claim.
The most likely explanation, to me, is that Eliezer made a mistake, the OP and Chalmers pointed it out, and then he tried to pretend it didn’t happen. I’m not certain this is what happened (as the zombies stuff is highly confusing), but it’s entirely in line with Eliezer’s behavior over the years.
I think Eliezer has a habit of barging into other people’s domains, making mistakes, and then refusing to be corrected by people who actually know what they are talking about, acting rude and uncharitable in the process.
Imagine someone came up to you on the street and claimed to know better than the experts in quantum physics, and nanoscience, and AI research, and ethics, and philosophy of mind, and decision theory, and economic theory, and nutrition, and animal consciousness, and statistics and philosophy of science, and epistemology and virology and cryonics.
What odds would you place on such a person being overconfident about their own abilities?
This post seems mostly wrong and mostly deceptive. You start with this quote:
This is correctly labelled as being about someone else, but is presented as though it’s making the same accusation, just against a different person. But this is not the accusation you go on to make; you never once accuse him of lying. This sets the tone, and I definitely noticed what you did there.
As for the concrete disagreements you list: I’m quite confident you’re wrong about the bottom line regarding nonphysicalism (though it’s possible his nosology is incorrect, I haven’t looked closely at that). I think prior to encountering Eliezer’s writing, I would have put nonphysicalism in the same bucket as theism (ie, false, for similar reasons), so I don’t think Eliezer is causally upstream of me thinking that. I’m also quite confident that you’re wrong about decision theory, and that Eliezer is largely correct. (I estimate Eliezer is responsible for about 30% of the decision-theory-related content I’ve read). On the third disagreement, regarding animal consciousness, it looks like a values question paired with word games; I’m not sure there’s even a concrete thing (that isn’t a definition) for me to agree or disagree with.
Did you read the next sentence? The next sentence is ” (note, this is not exactly how I feel about Yudkowsky, I don’t think he’s knowingly dishonest, but I just thought it was a good quote and partially represents my attitude towards Yudkowsky).” The reason I included the quote was that it expressed how I feel about Yud minus the lying part—every time I examine one of his claims in detail, it almost always turns out false, often egregiously so.
I don’t think that arguments about whether animals are conscious are value questions. They are factual questions: do animals have experiences? Is there something it’s like to be them?
I would have to agree with the parent, why present your writing in such a way that is almost guaranteed to turn away, or greatly increase the skepticism of, serious readers?
A von-Neumann-like character might have been able to get away with writing in this kind of style, and still present some satisfactory piece, but hardly anyone less competent.
It is some months later so I am writing this with the benefit of hindsight, but it seems almost self-negating.
Especially since a large portion of the argument rests on questions regarding Yudkowsky’s personal writing style, character, personality, worldview, etc., which therefore draws into sharp contrast the same attributes of any writer calling those out.
i.e. even if every claim regarding Yudkowsky’s personal failings turns out to be 100% true, that would still require someone somewhat better in those respects to actually gain the sympathy of the audience.
I didn’t attack his character, I said he was wrong about lots of things.
Did you skim or skip over reading most of the comment?
They are factual questions about high-level concepts (in physicalism, of course), and high-level concepts depend on values—without values, even your experiences in one place are not the same thing as your experiences in another place.
You say
But then you go on to talk about a bunch of philosophy & decision theory questions that no one has actual “expertise” in, except the sort that comes from reading other people talk about the thing. I was hoping Eliezer had said something about say, carpentry that you disagreed with, because then the dispute would be much more obvious and concrete. As it stands I disagree with your reasoning on the sample of questions I scanned and so it seems to me like this is sufficient to explain the dispute.
I can sympathize with the frustration, and I also think the assertion that EY has “No idea what he is talking about” is too strong. He argues his positions publicly such that detailed rebuttals can be made at all, which is a bar the vast majority of intellectuals fail at.
Edit: I reread the FDT rebuttal for the first time since it came out just to check, and I find it as unconvincing now as then. The author doesn’t grasp the central premise of FDT, that you are optimizing not just for your present universe but across all agents (many worlds or not) who implement the same decision function you are implementing.
The fact that someone argues his positions publicly doesn’t make it so that they necessarily have an idea what they’re talking about. Deepak Chopra argues his positions publicly.
and Deepak deserves praise for that even if his positions are wrong.
I agree! Eliezer deserves praise for writing publicly about his ideas. My article never denied that. It merely claimed that he often confidently says things that are totally wrong.
I suppose he gets one cheer for arguing publicly, but for the full three cheers he also needs to listen, and update occasionally. People who disagree with him have a very different view of his rationality than those who don’t.
First, the sentiment
is widely shared. This local version of the Gell-Mann amnesia effect is very pervasive here.
It is also widely acknowledged even by people who respect Eliezer that he is wildly overconfident (and not very charitable to different views).
However, the examples you pick are kind of… weak? If you argue for non-physicalism being a viable model of reality, you are fighting not just Eliezer, but a lot of others who do not suffer from Eliezer’s shortcomings. If you think that two-boxing in the classic Newcomb’s problem has merit, you are definitely not in good company, at least not financially. Even your arguments related to animal consciousness are meh. Pain and suffering are two different things. There are many different definitions of consciousness, and Eliezer’s seems to be the one close to self-awareness (an internal narrator).
There are better arguments. Sean Carroll, a very outspoken Everettian, has described why he picks this model, but also explicitly describes what experiments would change his mind. If you talk to actual professional probabilists, they will explain why there is no contradiction between Bayesian and frequentist methods. They will also carefully and explicitly acknowledge where Eliezer is right and where his dilettantism makes him go off the rails.
Ironically, you fall into the same trap you accuse Eliezer of: being wrong yet dismissive and overconfident. And also kind of rude. I understand the anguish you felt once you escaped the near-hypnotic state that many people experience when reading the sequences. There are better ways to get your point across.
That said, I think your conclusion actually makes a lot of sense. The sequences are really good at systematizing, clarifying and simplifying. There are a number of original ideas, too, but that is not the main appeal. I definitely approve of your caution that
Notably, about three quarters of decision theorists two-box. I wasn’t arguing for non-physicalism so much as arguing that Eliezer’s specific argument against physicalism shows that he doesn’t know what he’s talking about. Pain is a subset of suffering—it’s the physical version of suffering, but the same argument can be made for suffering. I didn’t comment on Everettianism because I don’t know enough (just that I think it’s suspicious that Eliezer is so confident) nor on probability theory. I didn’t claim there was a contradiction between Bayesian and frequentist methods.
I know… and I cannot wrap my head around it. They talk about causality and dominant strategies, and end up assigning non-zero weight to a zero-probability possible world. It’s maddening.
I see. Not very surprising given the pattern. I guess my personal view is that non-physicalism is uninteresting given what we currently know about the world, but I am not a philosopher.
Right, my point is that is where a critique of Eliezer’s writings (and exposing his overconfidence and limited knowledge) would be a lot stronger.
Not directly related, but:
Pain can be fun, if your mindset is right! It can also alleviate suffering, physical and especially emotional. But yes, there is a big overlap, and the central example of pain is something that causes suffering.
Can you please explain the “zero-probability possible world”?
There is no possible world with a perfect predictor where a two-boxer wins without breaking the condition of it being perfect.
But there is no possible world with a perfect predictor, unless it has a perfect track record by chance. More obviously, there is no possible world in which we can deduce, from a finite number of observations, that a predictor is perfect. The Newcomb paradox requires the decider to know, with certainty, that Omega is a perfect predictor. That hypothesis is impossible, and thus inadmissible; so any argument in which something is deduced from that fact is invalid.
The argument goes through on probabilities of each possible world; the limit toward perfection is not singular. Given the 1000:1 reward ratio, for any predictor who is substantially better than chance one ought to one-box to maximize EV. Anyway, this is an old argument where people rarely manage to convince the other side.
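To spell out that arithmetic, here is a minimal sketch of the expected-value comparison, assuming the standard Newcomb payoffs of $1,000 in the transparent box and $1,000,000 in the opaque box (the payoff figures are an illustration, not something specified in this thread):

```python
# Expected value of each action against a predictor who is right with
# probability p, under the standard (assumed) Newcomb payoffs:
# $1,000 in the transparent box, $1,000,000 in the opaque box.

def ev_one_box(p):
    # You get the $1,000,000 only if the predictor correctly foresaw one-boxing.
    return p * 1_000_000

def ev_two_box(p):
    # You always get the $1,000, plus the $1,000,000 if the predictor
    # wrongly expected you to one-box (probability 1 - p).
    return 1_000 + (1 - p) * 1_000_000

# The two are equal at p = 0.5005, so any predictor even slightly better
# than a coin flip makes one-boxing the higher-EV action.
for p in (0.5, 0.5005, 0.6, 0.9, 1.0):
    print(f"p={p}: one-box EV={ev_one_box(p):,.0f}, two-box EV={ev_two_box(p):,.0f}")
```

Whether this expected-value calculation settles what it is rational to do once the boxes are already filled is, of course, exactly what the rest of the exchange disputes.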
Take a possible world in which the predictor is perfect (meaning: they were able to make a prediction, and there was no possible extension of that world’s trajectory in which what I will actually do deviates from what they have predicted). In that world, by definition, I no longer have a choice. By definition I will do what the predictor has predicted. Whatever has caused what I will do lies in the past of the prediction, hence in the past of the current time point. There is no point in asking myself now what I should do as I no longer have causal influence on what I will do. I can simply relax and watch myself doing what I have been caused to do some time before. I can of course ask myself what might have caused my action and try to predict from that what I will do. If I come to believe that it was myself who decided at some earlier point in time what I will do, then I can ask myself what I should have decided at that earlier point in time. If I believe that at that earlier point in time I already knew that the predictor would act in the way it did, and if I believe that I have made the decision rationally, then I should conclude that I have decided to one-box.
The original version of Newcomb’s paradox in Nozick 1969 is not about a perfect predictor, however. It begins with (1) “Suppose a being in whose power to predict your choices you have enormous confidence.… You know that this being has often correctly predicted your choices in the past (and has never, so far as you know, made an incorrect prediction about your choices), and furthermore you know that this being has often correctly predicted the choices of other people, many of whom are similar to you, in the particular situation to be described below”. So the information you are given is explicitly only about things from the past (how could it be otherwise). It goes on to say (2) “You have a choice between two actions”. Information (2) implies that what I will do has not been decided yet and I still have causal influence on what I will do. Hence the information what I will do cannot have been available to the predictor. This implies that the predictor cannot have made a perfect prediction about my behaviour. Indeed, nothing in (1) implies that they have; the information given is not about my future action at all. After I have made my decision, it might turn out, of course, that it happens to coincide with what the predictor has predicted. But that is irrelevant for my choice as it would only imply that the predictor will have been lucky this time. What should I make of information (1)? If I am confident that I still have a choice, that question is of no significance for the decision problem at hand and I should two-box. If I am confident that I don’t have a choice but have decided already, the reasoning of the previous paragraph applies and I should hope to observe that I will one-box.
What if I am unsure whether or not I still have a choice? I might have the impression that I can try to move my muscles this way or that way, without being perfectly confident that they will obey. What action should I then decide to try? I should decide to try two-boxing. Why? Because that decision is the dominant strategy: if it turns out that indeed I can decide my action now, then we’re in a world where the predictor was not perfect but merely lucky and in that world two-boxing is dominant; if it instead turns out that I was not able to override my earlier decision at this point, then we’re in a world where what I try now makes no difference. In either case, trying to two-box is undominated by any other strategy.
Sorry, could not reply due to rate limit.
In reply to your first point, I agree, in a deterministic world with perfect predictors the whole question is moot. I think we agree there.
Also, yes, assuming “you have a choice between two actions”, what you will do has not been decided by you yet. Which is different from “Hence the information what I will do cannot have been available to the predictor.” If the latter statement is correct, then how could it have “often correctly predicted the choices of other people, many of whom are similar to you, in the particular situation”? Presumably some information about your decision-making process is available to the predictor in this particular situation, or else the problem setup would not be possible, would it? If you think that you are a very special case, and other people like you are not really like you, then yes, it makes sense to decide that you can get lucky and outsmart the predictor, precisely because you are special. If you think that you are not special, and other people in your situation thought the same way, two-boxed and lost, then maybe your logic is not airtight and your conclusion to two-box is flawed in some way that you cannot quite put your finger on, but the experimental evidence tells you that it is. I cannot see a third case here, though maybe I am missing something. Either you are like others, and so one-boxing gives you more money than two-boxing, or you are special and not subject to the setup at all, in which case two-boxing is a reasonable approach.
Right, that is, I guess, the third alternative: you are like other people who lost when two-boxing, but they were merely unlucky, the predictor did not have any predictive powers after all. Which is a possibility: maybe you were fooled by a clever con or dumb luck. Maybe you were also fooled by a clever con or dumb luck when the predictor “has never, so far as you know, made an incorrect prediction about your choices”. Maybe this all led to this moment, where you finally get to make a decision, and the right decision is to two-box and not one-box, leaving money on the table.
I guess in a world where your choice is not predetermined and you are certain that the predictor is fooling you or is just lucky, you can rely on using the dominant strategy, which is to two-box.
So, the question is, what kind of a world you think you live in, given Nozick’s setup? The setup does not say it explicitly, so it is up to you to evaluate the probabilities (which also applies to a deterministic world, only your calculation would also be predetermined).
What would a winning agent do? Look at other people like itself who won and take one box, or look at other people ostensibly like itself and who nevertheless lost and two-box still?
I know what kind of an agent I would want to be. I do not know what kind of an agent you are, but my bet is that if you are the two-boxing kind, then you will lose when push comes to shove, like all the other two-boxers before you, as far as we both know.
There are many possible explanations for this data. Let’s say I start my analysis with the model that the predictor is guessing, and my model attaches some prior probability for them guessing right in a single case. I might also have a prior about the likelihood of being lied to about the predictor’s success rate, etc. Now I make the observation that I am being told the predictor was right every single time in a row. Based on this incoming data, I can easily update my beliefs about what happened in the previous prediction exercises: I will conclude that (with some credence) the predictor guessed right in each individual case or that (also with some credence) I am being lied to about their prediction success. This is all very simple Bayesian updating, no problem at all. As long as my prior beliefs assign nonzero credence to the possibility that the predictor guesses right (and I see no reason why that shouldn’t be a possibility), I don’t need to assign any posterior credence to the (physically impossible) assumption that they could actually foretell the actions.
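A toy version of that update, with invented priors purely for illustration (none of these numbers come from the comment itself), might look like this; the point is only that a hypothesis assigned zero prior probability stays at zero no matter how long the track record gets:

```python
# Toy Bayesian update over three hypotheses about the predictor, after
# hearing of n consecutive correct predictions. Priors are invented for
# illustration.

priors = {
    "lucky guesser": 0.5,      # guesses, right with probability 0.5 each time
    "record is a lie": 0.5,    # the reported track record is fabricated
    "genuine foresight": 0.0,  # treated as physically impossible here
}

def likelihood(hypothesis, n):
    """Probability of being told about an n-for-n correct record."""
    if hypothesis == "lucky guesser":
        return 0.5 ** n
    return 1.0  # a fabricated record and true foresight both yield a perfect record

def posterior(n):
    unnormalized = {h: priors[h] * likelihood(h, n) for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}

print(posterior(100))  # essentially all the mass ends up on "record is a lie"
```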
Right! If I understand your point correctly, given a strong enough prior for the predictor being lucky or deceptive, it would take a lot of evidence to change one’s mind, and the evidence would have to be varied. This condition is certainly not satisfied by the original setup. If your extremely confident prior is that foretelling one’s actions is physically impossible, then the lie/luck hypothesis would have to be much more likely than changing your mind about physical impossibility. That makes perfect sense to me.
I guess one would want to simplify the original setup a bit. What if you had full confidence that the predictor is not a trickster? Would you one-box or two-box? To get the physical impossibility out of the way, they do not necessarily have to predict every atom in your body and mind, just observe you (and read your LW posts, maybe) to Sherlock-like make a very accurate conclusion about what you would decide.
Another question: what kind of experiment, in addition to what is in the setup, would change your mind?
Eliezer replied on the EA Forum
I don’t think your arguments support your conclusion. I think the zombies section mostly shows that Eliezer is not good at telling what his interlocutors are trying to communicate, the animal consciousness bit shows that he’s overconfident, but I don’t think you’ve shown animals are conscious, so it doesn’t show he’s frequently confidently egregiously wrong, and your arguments against FDT seem lacking to me, and I’d tentatively say Eliezer is right about that stuff. Or at least, FDT is closer to the best decision theory than CDT or EDT.
I think Eliezer is often wrong, and often overconfident. It would be interesting to see someone try to compile a good-faith track record of his predictions, perhaps separated by domain of subject.
This seems like one among a line of similar posts I’ve seen recently, many of which you’ve linked to in your own post, which try to compile a list of bad things Eliezer thinks and has said that the poster thinks are really terrible, but which seem benign to me. This is my theory of why they are all low-quality, and yet still posted:
Many have an inflated opinion of Eliezer, and when they realize he’s just as epistemically mortal as the rest of us, they feel betrayed, and so overupdate towards thinking he’s less epistemically impressive than he actually is, so some of those people compile lists of grievances they have against him, and post them on LessWrong, and claim this shows Eliezer is confidently egregiously wrong most of the time he talks about anything. In fact, it just shows that the OP has different opinions in some domains than Eliezer does, or that Eliezer’s track-record is not spotless, or that Eliezer is overconfident. All claims that I, and other cynics & the already disillusioned already knew or could have strongly inferred.
Eliezer is actually pretty impressive both in his accomplishments in epistemic rationality, and especially instrumental rationality. But pretty impressive does not mean godlike or perfect. Eliezer does not provide ground-truth information, and often thinking for yourself about his claims will lead you away from his position, not towards it. Maybe this is something he should have stressed more in his Sequences.
I don’t find Eliezer that impressive, for reasons laid out in the article. I argued for animal sentience extensively in the article. Though the main point of the article wasn’t to establish nonphysicalism or animal consciousness but that Eliezer is very irrational on those subjects.
I don’t know if Eliezer is irrational about animal consciousness. There’s a bunch of reasons you can still be deeply skeptical of animal consciousness even if animals have nociceptors (RL agents have nociceptors! They aren’t conscious!), or integrated information theory & global workspace theory probably say animals are ‘conscious’. For example, maybe you think consciousness is a verbal phenomenon, having to do with the ability to construct novel recursive grammars. Or maybe you think it’s something to do with the human capacity to self-reflect, maybe defined as making new mental or physical tools via methods other than brute force or local search.
I don’t think you can show he’s irrational here, because he hasn’t made any arguments to show the rationality or irrationality of. You can maybe say he should be less confident in his claims, or criticize him for not providing his arguments. The former is well known, the latter less useful to me.
I find Eliezer impressive, because he founded the rationality community which IMO is the social movement with by far the best impact-to-community-health ratio ever & has been highly influential to other social movements with similar ratios, knew AI would be a big & dangerous deal before virtually anyone, worked on & popularized that idea, and wrote two books (one nonfiction, and the other fanfiction) which changed many people’s lives & society for the better. This is impressive no matter how you slice it. His effect on the world will clearly be felt for a long time to come, if we don’t all die (possibly because we don’t all die, if alignment goes well and turns out to have been a serious worry, which I am prior to believe). And that effect will be positive almost for sure.
What does this mean?
I believe that the section on decision theory is somewhat misguided in several ways. Specifically, I don’t perceive FDT as a critical error. However, I should note that I’m not an expert on decision theory, so please consider my opinion with a grain of salt.
(I generally agree with the statements “Eliezer is excessively overconfident” and “Eliezer has a poor epistemic track record”. Specifically, I believe that Eliezer holds several incorrect and overconfident beliefs about AI, which, from my perspective, seem like significant mistakes. However, I also believe that Eliezer has a commendable track record of intellectual outputs overall, just not a strong epistemic or predictive one. And, I think that FDT seems like a reasonable intellectual contribution and is perhaps our best guess at what decision theory looks like for optimal agents.)
I won’t spend much time advocating for FDT, but I will address a few specific points.
(I think you flipped the probabilities in the original post. I flipped them to what I think is correct in this block quote.)
I believe that what FDT does here is entirely reasonable. The reason it may seem unreasonable is because we’re assuming extreme levels of confidence. It seems unlikely that you shouldn’t succumb to the blackmail, but all of that improbability resides in the 1/googol probabilities you proposed. This hypothetical also assumes that the FDT reasoner assigns a 100% probability to always following FDT in any counterfactual, an assumption that can probably be relaxed (though this may be challenging due to unresolved issues in decision theory?).
For an intuitive understanding of why this is reasonable, imagine the blackmailer simulates you to understand your behavior, and you’re almost certain they don’t blackmail people who ignore blackmail in the simulation. Then, when you’re blackmailed, your epistemic state should be “Oh, I’m clearly in a simulation. I won’t give in so that my real-world self doesn’t get blackmailed.” This seems intuitively reasonable to me, and it’s worth noting that Causal Decision Theory (CDT) would do the same, provided you don’t have indexical preferences. The difference is that FDT doesn’t differentiate between simulation and other methods of reasoning about your decision algorithm.
In fact, I find it absurd that CDT places significant importance on whether entities reasoning about the CDT reasoner will use simulation or some other reasoning method; intuitively, this seems nonsensical!
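To make that policy-level comparison concrete, here is a minimal toy sketch; the payoff numbers are invented for illustration and are not from the thread:

```python
# Toy model of the simulation-blackmail story: the blackmailer simulates
# your policy and only sends the blackmail if the simulated you would pay.
# Payoff numbers are invented for illustration.

COST_OF_PAYING = 100  # ransom handed over if your policy is to give in

def lifetime_cost(policy):
    """policy is what you do whenever blackmailed, in simulation or for real."""
    if policy == "refuse":
        return 0              # the simulated you refuses, so no blackmail ever arrives
    return COST_OF_PAYING     # the simulated you pays, so you get blackmailed and pay

for policy in ("pay", "refuse"):
    print(policy, lifetime_cost(policy))  # pay -> 100, refuse -> 0
```

Scored this way, the refusing policy comes out ahead; whether scoring whole policies is the right notion of rationality is what the rest of the exchange goes on to dispute.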
I think it’s worth noting that both evidential decision theory (EDT) and causal decision theory (CDT) seem quite implausible to me. Optimal agents following either decision theory would self-modify into something else to perform better in scenarios like Transparent Newcomb’s Problem.
I think decision theories are generally at least counterintuitive, so this isn’t a unique problem with FDT.
Your points, I think, are both addressed by the point MacAskill makes that perhaps in some cases it’s best to be the type of agent that follows functional decision theory. Sometimes rationality will be bad for you—if there’s a demon who tortures all rational people, for example. And as Schwarz points out, in the twin case, you’ll get less utility by following FDT—you don’t always want to be an FDTist.
I find your judgment about the blackmail case crazy! Yes, agents who give in to blackmail do worse on average. Yes, you want to be the kind of agent who never gives in to blackmail. But all of those are consistent with the obvious truth that giving in to blackmail, once you’re in that scenario, makes things worse for you and is clearly irrational.
At some point this gets down to semantics. I think a reasonable question to answer is “what decision rule should be chosen by an engineer who wants to build an agent scoring the most utility across its lifetime?” (quoting from Schwarz). I’m not sure if the answer to this question is well described as rationality, but it seems like a good question to answer to me. (FDT is sort of an attempted answer to this question if you define “decision rule” somewhat narrowly.)
Suppose that I beat up all rational people so that they get less utility. This would not make rationality irrational. It would just mean that the world is bad for the rational. The question you’ve described might be a fine one, but it’s not what philosophers are arguing about in Newcomb’s problem. If Eliezer claims to have revolutionized decision theory, and then doesn’t even know enough about decision theory to know that he is answering a different question from the decision theorists, that is an utter embarrassment that significantly undermines his credibility.
And in that case, Newcomb’s problem becomes trivial. Of course if Newcomb’s problem comes up a lot, you should design agents that one-box—they get more average utility. The question is about what’s rational for the agent to do, not what’s rational for it to commit to, become, or what’s rational for its designers to do.
I can’t seem to find this in the linked blog post. (I see discussion of the twin case, but not a case where you get less utility from precommiting to follow FDT at the start of time.)
What about the simulation case? Do you think CDT with non-indexical preferences is crazy here also?
More generally, do you find the idea of legible precommitment to be crazy?
Sorry, I said twin case, I meant the procreation case!
The simulation case seems relevantly like the normal twin case which I’m not as sure about.
Legible precommitment is not crazy! Sometimes, it is rational to agree to do the irrational thing in some case. If you have the ability to make it so that you won’t later change your mind, you should do that. But once you’re in that situation, it makes sense to defect.
As far as I can tell, the procreation case isn’t defined well enough in Schwarz for me to engage with it. In particular, in what exact way are the decisions of my father and me entangled? (Just saying the father follows FDT isn’t enough.) But, I do think there is going to be a case basically like this where I bite the bullet here. Notably, so does EDT.
Your father followed FDT and had the same reasons to procreate as you. He is relevantly like you.
That would mean that he believed he had a father with the same reasons, who believed he had a father with the same reasons, who believed he had a father with the same reasons...
I.e., this would require an infinite line of forefathers. (Or at least of hypothetical, believed-in forefathers.)
If anywhere there’s a break in the chain — that person would not have FDT reasons to reproduce, so neither would their son, etc.
Which makes it disanalogous from any cases we encounter in real life. And makes me more sympathetic to the FDT reasoning, since it’s a stranger case where I have less strong pre-existing intuitions.
...which makes the Procreation case an unfair problem. It punishes FDT’ers specifically for following FDT. If we’re going to punish decision theories for their identity, no decision theory is safe. It’s pretty wild to me that @WolfgangSchwarz either didn’t notice this or doesn’t think it’s a problem.
A more fair version of Procreation would be what I have called Procreation*, where your father follows the same decision theory as you (be it FDT, CDT or whatever).
Cool, so you maybe agree that CDT agents would want to self modify into something like FDT agents (if they could). Then I suppose we might just disagree on the semantics behind the word rational.
(Note that CDT agents don’t exactly self-modify into FDT agents, just something close.)
For consciousness, I understand your Casper type-D dualism argument as something like this:
The mainstream physics position is that there is a causally-closed “theory of everything”, and we don’t know it exactly yet but it will look more-or-less like some complicated mathematical formula that reduces to quantum field theory (QFT) in one limit, and reduces to general relativity (GR) in a different limit, etc., like probably some future version of string theory or whatever. Let’s call such a law “Law P”.
But that’s just a guess. We don’t know it for sure. Maybe if we did enough experiments with enough accuracy, especially experiments involving people and animals, then we can find places Law P gives wrong predictions. So maybe it will turn out that these “ultimate laws of the universe” involve not only Law P but also Law C (C for Casper or Consciousness) which is some (perhaps very complicated) formula describing how Casper / Consciousness pushes and pulls on particles in the real world, or whatever.
And next, you’re saying that this hypothetical is compatible with zombie-ism because Law C “really” is there because of consciousness, but it’s conceivable for there to be a zombie universe in which Law C is still applicable but those forces are not related to consciousness. (I’m referring to the part where you say: “So the idea is that even if consciousness causes things, we could still imagine a physically identical world to {the world where consciousness causes the things}. Instead, the things would be caused the same physical way as they are with consciousness, but there would be no consciousness.”)
But isn’t that still epiphenomenalism?? Why can’t we say “the real physical laws of the universe are {Law P and Law C}, and consciousness is epiphenomenal upon those physical laws”?
Hmm. One possible response would be: Law C is supposed to be natural-seeming in the consciousness-universe, and unnatural-seeming / convoluted / weird in the zombie-universe. Is that how you’d respond? Or something else? If that’s the response, I think I don’t buy it. I think it’s exactly equally convoluted. If all the complexity of psychology is bundled up inside Law C, then that’s true whether consciousness is involved or not.
~~~
Well anyway, I’m interested in understanding the above line of argument but it’s ultimately not very cruxy for me. My real reason for disbelieving dualism is something like this:
OK, this isn’t really an “argument” as stated, because I’m not explaining why I’m so optimistic about this project. Basically, I see each step as well under way, maybe not with every last detail hammered out, but also with no great mysteries left. (See here for one part of the story.) I have the impression (I forget exactly what he wrote that makes me think this) that Eliezer feels the same way, and that might have been an unspoken assumption when he was writing. (If so, I agree that he should have stated it explicitly, and acknowledged that there’s some scope for reasonable people to disagree with that optimism.)
This (framework of an) argument would be an argument against both Chalmers’s “type-D dualism” and his “type-F monism”, I think. (I’m not 100% sure though, because I’m pretty confused by Chalmers’s discussion of type-F monism.)
It doesn’t work against (sane variants of) type-F monism—it predicts the same things the equations predict.
Which “sane variants” do you have in mind? Can you suggest any (ideally pedagogical) references?
I’m currently skeptical, if we’re working under the assumption that the framework / project I suggested above (i.e., the project where we start with QFT+GR and systematically explain that entire chain of causation of David Chalmers typing up book chapters about consciousness) will be successful and will not involve any earth-shattering surprises.
If that assumption is true, then we have two postulates to work with: (1) microphysics is causally closed, (2) phenomenal consciousness, whatever it is (if it’s anything at all) is the thing that David Chalmers is able to successfully introspect upon when he writes book chapters talking about phenomenal consciousness (per the argument that Eliezer was putting forward).
I acknowledge that type-F monism is compatible with (1), but it seems to me that it fails (2). When future scientists tell the whole story of how QFT+GR leads to David Chalmers writing books about consciousness, I think there will be no room in that story for the “intrinsic phenomenal properties of quarks” (or whatever) to be involved in that story, in a way that would properly connect those intrinsic properties to David’s introspection process.
But again, this is pretty tentative because I remain confused about type-F monism in the first place. :)
Not sure about references, maybe Panpsychism and Panprotopsychism and The Combination Problem for Panpsychism by Chalmers?
The sane variant is cosmopsychism (because real physical objects are not fundamentally separable) with probably only one intrinsic property—existence.
From your linked paper:
The story is about intrinsic properties, because that’s what equations describe—when you describe some physical system, it is implied that the described physical system exists and all its causal influence is because of its existence. And you introspect the existence of the universe itself by cogito ergo sum.
Thanks for the references.
Hmm, I don’t think you’re following me (or I’m not following you).
Let’s say Chalmers introspects for a bit, and then writes the sentence “I am currently experiencing the ineffable redness of red.”, or alternatively that Chalmers writes the sentence “I am not currently experiencing the ineffable redness of red.”. Presumably (since you seem not to be an illusionist), you would endorse the claim that Chalmers chooses one sentence over the other based on its truth value, and that he ascertained that truth value by introspection, and that choice is intimately / causally related to properties of phenomenal consciousness. Right? So then we can ask questions like: What exactly (if anything) was he introspecting upon, and how was he doing so, and how did he interpret the results in order to choose one sentence over the other? If phenomenal consciousness is anything at all, it presumably needs to be the thing that Chalmers is somehow able to query during this introspection process, right? So how does this querying process work, and produce specific correct answers to questions about consciousness?
If your version of cosmopsychism is just the idea of saying “the universe exists, and this existence is an intrinsic property of the equations” or whatever, then I’m open-minded to that. (Related.) But I don’t understand how that has anything to do with the question above. Right? You’re calling it “cosmopsychism” but I don’t see any “psych” in it…
The thing that Chalmers queries is his brain. The phenomenal nature of his brain is that his brain exists. Chalmers can’t query a brain that doesn’t exist. Therefore phenomenal things cause Chalmers to say “I am currently experiencing the ineffable redness of red.”.
Existence has something to do with everything—you can’t introspect or see red if you don’t exist. But yes, the solution to the Hard Problem doesn’t have much to do with human qualia specifically (except maybe in the part where cogito ergo sum is the limit of reflectivity, and awareness in humans has something to do with reflectivity). If you explain consciousness itself using the physical notion of existence, then the redness of red is just a difference of neural processes.
That it doesn’t have any magical “psych” is by design—that’s why it’s not dualism. The relevant phenomenal aspect of existence is that it solves zombies. And I mean, sure, you can avoid using the word “consciousness” and stick only to “existence”. But it connects to your intuitions about consciousness—if you imagined you might lose consciousness if you were disassembled into atoms and reassembled, now you have a direct reason to think there would still be something it is like to be the reassembled you.
I feel like you’re dodging the question here. I can make a list of properties X that Chalmers believes consciousness has, and I can make another list of properties Y that Chalmers believes consciousness doesn’t have. There should be an explanation of why the things on list X are not on Y instead, and vice-versa.
You can’t just say “if the universe didn’t exist, then lists X and Y wouldn’t exist either”. Sure, maybe that’s true, but it doesn’t constitute any progress towards explaining why the X&Y list contents are what they are. Right?
I feel like you’re angling for a position kinda like: “cosmopsychism explains why I have a conscious experience, but explains nothing whatsoever about any of the properties of that conscious experience”. Right? That strikes me as kinda an insane thing to believe…
Like, if I say “this rock was formed by past volcanic activity”, then you can dig a bit deeper and relate specific properties of the rock to known properties of volcanoes. Right? That’s the normal state of affairs.
So if you say “I know the explanation of why there’s conscious experience, but that explanation doesn’t offer even one shred of insight about why conscious experience is related to memory and self-awareness and feelings and first-person perspective etc., as opposed to conscious experience being related to this pile of blankets on my couch” … then I reject that the thing you’re saying is actually the explanation of why there’s conscious experience.
I hope this isn’t coming across as mean. You seem pretty reasonable, maybe you’ll have good responses, that’s why I’m still here chatting :)
Not intentional—I’m just not sure whether you see problems with causal closure or with epistemic usefulness.
I wouldn’t say “any progress”—correct propositions about X and Y are correct even if they may seem trivial. And it sure would be progress for someone who was forced to believe in dualism or worse as an alternative. And, to be clear, consciousness having content is not a problem for causal closure—if some specific universe didn’t exist and some other universe existed instead, X and Y would be different. But yes, it’s a solution that is not especially useful except for the narrow purpose of solving the Hard Problem.
Right. Well, if you stretch the definition of an explanation and properties, there are some vague intuition-like mental processes that I believe become streamlined when you accept cosmopsychism. Like, at the stage of “we have no idea how to solve the Hard Problem but I’m sure physicalism will win somehow” people still manage to hope for some kind of moral realism about consciousness, like there being an objective fact that someone is in pain. But yeah, you may derive all this stuff from other sources too.
But why do you think it’s insane? There are no philosophical problems with the relation of some mental processes to memory. Science will explain it in the future just fine. “Why there’s conscious experience” always was the only mysterious problem about consciousness. And I’m not even saying that the Hard Problem and its solution are interesting, while a practical theory of awareness is boring and useless. It’s just that, as a matter of fact, under some reasonable definitions cosmopsychism solves the Hard Problem—that’s the extent of what I’m arguing.
The point is that cosmopsychism together with ordinary science provides the explanation you want. And no one doubts that science will do its part.
It’s not epiphenomenalism because the law invokes consciousness. On the interactionist account, consciousness causes things rather than just the physical stuff causing things. If you just got rid of consciousness, you’d get a physically different world.
I don’t think that induction on the basis of “science has explained a lot of things, therefore it will explain consciousness” is convincing. For one, up until this point, science has only explained physical behavior, not subjective experience. This was the whole point (see Goff’s book Galileo’s Error). For another, this seems to prove too much—it would seem to suggest that we could discover the correct modal beliefs in a test tube.
First of all, I was making the claim “science will eventually be able to explain the observable external behavior wherein David Chalmers moves his fingers around the keyboard to type up books about consciousness”. I didn’t say anything about “explaining consciousness”, just explaining a particular observable human behavior.
Second of all, I don’t believe that above claim because of induction, i.e. “science can probably eventually explain the observable external behavior of Chalmers writing books about consciousness because hey, scientists are smart, I’m sure they’ll figure it out”. I agree that that’s a pretty weak argument. Rather I believe that claim because I think I already know every step of that explanation, at least in broad outline. (Note that I’m stating this opinion without justifying it.)
OK, but then the thing you’re talking about is not related to p-zombies, right?
I thought the context was: Eliezer presented an argument against zombies, and then you / Chalmers say it’s actually not an argument against zombies but rather an argument against epiphenomenalism, and then you brought up the Casper thing to illustrate how you can have zombies without epiphenomenalism. And I thought that’s what we were talking about. But now you’re saying that, in the Casper thing, getting rid of consciousness changes the world, so I guess it’s not a zombie world?
Maybe I’m confused. Question: if you got rid of consciousness, in this scenario, does zombie-Chalmers still write books about consciousness, or not? (If not, that’s not zombie-Chalmers, right? Or if so, then isn’t it pretty weird that getting rid of consciousness makes a physically different world but not in that way? Of all things, I would think that would be the most obvious way that the world would be physically different!!)
If you only got rid of consciousness, behavior would change.
You might be able to explain Chalmers’ behavior, but that doesn’t capture the subjective experience.
Oh, I see, the word “only” here or “just” in your previous comment were throwing me off. I was talking about the following thing that you wrote:
[single quotes added to fix ambiguous parsing.]
Let’s label these two worlds:
World A (“the world where consciousness causes the things”), and
World B (the world where “the things would be caused the same physical way as they are with consciousness, but there would be no consciousness”).
Your perspective seems to be: “World A is the truth, and World B is a funny thought experiment. This proposal is type-D dualist.”
I am proposing an alternative perspective: “World B is the true causally-closed physical laws of the universe (and by the way, the laws of physics maybe look different from how we normally expect laws of physics to look, but oh well), and World A is a physically equivalent universe but where consciousness exists as an epiphenomenon. This proposal is type-E epiphenomenalist.”
Is there an error in that alternative perspective?
Let’s say I write the sentence: “my wristwatch is black”. And let’s say that sentence is true. And let’s further say it wasn’t just a lucky guess. Under those assumptions, then somewhere in the chain of causation that led to my writing that sentence, you will find an actual watch, and it’s actually black, and photons bounced off of that watch and went into my eye (or someone else’s eye or a camera etc.), thus giving me that information. Agree?
By the same token: Let’s say that Chalmers writes the sentence “I have phenomenal consciousness, and it has thus-and-such properties”. And let’s say that sentence is true. And let’s further say it wasn’t just a lucky guess. Under those assumptions, then somewhere in the chain of causation that led Chalmers to write that sentence, you will find phenomenal consciousness, whatever it is (if anything), with an appropriate place in the story to allow Chalmers to successfully introspect upon it—to allow Chalmers to somehow “query” phenomenal consciousness with his brain and wind up with veridical knowledge about it, analogous to how photons bounce off the watch and carry veridical information about its optical properties into the retina and eventually into long-term memory.
I claim that, if the project I proposed here is successful (i.e. the project to get from QFT+GR to the external behavior of Chalmers writing books), and we combine that with the argument of the previous paragraph (which I understand to be Eliezer’s argument), then we get a rock-solid argument that rules out all zombies, whether type-D, type-E, or type-F. Do you see what I mean?
I felt like I was following the entire comment, until you asserted that it rules out zombies.
This piece should be a test case for LW’s rate-limiting policies. If a post gets −24 karma on 75 votes and lots of engagement, implying something like 40-44% positive reception, should its author be able to reply in the comments and keep posting? My guess is yes in this case, for the sake of promoting healthy disagreement.
Most (but not all) automatic rate limits allow authors to continue to comment on their own posts, since in many such cases it does indeed seem likely that preventing that would be counterproductive.
I suggest maybe re-titling this post to:
“I strongly disagree with Eliezer Yudkowsky about the philosophy of consciousness and decision theory, and so do lots of other academic philosophers”
or maybe:
“Eliezer Yudkowsky is Frequently, Confidently, Egregiously Wrong, About Metaphysics”
or consider:
“Eliezer’s ideas about Zombies, Decision Theory, and Animal Consciousness, seem crazy”
Otherwise it seems pretty misleading / clickbaity (and indeed overconfident) to extrapolate from these beliefs, to other notable beliefs of Eliezer’s—such as cryonics, quantum mechanics, macroeconomics, various political issues, various beliefs about AI of course, etc. Personally, I clicked on this post really expecting to see a bunch of stuff like “in March 2022 Eliezer confidently claimed that the government of Russia would collapse within 90 days, and it did not”, or “Eliezer said for years that X approach to AI couldn’t possibly scale, but then it did”.
Personally, I feel that beliefs within this narrow slice of philosophy topics are unlikely to correlate to being “egregiously wrong” in other fields. (Philosophy is famously hard!! So even though I agree with you that his stance on animal consciousness seems pretty crazy, I don’t really hold this kind of philosophical disagreement against people when they make predictions about, eg, current events.)
Philosophy is pretty much the only subject that I’m very informed about. So as a consequence, I can confidently say Eliezer is egregiously wrong about most of the controversial views I can fact-check him on. That’s . . . worrying.
Some other potentially controversial views that a philosopher might be able to fact-check Eliezer on, based on skimming through an index of the sequences:
Assorted confident statements about the obvious supremacy of Bayesian probability theory and how Frequentists are obviously wrong/crazy/confused/etc. (IMO he’s right about this stuff. But idk if this counts as controversial enough within academia?)
Probably a lot of assorted philosophy-of-science stuff about the nature of evidence, the idea that high-caliber rationality ought to operate “faster than science”, etc. (IMO he’s right about the big picture here, although this topic covers a lot of ground so if you looked closely you could probably find some quibbles.)
The claim / implication that talk of “emergence” or the study of “complexity science” is basically bunk. (Not sure but seems like he’s probably right? Good chance the ultimate resolution would probably be “emergence/complexity is a much less helpful concept than its fans think, but more helpful than zero”.)
A lot of assorted references to cognitive and evolutionary psychology, including probably a number of studies that haven’t replicated—I think Eliezer has expressed regret at some of this and said he would write the sequences differently today. But there are probably a bunch of somewhat-controversial psychology factoids that Eliezer would still confidently stand by. (IMO you could probably nail him on some stuff here.)
Maybe some assorted claims about the nature of evolution? What it’s optimizing for, what it produces (“adaptation-executors, not fitness-maximizers”), where the logic can & can’t be extended (can corporations be said to evolve? EY says no), whether group selection happens in real life (EY says basically never). Not sure if any of these claims are controversial though.
Lots of confident claims about the idea of “intelligence”—that it is a coherent concept, an important trait, etc. (Vs some philosophers who might say there’s no one thing that can be called intelligence, or that the word intelligence has no meaning, or generally make the kinds of arguments parodied in “On the Impossibility of Supersized Machines”. Surely there are still plenty of these philosophers going around today, even though I think they’re very wrong?)
Some pretty pure philosophy about the nature of words/concepts, and “the relationship between cognition and concept formation”. I feel like philosophers have a lot of hot takes about linguistics, and the way we structure concepts inside our minds, and so forth? (IMO you could at least definitely find some quibbles, even if the big picture looks right.)
Eliezer confidently dismissing what he calls a key tenet of “postmodernism” in several places—the idea that different “truths” can be true for different cultures. (IMO he’s right to dismiss this.)
Some pretty confident (all things considered!) claims about moral anti-realism and the proper ethical attitude to take towards life? (I found his writing helpful and interesting but idk if it’s the last word, personally I feel very uncertain about this stuff.)
Eliezer’s confident rejection of religion at many points. (Is it too obvious, in academic circles, that all major religions are false? Or is this still controversial enough, with however many billions of self-identified believers worldwide, that you can get credit for calling it?)
It also feels like some of the more abstract AI alignment stuff (about the fundamental nature of “agents”, what it means to have a “goal” or “values”, etc) might be amenable to philosophical critique.
Maybe you toss out half of those because they aren’t seriously disputed by any legit academics. But, I am pretty sure that at least postmodern philosophers, “complexity scientists”, people with bad takes on philosophy-of-science / philosophy-of-probability, and people who make “On the Impossibility of Supersized Machines”-style arguments about intelligence, are really out there! They at least consider themselves to be legit, even if you and I are skeptical! So I think EY would come across with a pretty good track record of correct philosophy at the end of the day, if you truly took the entire reference class of “controversial philosophical claims” and somehow graded how correct EY was (in practice, since we haven’t yet solved philosophy—how close he is to your own views?), and compared this to how correct the average philosopher is.
His claims about Bayes go far beyond “better than frequentism”. He also claims it can be used as the sole basis of epistemology, and that it is better than “science”. Bayes, of course, is not a one-stop shop for epistemology, because it can’t generate hypotheses or handle paradigm shifts. It’s also far too complex to use in practice for informal decision making. Most “Bayesians” are deceiving themselves about how much they are using it.
Almost his only argument for “science wrong, Bayes right” is the supposedly “slam dunk” nature of MWI—which, oddly, you don’t mention directly.
Talk of emergence without any mechanism of emergence is bunk, but so is talk of reductionism without specific reductive explanations. Which is a live issue, because many rationalists do regard reductionism as necessary and a priori. Since it isn’t, other models and explanations are possible—reduction isn’t necessary, so emergence is possible.
Is that good or bad?
That’s obviously true of a subset of claims, e.g. what counts as money, or how fast you are allowed to drive. It would be false if applied to everything, but it is very difficult to find a postmodernist who says so in so many words.
I have never discerned a single clear theory of ethics or metaethics in Yudkowsky’s writing. The linked article does not make a clear commitment to either realism or anti-realism AFAICS. IMO he has as many as four theories:
1. The Argument Against Realism, maybe.
2. The Three Word Theory (“morality is values”).
3. Coherent Extrapolated Volition.
4. Utilitarianism of Some Variety.
The argument for atheism from Solomonoff induction is bizarre.
SI can only work in an algorithmic universe. Inasmuch as it is considering hypotheses, it is considering which algorithm is actually generating observed phenomena. It can’t consider and reject any non-algorithmic hypothesis, including non-algorithmic (non-Turing-computable) physics. Rationalists believe that SI can resolve theology in the direction of atheism. Most theology regards God as supernatural or non-physical... but it is very doubtful that SI can even consider a supernatural deity. If SI cannot consider the hypothesis of supernaturalism, it cannot reject it. At best, if you allow that it can consider physical hypotheses, it can only consider a preternatural deity, a Ray Harryhausen god—big and impressive, but still material and finite.
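To make the “only algorithmic hypotheses” point concrete, here is a minimal toy sketch (my own illustration, nothing from the Sequences; real Solomonoff induction is uncomputable, and the “programs” and description lengths below are placeholders I invented). A hypothesis only receives prior weight if there is a program implementing it, so a hypothesis with no program never enters the sum at all:

```python
# Toy illustration: every hypothesis Solomonoff-style induction can weigh is a
# program. Hypotheses that are not algorithms never enter the sum.
# (Real SI enumerates all programs for a universal Turing machine and is
# uncomputable; these "programs" and their bit-lengths are made-up placeholders.)

def solomonoff_style_posterior(programs, observations):
    """Weight each program by 2^-length and keep those that reproduce the data."""
    weights = {}
    for name, (length_bits, predict) in programs.items():
        if predict(len(observations)) == observations:
            weights[name] = 2.0 ** -length_bits
    total = sum(weights.values())
    if total == 0:
        return {}
    return {name: w / total for name, w in weights.items()}

programs = {
    # name: (hypothetical description length in bits, predictor function)
    "all_zeros":   (10, lambda n: [0] * n),
    "alternating": (12, lambda n: [i % 2 for i in range(n)]),
    # A "supernatural" hypothesis corresponds to no predictor at all: there is
    # no algorithm to put in this table, so it can never receive any weight.
}

print(solomonoff_style_posterior(programs, observations=[0, 1, 0, 1]))
# -> {'alternating': 1.0}
```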
This is a frequently-made accusation which has very little basis in reality. The world is a big place, so you will be able to find some examples of such people, but central examples of LessWrong readers, rationalists, etc, are not going around claiming that they run their entire lives on explicit Bayes.
Nonetheless, the founder claims they should be.
Pretty sure it’s just false.
First found example: the last post by EY
That’s a story where he thinks he should do a Bayesian analysis, then doesn’t. It’s not a story where no one should do one.
Good point that rationalism is over-emphasizing the importance of Bayes theorem in a pretty ridiculous way, even if most of the individual statements about Bayes theorem are perfectly correct. I feel like if one was trying to evaluate Eliezer or the rationalist community on some kind of overall philosophy scorecard, there would be a lot of situations like this—both “the salience is totally out of whack here even though it’s not technically /wrong/...”, and “this seems like a really important and true sentiment, but it’s not really the kind of thing that’s considered within the purview of academic philosophy...” (Such as the discussion about ethics / morality / value, and many other parts of the Sequences… I think there is basically a lot of helpful stuff in those posts, some of which might be controversial, but it isn’t really an Official Philosophical Debate over stuff like whether anti-realism is true. It’s more like “here’s how I think you should live your life, IF anti-realism is true”.)
Didn’t mention many-worlds because it doesn’t feel like the kind of thing that a philosopher would be fully equipped to adjudicate? I personally don’t feel like I know enough to have opinions on different quantum mechanics interpretations or other issues concerning the overall nature / reality of the universe—I still feel very uncertain and confused about that stuff, even though long ago I was a physics major and hoped to some day learn all about it. Although I guess I am sorta more sympathetic to Many Worlds than some of the alternatives?? Hard to think about, somehow...
Philosophers having hot takes on linguistics and the relationship between words and concepts—not good or bad that they have so many takes, and I’m also not sure if the takes themselves are good or bad. It is just my impression that, unlike some of the stuff above, philosophy seems to have really spent a lot of time debating these issues, and thus it would be ripe for finding well-formed disagreements between EY and various mainstream schools of thought. I do think that maybe philosophers over-index a little on thinking about the nature of words and language (ie that they have “too many takes”), but that doesn’t seem like such a bad thing—I’m glad somebody’s thinking about it, even if it doesn’t strike me as the most important area of inquiry!
Yeah, agreed that that Solomonoff induction argument feels very bizarre! I had never encountered that before. I meant to refer to the many different arguments for atheism sprinkled throughout the Sequences, including many references to the all-time classic idea that our discovery of the principles of evolution and the mechanics of the brain is sufficient to “explain away” the biggest mysteries about the origin of humanity, and should thus sideline the previously-viable hypothesis of religious claims being true. (See here and here.) EY seems to (rightly IMO) consider the falseness of major religious claims to be a “slam dunk”, ie, totally overdetermined to be false—the Sequences are full of funny asides and stories where various religious people are shown to be making very obvious reasoning errors, etc.
I don’t see any issue with the claimed FDT decisions in the blackmail or procreation case, assuming the (weird) preconditions are met. Spelling out the precise weirdness of the preconditions makes the reasonableness more apparent.
In the blackmail case: what kind of evidence, exactly, convinced you that the blackmailer is so absurdly good at predicting the behavior of other agents so reliably? Either:
(a) You’re mistaken about your probability estimate that the blackmailer’s behavior is extremely strongly correlated with your own decision process, in which case whether you give in to the blackmail depends mostly on ordinary, non-decision-theory-related specifics of the real situation.
(b) You’re not mistaken, which implies the blackmailer is some kind of weird omniscient entity which can actually link its own decision process to the decision process in your brain (e.g. via simulating you), in which case whatever (absurdly strong and a priori unlikely) evidence managed to convince you of such an entity’s existence and the truth of the setup should probably also convince you that you are being simulated (or that something even stranger is going on).
And if you as a human actually find yourself in a situation where you think you’re in case (a) or (b), your time is probably best spent doing the difficult (but mostly not decision theory-related) cognitive work of figuring out what is really going on. You could also spend some of your time trying to figure out what formal decision theory to follow, but even if you decide to follow some flavor of FDT or CDT or EDT as you understand it, there’s no guarantee you’re capable of making your actual decision process implement the formal theory you choose faithfully.
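To put a rough number on point (b), here is a back-of-the-envelope Bayesian sketch (the prior and the target credence are numbers I made up purely for illustration) of how much evidence it would take to become confident you really face a near-omniscient blackmailer:

```python
import math

# Toy Bayesian sanity check of point (b): how strong would evidence have to be
# before believing you are really facing a near-omniscient predictor?
# All numbers here are illustrative assumptions, not from the original post.

prior = 1e-9            # assumed prior probability that such predictors exist
target_posterior = 0.99 # assumed credence at which you'd act on the hypothesis

prior_odds = prior / (1 - prior)
target_odds = target_posterior / (1 - target_posterior)
required_likelihood_ratio = target_odds / prior_odds

print(f"Required likelihood ratio: {required_likelihood_ratio:.2e}")
print(f"That is about {math.log2(required_likelihood_ratio):.1f} bits of evidence")
# ~1e11, i.e. roughly 36-37 bits -- and evidence that extreme should also raise
# your credence in nearby hypotheses like "I am being simulated".
```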
In the procreation case, much of the weirdness is introduced by this sentence:
What does it mean to value existing, precisely? You prefer that your long-term existence is logically possible with high probability? You care about increasing your realityfluid across the multiverse? You’re worried you’re currently in a short-lived simulation, and might pop out of existence once you make your decision about whether to procreate or not?
Note that although the example uses suggestive words like “procreate” and “father” to suggest that the agent is a human, the agent and its father are deciding whether to procreate or not by using FDT, and have strange, ill-defined, and probably non-humanlike preferences about existence. If you make the preconditions in the procreation example precise and weird enough, you can make it so that the agent should conclude with high probability that it is actually in some kind of simulation by its ancestor, and if it cares about existence outside of the simulation, then it should probably choose to procreate.
In general, any time you think FDT is giving a weird or “wrong” answer, you’re probably failing to imagine in sufficient detail what it would look and feel like to be in a situation where the preconditions were actually met. For example, any time you think you find yourself in a true Prisoner’s Dilemma and are considering whether to apply some kind of formal decision theory, functional or otherwise, start by asking yourself a few questions:
What are the chances that your opponent is actually the equivalent of a rock with “Cooperate” or “Defect” written on it?
What are the chances that you are functionally the equivalent of a rock with “Cooperate” or “Defect” written on it? (In actual fact, and from your opponent’s perspective.)
What are the chances that either you or your opponent are functionally the equivalent of RandomBot, either in actual reality or from each other’s perspectives?
If any of these probability estimates are high, you’re probably in a situation where FDT (or any formal or exotic decision theory) doesn’t actually apply.
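A toy sketch of that point (my own framing, not anything from the post or the FDT literature): when the opponent’s move does not depend on any prediction of your policy, plain best-response reasoning already settles the matter, and the exotic machinery only changes the answer once there is a genuine dependence.

```python
import random

# Toy Prisoner's Dilemma: against "rocks" and RandomBot, whose moves ignore any
# prediction of my policy, ordinary best response already gives the answer, and
# no exotic decision theory changes it. Only a predictor-like opponent does.

PAYOFF = {  # (my move, their move) -> my payoff, standard PD ordering
    ("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1,
}

def best_response(opponent):
    """Pick the move maximizing my expected payoff, averaged over samples."""
    def expected(my_move):
        return sum(PAYOFF[(my_move, opponent(my_move))] for _ in range(1000)) / 1000
    return max(["C", "D"], key=expected)

# Opponents that ignore my policy entirely:
cooperate_rock = lambda my_move: "C"
defect_rock    = lambda my_move: "D"
random_bot     = lambda my_move: random.choice(["C", "D"])

for name, bot in [("CooperateRock", cooperate_rock),
                  ("DefectRock", defect_rock),
                  ("RandomBot", random_bot)]:
    print(name, "->", best_response(bot))   # "D" in every case

# Only an opponent that mirrors a prediction of my move creates the logical
# correlation that FDT-style reasoning cares about:
mirror_bot = lambda my_move: my_move        # a (perfect) predictor of my policy
print("MirrorBot ->", best_response(mirror_bot))  # now "C" wins
```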
Understanding decision theory is hard, and implementing it as a human in realistic situations is even harder. Probably best not to accuse others of being “confidently and egregiously wrong” about things you don’t seem to have a good grasp of yourself.
In the blackmail case, we’re just stipulating that the scenario is as described. It doesn’t matter why it is that way.
In the procreation case, I don’t know why they have to be inhuman. They’re just acting for similar reasons to you.
Two things:
First, you mention Jacob Cannell as an authoritative-sounding critic of Eliezer’s AI futurology. In fact, Jacob’s claims about the brain’s energetic and computational efficiency were based on a paradigm of his own, the “Landauer tile” model.
Second, there’s something missing in your discussion of the anti-zombie argument. The physical facts are not just what happens, but also why it happens—the laws of physics. In your Casper example, when you copy Casper’s world, by saying “Oh, and also...”, you are changing the laws.
This has something to do with the status of interactionism. You say Eliezer only deals with epiphenomenalism, what about interactionism? But interactionist dualism already deviates from standard physics. There are fundamental mental causes in interactionism, but not in standard physics.
On Cannell, as I said, I’m too ignorant to evaluate his claims in detail. My claim is just that there are smart-sounding people who claim Eliezer is naive about AI.
On the zombie argument, the physical facts are not why it happens in the relevant sense. If god causes a couch to disappear in one world, that is physically identical to another world in which Allah caused the couch to disappear, which is physically identical to a third world in which there is a fundamental law that causes it to disappear. Physical identity has to do with the way that the physical stuff composing a world behaves.
That’s true for all fields where there are experts on a subject who use one paradigm and other people who propose a different paradigm.
It tells you little about the merit of alternative paradigms.
Says who? If you divide your ontology however you want, you can have a conceivability argument about the non-physicality of melons. Which, by the way, is addressed in Eliezer’s reply to Chalmers.
In our world my laptop doesn’t fall because there is a table under it. In another world the Flying Spaghetti Monster holds my laptop. And the FSM also sends light into my eyes (the version of me from that other world), so I think there is a table. And the FSM copies all the other causal effects which are caused by the table in our world. This other world is imaginable; therefore, the table is non-physical. What exactly makes this a bad analogy with your line of thought?
I think Eliezer’s writings are exactly what you would expect from someone who is extremely intelligent, combined with factors common in highly intelligent people: distrusting authority (because it tends to be less intelligent than you), and only skimming expert texts (because as a child, for most texts you were exposed to, you either understood them immediately or the texts had issues, so you interpret a text that leaves you confused at first as evidence that the text is wrong), while delving with hyperfocus into texts that are often overlooked, many of which are valuable.
That is, you get a very intelligent, very rational, well articulated, unusual outside view on a cursory perception of a problem. With the unusual bonus that Eliezer really tries to be a good person, too.
This can be immensely valuable. When you are working in a field for a long time, you can get stuck in modes of thinking, and this sort of outside view can help you step back and notice that you are doing something fundamentally wrong.
But such views are almost certainly crucially incomplete, wrong in the technical details, impractical, and sometimes, even often, plain wrong. And sometimes, they are brilliantly right.
Such texts are still very much worth reading, I think—not as your base knowledge (you will get completely lost), but as an addition to a general education. Especially because he writes them in a manner that makes them very pleasant to read; he writes and explains very well. It is rare for someone to include relevant information and lessons in stories that are plain fun to read, and sometimes really inspiring: HPMOR made me weep, and I am rereading it now for the third time, still admiring it.
But you need to take everything Eliezer writes with a grain of salt, and double-check how the experts actually represent themselves. Do not trust him when he represents something as settled. In this sense, it is worth knowing about him as a character—if you just check his arguments but trust his premises to be true as he reports them, you will be misled, and you need to know that the premises may well not be.
Animal consciousness, especially pain, especially in non-human mammals, is indeed well established. Happy to explain more if someone doubts this, this is actually something I am academically qualified for.
And while I do not find philosophical zombies or non-physicalism plausible, Eliezer indeed badly misrepresents that debate in a way that even a first year philosophy student with passing knowledge of the subject would find egregious.
And you didn’t pick up on it here, but the way that Eliezer represents ethics is terrible. Naive utilitarianism is not just not the default assumption in ethics, it is widely considered deeply problematic for good reasons, and telling people they are irrational for doubting it is really problematic.
I’ve also been told by experts in physics that I trust that his quantum take, while not as bad as you’d think, is far from perfect or settled, and loudly and repeatedly by people in AI that the technical aspects of his solutions are utterly impractical. And on a lot of AI safety issues, my impression is that he seems to have settled into a stance that is at least partially emotionally motivated, no longer questioning his own assumptions or framework.
You can tell people that his positions are very controversial, and that there are actually good reasons for that. You can highlight that relying on your intelligence alone, and disregarding the lifetime achievements of other humans as stuff you could probably make up better yourself in an afternoon, is misguided, and will leave you disconnected from the rest of humanity and ignorant of important things. But I think on average, reading some of Eliezer’s stuff will do people good, and he does a lot of things right in ways that deserve to be emulated and lauded.
I still consider him a very, very smart person with relevant ideas who tries really fucking hard to be rational and good and does much better than most, and who is worth listening to and treating with respect. He also strikes me as someone who is doing badly because the world is heading straight towards a future that is extremely high risk in ways he has warned about for a long time, and who hence really does not deserve bashing, but rather kindness. People forget that there is another person behind the screen, a person who is vulnerable, who can be having a terrible day. He’s become somewhat famous, but that doesn’t make him no longer a vulnerable human. I think you could have made these general points, and achieved the aim of teaching him to do better, or getting readers to read him more critically, without attacking him personally to this extent.
This is not my impression. (I am not an expert, though I studied philosophy, and specifically philosophy of mind, as an undergrad.) From what I know and have read of the debate, Eliezer’s depiction seems accurate.
What misrepresentation do you see, specifically?
I totally agree here, FWIW.
I doubt this. Please explain more!
(My best guess is that this disagreement hinges on equivocation between meanings of the word “consciousness”, but if there’s instead some knowledge of which I’m unaware, I’m eager to learn of it.)
I think this comment is entirely right until the very end. I don’t think I really attack him as a person—I don’t say he’s evil or malicious or anything in the vicinity, I just say he’s often wrong. Seems hard to argue that without arguing against his points.
FDT is a valuable idea in that it’s a stepping stone towards / approximation of UDT. Given this, it’s probably a good thing for Eliezer to have written about. Kind of like how Merkle’s Puzzles was an important stepping stone towards RSA, even though there’s no use for it now. You can’t always get a perfect solution the first time when working at the frontier of research. What’s the alternative? You discover something interesting but not quite right, so you don’t publish because you’re worried someone will use your discovery as an example of you being wrong?
Also:
Is this a typo? We desire not to be blackmailed, so we should give in and pay, since according to those odds, people who give in are almost never blackmailed. Therefore FDT would agree that the best policy in such a situation is to give in.
I was kind of hoping you had more mathematical/empirical stuff. As-is, this post seems to mostly be “Eliezer Yudkowsky Is Frequently, Confidently, and Egregiously In Disagreement With My Own Personal Philosophical Opinions”.
(I have myself observed an actual mathematical/empirical Eliezer-error before: He was arguing that since astronomical observations had shown the universe to be either flat or negatively curved, that demonstrated that it must be infinite. The error being that there are flat and negatively curved spaces that are finite due to the fact that they “loop around” in a fashion similar to the maze in Pac-Man. (Another issue is that a flat universe is infinitesimally close to a positively curved one, so that a set of measurements that ruled out a positively curved universe would also rule out a flat one. Except that maybe your prior has a delta spike at zero curvature because simplicity. And then you measure the curvature and it comes out so close to zero with such tight error bars that most of your probability now lives in the delta spike at zero. That’s a thing that could happen.))
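For what it’s worth, that last possibility is easy to illustrate with a toy calculation (a minimal sketch; every number below is an assumption I picked for illustration, not a real cosmological value):

```python
from math import exp, sqrt, pi

# Toy version of the "delta spike at zero curvature" update described above.
# All numbers are illustrative placeholders, not actual cosmology.

p_spike = 0.5       # prior mass on exactly-flat (curvature k = 0)
prior_width = 0.1   # rest of the prior: Gaussian over k with sd = 0.1
measured_k = 0.0005 # hypothetical measurement
sigma = 0.001       # hypothetical measurement error

def normal_pdf(x, mu, sd):
    return exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * sqrt(2 * pi))

# Likelihood of the data if k is exactly 0:
like_spike = normal_pdf(measured_k, 0.0, sigma)

# Likelihood if k is drawn from the smooth prior: crude numerical integration.
grid = [i * 1e-5 for i in range(-30000, 30001)]
like_smooth = sum(
    normal_pdf(k, 0.0, prior_width) * normal_pdf(measured_k, k, sigma) * 1e-5
    for k in grid
)

posterior_spike = (p_spike * like_spike) / (
    p_spike * like_spike + (1 - p_spike) * like_smooth
)
print(f"Posterior probability of an exactly flat universe: {posterior_spike:.3f}")
# With these made-up numbers the spike ends up holding most of the posterior,
# even though the measurement alone can never rule out tiny nonzero curvature.
```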
EDIT: I’ve used “FDT” kind of interchangeably with “TDT” here, because in my way of viewing things, they’re very similar decision theories. But it’s important to note that historically, TDT was proposed first, then UDT, and FDT was published much later, as a kind of cleaned up version of TDT. From my perspective, this is a little confusing, since UDT seems superior to both FDT and TDT, but I guess it’s of non-zero value to go back and clean up your old ideas, even if they’ve been made obsolete. Thanks to Wei Dai for pointing out this issue.
You might be thinking of TDT, which was invented prior to UDT. FDT actually came out after UDT. My understanding is that the OP disagrees with the entire TDT/UDT/FDT line of thinking, since they all one-box in Newcomb’s problem and the OP thinks one should two-box.
I’m not an expert in any of the points you talk about, nevertheless I’ll give my unconfident opinions after a quick read:
Casper
If you add to the physical laws code that says “behave like with Casper”, you have re-implemented Casper with one additional layer of indirection. It is then not fair to say this other world does not contain Casper in an equivalent way.
FDT
A summary of my understanding of the specific argument that follows the quote:
Consider an FDT agent.
Consider an almost infallible predictor of the FDT agent.
The predictor says the FDT agent will behave differently from how an FDT agent would behave.
What I logically conclude from these premises is that, although improbable given only (2), (3) is sufficient information to conclude that the predictor failed. So I expect that if the FDT agent actually had some model of the predictor’s accuracy, instead of a number specified by fiat, it would deduce that the predictor is not accurate, and it could then avoid the bomb, because doing so would not make its decisions inconsistent with those of its accurately simulated copies.
I think the intuitive appeal of the counterexample resides in picking such a “dumb” FDT agent, where your brain, which is not so constrained, immediately sees the smart thing to do and retorts “ahah, how dumb.” I think if you remove this emotional layer, then the FDT behavior looks like the only thing possible: if you choose an algorithm, any hardware assumed to run the algorithm accurately will run the algorithm. If there’s an arbitrarily small possibility of mistake, it’s still convenient to make the algorithm optimal for the case where it’s run correctly.
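For concreteness, here is a tiny policy-level cost comparison in the spirit of that argument (a sketch with placeholder numbers I chose myself; the scenario under discussion specifies its own figures):

```python
# Toy policy-level cost comparison for a bomb-style predictor case.
# The error rate and costs are placeholders chosen for illustration only.

epsilon = 1e-6     # assumed predictor error rate
cost_right = 100   # assumed cost of taking the Right box
cost_burn = 1e9    # assumed disutility of taking Left when the bomb is there

# Evaluate whole policies, the way an FDT-style agent does: the predictor is
# assumed to (almost always) foresee whichever policy you actually run.
expected_cost_policy_left = epsilon * cost_burn   # bomb only on a mispredict
expected_cost_policy_right = cost_right           # you always pay the fee

print("Policy 'take Left': ", expected_cost_policy_left)   # 1000.0
print("Policy 'take Right':", expected_cost_policy_right)  # 100

# With these made-up numbers 'take Right' is the cheaper policy; with a
# sufficiently tiny epsilon (or a smaller burn disutility) 'take Left' wins.
# The comment's point: once you actually see the bomb, much of the work is
# updating your estimate of epsilon itself.
```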
Consciousness
I’ve just skimmed this part, but it seems to me that you provide arguments and evidence about consciousness as wakefulness or similar, while Yudkowsky is talking about the more restricted and elusive concept of self-awareness.
I’ve written down this comment before arriving at the part where you cite Yudkowsky making the same counterargument, so my comment is only based on what evidence you decided to mention explicitly.
Overall impression
Your situation is symmetric: if you find yourself repeatedly being very confident that someone doesn’t know what they are talking about, while that person is a highly regarded intellectual, maybe you are the one who is overconfident and wrong! I consider this a difficult dilemma to be in. Yudkowsky wrote a book about this problem, Inadequate Equilibria, so he’s one step ahead of you on the meta.
//If you add to the physical laws code that says “behave like with Casper”, you have re-implemented Casper with one additional layer of indirection. It is then not fair to say this other world does not contain Casper in an equivalent way.//
No, you haven’t reimplemented Casper, you’ve just copied his physical effects. There is no Casper, and Casper’s consciousness doesn’t exist.
Your description of the FDT stuff isn’t what I argued.
//I’ve just skimmed this part, but it seems to me that you provide arguments and evidence about consciousness as wakefulness or similar, while Yudkowsky is talking about the more restricted and elusive concept of self-awareness. //
Both Yudkowsky and I are talking about having experiences, as he’s been explicit about in various places.
//Your situation is symmetric: if you find yourself repeatedly being very confident about someone not knowing what they are saying, while this person is a highly regarded intellectual, maybe you are overconfident and wrong! I consider this a difficult dilemma to be in. Yudkowsky wrote a book about this problem, Inadequate Equilibria, so it’s one step ahead of you on the meta.//
I don’t talk about the huge range of topics Yudkowsky does. I don’t have super confident views on any topic that is controversial among the experts—but the views of Yudkowsky’s that I’m criticizing aren’t genuinely controversial among experts; they mostly just rest on basic errors.
Upvoted, and I’m sad that this is currently negative (though only −9 with 63 votes is more support than I’d have predicted, though less than I’d wish). I do kind of wish it’d been a sequence of 4 posts (one per topic, and a summary about EY’s overconfidence and wrongness), rather than focused on the person, with the object-level disagreements as evidence.
It’s interesting that all of these topics are ones that should be dissolved rather than answered. Without a well-defined measure of “consciousness” (and a “why do we even care”, for some of the less-interesting measures that get proposed), zombies and animal experience are more motte-and-bailey topics than actual answerable propositions. I find it very easy (and sufficient) to believe that “qualia is what it feels like for THIS algorithm on THIS wetware”, with a high level of agnosticism on what other implementations will be like and whether they have internal reflectable experiences or are “just” extremely complex processing engines.
Decision theory likewise. It’s interesting and important to consider embedded agency, where decisions are not as free and unpredictable as it feels like to humans. We should be able to think about constrained knowledge of our own future actions. But it seems very unlikely that we should encode those edge cases into the fundamental mechanisms of decision analysis. Further, the distinction between exotic decision theory and CDT-with-strategic-precommitment-mechanisms is pretty thin.
Which I guess means I think you’re perhaps Less Wrong than EY on these topics, but both of you are (sometimes) ignoring the ambiguity that makes the questions interesting in the first place, and also makes any actual answer incorrect.
But it obviously isn’t sufficient to do a bunch of things. In the absence of an actual explanation, you aren’t able to resolve issues about AI consciousness and animal suffering. Note that “qualia is what it feels like for THIS algorithm on THIS wetware” is a belief, not an explanation—there’s no how or why to it.
Right. I’m not able to even formulate the problem statement for “issues about AI consciousness and animal suffering” without using undefined/unmeasurable concepts. Nor is anyone else that I’ve seen—they can write a LOT about similar-sounding or possibly-related topics, but never seem to make the tie to what (if anything) matters about it.
I’m slowly coming to the belief/model that human moral philosophy is hopelessly dualist under the covers, and most of the “rationalist” discussion around it are attempts to obfuscate this.
Sadly, I don’t have the time to properly engage with your arguments. However, I have been finding your recent posts to be of unusually high quality in the direction of truth-seeking and explaining clear thinking. Please keep it up and give Scott Alexander some competition.
Please take this compliment. Despite being old enough to be your parent, I think discussions over beer with you would be a delight.
I really appreciate that! Though if you like the things I write, you can find my blog at benthams.substack.com
Omnizoid’s blog is indeed high quality. Good Writing As Hypnosis, for example, is really good. I would love it if Scott Alexander had more competition.
Evil: A Reflection is good too.
I’m upvoting this because the community could use more content challenging commonly held views, and some people do need to treat Eliezer as more fallible than they do.
That said, I find most of your examples unpersuasive. With the exception of some aspects of p-zombies, where you do show that Eliezer has misinterpreted what people are saying when they make this sort of argument, most of your arguments are not compelling arguments at all that Eliezer is wrong, although they do point to his general overconfidence (which seems to be a serious problem).
For what it is worth, one of my very first comments [was objecting to Eliezer’s use of phlogiston as an example of a hypothesis that did not generate predictions](https://www.lesswrong.com/posts/RgkqLqkg8vLhsYpfh/fake-causality?commentId=4Jch5m8wNg8pHrAAF).
Some other discussion of his views on (animal) consciousness here (and in the comments).
I am so disappointed every time I see people using the persuasiveness filter. Persuasiveness is not completely orthogonal to correctness, but it is definitely linearly independent from it.
I never claimed Eliezer says consciousness is nonphysical—I said exactly the opposite.
Overall I’d love EY to focus on his fiction writing. He has an amazing style and way with words, and the “I created a mental model and I want to explore it fully, and if the world doesn’t fit the model that’s the world’s problem” type of thinking is extremely beneficial there. It’s what all famous writers were good at. His works will be amazing cautionary tales on par with 1984 and Brave New World.
Oh, hi, EY, I see you found this :) Single vote with −10 power (2 → −8) is a lot. Wield that power responsibly.
My understanding from Eliezer’s writing is that he’s an illusionist (and/or a higher-order theorist) about consciousness. However, illusionism (and higher-order theories) are compatible with mammals and birds, at least, being conscious. It depends on the specifics.
I’m also an illusionist about consciousness and very sympathetic to the idea that some kinds of higher-order processes are required, but I do think mammals and birds, at least, are very probably conscious, and subject to consciousness illusions. My understanding is that Humphrey (Humphrey, 2022, Humphrey, 2023a, Humphrey, 2023b, Humphrey, 2017, Romeo, 2023, Humphrey, 2006, Humphrey, 2011) and Muehlhauser (2017) (a report for Open Phil, but representing his own views) would say the same. Furthermore, I think the standard interpretation of illusionism doesn’t require consciousness illusions or higher-order processes in conscious subjects at all, and instead a system is conscious if connecting a sufficiently sophisticated introspective system to it the right way would lead to consciousness illusions, and this interpretation would plausibly attribute consciousness more widely, possibly quite widely (Blackmore, 2016 (available submitted draft), Frankish, 2020, Frankish, 2021, Frankish, 2023, Graziano, 2021, Dung, 2022).
If I recall correctly, Eliezer seemed to give substantial weight to relatively sophisticated self- and other-modelling, like cognitive empathy and passing the mirror test. Few animals seem to pass the mirror test, so that would be reason for skepticism.
However, maybe they’re just not smart enough to infer that the reflection is theirs, or they don’t rely enough on sight. Or, they may recognize themselves in other ways or to at least limited degrees. Dogs can remember what actions they’ve spontaneously taken (Fugazza et al., 2020) and recognize their own bodies as obstacles (Lenkei, 2021), and grey wolves show signs of self-recognition via a scent mirror test (Cazzolla Gatti et al., 2021, layman summary in Mates, 2021). Pigeons can discriminate themselves from conspecifics with mirrors, even if they don’t recognize the reflections as themselves (Wittek et al., 2021, Toda and Watanabe, 2008). Mice are subject to the rubber tail illusion and so probably have a sense of body ownership (Wada et al., 2016).
Furthermore, Carey and Fry (1995) show that pigs generalize the discrimination between non-anxiety states and drug-induced anxiety to non-anxiety and anxiety in general, in this case by pressing one lever repeatedly with anxiety, and alternating between two levers without anxiety (the levers gave food rewards, but only if they pressed them according to the condition). Similar experiments were performed on rodents, as discussed in Sánchez-Suárez, 2016, in section 4.d., starting on p.81. Rats generalized from hangover to morphine withdrawal and jetlag, and from high doses of cocaine to movement restriction, from an anxiety-inducing drug to aggressive defeat and predator cues. Of course, anxiety has physical symptoms, so maybe this is what they’re discriminating, not the negative affect.
There are also of course many non-illusionist theories of consciousness that attribute consciousness more widely that are defended (although I’m personally not sympathetic, unless they’re illusionist-compatible), and theory-neutral or theory-light approaches. On theory-neutral and theory-light approaches, see Low, 2012, Sneddon et al., 2014, Le Neindre et al., 2016, Rethink Priorities, 2019, Birch, 2020, Birch et al., 2022, Mason and Lavery, 2022, generally giving more weight to the more recent work.
Thank you! We need fewer “yes men” here and more dissenting voices. The vote counter on this post will be deeply in the negative, but that is expected—many people here are exactly in the period you’ve described as “yourself 2 years ago”.
EY is mostly right when he talks about tools to use (all the “better thinking” rationality and anti-bias methods); EY is mostly wrong when he talks about his deeply rooted beliefs on topics he doesn’t have much experience in. Unfortunately this covers most of the topics he speaks about, and that isn’t clearly seen, due to his vocabulary and the fact that he is a genuinely smart person.
Unfortunately^2, it looks like he failed to self-identify his biggest bias, which I personally prefer to call the “Linus Pauling” effect: when someone is really, really smart (and EY is!), he thinks he’s good at everything (even when he simultaneously acknowledges that he isn’t—probably the update value of this in his NN could really use a bigger weight!) and wants to spread the “wisdom” of everything, without understanding that IQ+rationality is a crappy substitute for experience in the area.