Explanations for Less Wrong articles that you didn’t understand
I’m struggling to understand anything technical on this website. I’ve enjoyed reading the sequences, and they have given me a lot to think about. Still, I’ve read the introduction to Bayes’ theorem multiple times, and I simply can’t grasp it. Even starting at the very beginning of the sequences I quickly get lost, because there are references to programming and cognitive science which I simply do not understand.
Thinking about it, I realized that this might be a common concern. There are probably plenty of people who’ve looked at various more-or-less technical or jargony Less Wrong posts, tried understanding them, and then given up (without posting a comment explaining their confusion).
So I figured that it might be good to have a thread where you can ask for explanations of any Less Wrong post that you didn’t understand and would like to, but don’t want to directly comment on for any reason (e.g. because you’re feeling embarrassed, because the post is too old to attract much traffic, etc.). In the spirit of the various Stupid Questions threads, you’re explicitly encouraged to ask even for the kinds of explanations that you feel you “should” be able to get yourself, or where you feel like you could get it if you just put in the effort (but then never did).
You can ask to have some specific confusing term or analogy explained, or to get the main content of a post briefly summarized in plain English and without jargon, or anything else. (Of course, there are some posts that simply cannot be explained in non-technical terms, such as the ones in the Quantum Mechanics sequence.) And of course, you’re encouraged to provide explanations to others!
Can someone explain to me the significance of problems like Sleeping Beauty? I see a lot of digital ink being spilled over them and I can kind of see how they call into question what we mean by “probability” and “expected utility”, but I can’t quite pin down the thread that connects all of them. Someone will pose a solution to a paradox X, and then someone else replies with a modified version X’ on which the previous solution fails, and I have trouble seeing what exactly it is people are trying to solve.
If you want to build an AI that maximizes utility, and that AI can create copies of itself, and each copy’s existence and state of knowledge can also depend on events happening in the world, then you need a general theory of how to make decisions in such situations. In the limiting case when there’s no copying at all, the solution is standard Bayesian rationality and expected utility maximization, but that falls apart when you introduce copying. Basically we need a theory that looks as nice as Bayesian rationality, is reflectively consistent (i.e. the AI won’t immediately self-modify away from it), and leads to reasonable decisions in the presence of copying. Coming up with such a theory turns out to be surprisingly hard. Many of us feel that UDT is the right approach, but many gaps still have to be filled in.
Note that many problems that involve copying can be converted to problems that create identical mind states by erasing memories. My favorite motivating example is the Absent-Minded Driver problem. The Sleeping Beauty problem is similar to that, but formulated in terms of probabilities instead of decisions, so people get confused.
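To make the Absent-Minded Driver concrete, here is a minimal sketch in code. The payoff numbers (0 for exiting at the first intersection, 4 for exiting at the second, 1 for driving past both) are the standard ones from Piccione and Rubinstein’s formulation, not something stated above; the point is just that, because the intersections are indistinguishable, the whole policy is a single probability of continuing, applied identically at both.

```python
# Minimal sketch of the Absent-Minded Driver, with the standard payoffs
# (0 = exit at first intersection, 4 = exit at second, 1 = drive past both).
# The driver cannot tell the intersections apart, so the policy is one
# number: the probability p of continuing at any intersection reached.

def expected_utility(p):
    exit_first = (1 - p) * 0        # turn off at the first intersection
    exit_second = p * (1 - p) * 4   # continue once, then turn off
    drive_past = p * p * 1          # continue at both intersections
    return exit_first + exit_second + drive_past

# Planning-optimal policy: maximize 4p - 3p^2, which peaks at p = 2/3.
best_p = max((i / 1000 for i in range(1001)), key=expected_utility)
print(best_p, expected_utility(best_p))  # ~0.667, ~1.333
```

The puzzle is that once you are actually standing at an intersection and try to condition on that fact, the naive “update your probabilities, then maximize expected utility” recipe no longer reproduces this answer.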
An even simpler way to emulate copying is by putting multiple people in the same situation. That leads to various “anthropic problems”, which are well covered in Bostrom’s book. My favorite example of these is Psy-Kosh’s problem.
Another idea that’s equivalent to copying is having powerful agents that can predict your actions, like in Newcomb’s problem, Counterfactual Mugging and some more complicated scenarios that we came up with.
Can you explain this equivalence?
When a problem involves a predictor that’s predicting your actions, it can often be transformed into another problem that has an indistinguishable copy of you inside the predictor. In some cases, like Counterfactual Mugging, the copy and the original can even receive different evidence, though they are still unable to tell which is which.
There are more complicated scenarios, where the predictor is doing high-level logical reasoning about you instead of running a simulation of you. In simple cases like Newcomb’s Problem, that distinction doesn’t matter, but there is an important family of problems where it matters. The earliest known example is Gary Drescher’s Agent Simulates Predictor. Other examples are Wei Dai’s problem about bargaining and logical uncertainty and my own problem about logical priors. Right now this is the branch of decision theory that interests me most.
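A toy sketch of that transformation in code (my own illustration, assuming a Newcomb-style setup rather than anything from the posts linked above): the “predictor” just runs an exact copy of the agent’s decision procedure on the same input, so the copy inside the predictor and the original in the room cannot tell which one they are.

```python
# Toy illustration: a predictor that works by simulating the agent.
# Both calls below execute identical code on identical input, so the
# instance "inside the predictor" and the one "in the room" are
# subjectively indistinguishable.

def agent(observation):
    # Whatever decision procedure the agent uses.
    return "one-box" if observation == "two boxes on a table" else "two-box"

prediction = agent("two boxes on a table")     # the copy, run by the predictor
actual_choice = agent("two boxes on a table")  # the original, in the room
assert prediction == actual_choice             # the predictor is never wrong
print(prediction, actual_choice)
```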
Can you formalize the idea of “copying” and show why expected utility maximization fails once I have “copied” myself? I think I understand why Newcomb’s problem is interesting and significant, but in terms of an AI rewriting its source code… well, my brain is changing all the time and I don’t think I have any problems with expected utility maximization.
We can formalize “copying” by using information sets that include more than one node, as I tried to do in this post. Expected utility maximization fails on such problems because your subjective probability of being at a certain node might depend on the action you’re about to take, as mentioned in this thread.
The Absent-Minded Driver problem is an example of such dependence, because your subjective probability of being at the second intersection depends on your choosing to go straight at the first intersection, and the two intersections are indistinguishable to you.
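To see the dependence explicitly, here is the standard calculation for the Absent-Minded Driver (same payoffs as in the sketch above; none of these numbers appear in the comments themselves). If your policy is to continue with probability p, you reach the first intersection with certainty and the second with probability p, so

$$P(\text{now at the second intersection}) \;=\; \frac{p}{1+p}, \qquad EU_{\text{planning}}(p) \;=\; 0\cdot(1-p) + 4\,p(1-p) + 1\cdot p^2 \;=\; 4p - 3p^2 .$$

The planning-optimal policy maximizes the right-hand expression at p = 2/3, but the credence on the left shifts with the very policy being chosen, which is exactly the dependence that breaks the naive “update, then maximize expected utility” recipe.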
I don’t know about academic philosophy, but on Less Wrong there is the hope of one day coming up with an algorithm that calculates the “best”, “most rational” way to act.
That’s a bit of a simplification, though. It is hoped that we can separate the questions of how to learn (epistemology) and what is right (moral philosophy) from the question of, given one’s knowledge and values, what is the “best”, “most rational” way to behave (decision theory).
The von Neumann–Morgenstern theorem is the paradigmatic result here. It suggests (but does not prove) that given one’s beliefs and values, one “should” act so as to maximize a certain weighted sum. But as the various paradoxes show, this is far from the last word on the matter.
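Spelled out (this is the standard statement of expected utility maximization, not a quote from any post): an agent whose preferences satisfy the vNM axioms behaves as if it assigns a utility u(o_i) to each outcome and chooses the action a that maximizes

$$\mathbb{E}[u \mid a] \;=\; \sum_i p(o_i \mid a)\, u(o_i),$$

a probability-weighted sum of the utilities of the outcomes the action might lead to. The theorem guarantees that such a utility function exists given the axioms; it says nothing about where the probabilities or the utilities come from, which is part of why it is far from the last word.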
I do not understand the point of the essay http://yudkowsky.net/rational/the-simple-truth/ . The preface says that it “is meant to restore a naive view of truth”, but all I see is strawmanning everything Eliezer dislikes. What is that “naive view of truth”?
The naive view of truth:
Some things are true, some things are false, like: “My name is ‘Ben’.” (True.) “My name is ‘Alfred’.” (False.)
When it comes to factual questions, you should believe them in proportion to the evidence you have for them. If well-researched statistics indicate that one country has a higher homicide rate than another, then you should believe it (unless you have other, really good evidence to the contrary). If well-formulated studies come in and a certain brand of alternative medicine has been found to be ineffectual, then you should believe it (unless you have other, really good evidence to the contrary). One should not start arguing about “well, what is truth really?” or “how can we ever know anything really?”. I think it was Feynman who noted that people who actually thought like this would soon die of starvation, because they’d never really know whether that yellow thing was a banana, or whether they could eat it. These arguments are simply ways of dismissing really good evidence, and you should not use them.
The purpose of the essay is that when you’re in an argument and you provide evidence, and the other person goes “but all truth is relative” or “nothing is true, it’s all just oppression of the many by the powerful”, you can send it to them and say “stop evading the actual evidence!”.
… to jump off a cliff. I can certainly get behind this approach. But I doubt that this is the main point of this convoluted if entertaining essay.
-The Simple Truth
I see. Preventing tangential arguments about the nature of truth is the intended point of the essay, just poorly expressed, as far as I can tell. Thanks.
“Snow is white” is true if and only if snow is white.
A map works by being in correspondence with the territory.
There is nothing magical about the correspondence. It is completely reducible to physics.
Do you not understand what Eliezer is saying, or are you just disagreeing with it?
Correspondence appears not to involve anything supernatural, but no one knows how to reduce it, which keeps it an open philosophical question.
The correspondence between the bucket of pebbles and the sheep seems clear enough to me, and not in need of further explanation. If instead of using pebbles I count “one, two, three...” as the sheep leave the pen, and the same when they return, it is equally clear to me how that works. What is the open question?
Theories need to work for the difficult cases as well as the easy ones. The pebbles are an easy case because there is an intrinsic twoness to two pebbles and two sheep.
ETA
A harder case is when non-iconic symbols are used. You can’t tell that “sheep” corresponds to sheep by examining the shape of the letters. But there are right and wrong ways to use “sheep”. It is tempting to say that a sentence is true when it corresponds, and corresponds when it is correct, and is correct when all the words in it are used correctly. But that is circular, because truth has been explained in terms of correctness.
Normativity is in general difficult to cash out reductionistically, because to find norms you have to look outward to contexts, not inwards to implementation details.
Also, there is the problem of finding chunks of reality for thoughts and words to correspond to. Different languages slice and dice reality differently, so if reality really does contain pre-existing chunks corresponding to all known languages, it must be very complex, and suspiciously conveniently arranged for us humans. OTOH, if it doesn’t, the naive picture of sentences and thoughts corresponding to pre-existing chunks has to be abandoned. But if the slicing and dicing is done by language and thought, what is reality supplying as a truth-maker?
And then there is the problem of what true statements about abstractions (morality, maths, etc.) are corresponding to...
Not that I have a better theory than correspondence...
Show me some, and I will see what I have to say about them.
What is “intrinsic twoness”? How does this make the pebbles an easy case? I don’t need to know any numbers, to be able to match up pebbles and sheep, and to do it in my head, I only need a long enough, reproducible sequence of mental entities. A bard could just as easily use lines of epic poetry.
See edit above
This is broadening the issue from the truth of sentences to the meaning of words. On which, see this sequence.
There is no magic about how “sheep” refers to sheep. Sheep are a clearly identifiable class of things in the world, they matter to us, and we make a name for them.
The world does not provide chunks for our concepts; our concepts arise from seeing chunks. Different people, different cultures even, can see and name different chunks. That our words generally fit the world is no odder than the exact fit of a puddle of water to the depression it sits in. I see no mystery here. The world is complex. It provides vast numbers of joints we might divide it at and give names to. There is even a recreation of inventing names for things no-one has yet found it useful to name.
What is a reductive theory of truth going to reduce sentences to, if not words? Since when were meaning and truth unrelated?
I have already said that correspondence is unmagical. My point is that the naive theory doesn’t deal with the hard cases, and we don’t have a reductive theory that can. You haven’t presented one.
If you are going to say that chunking arises from perception, you have already taken a step away from the fully naive theory.
How potential chunks get turned into actual chunks needs explaining.
As said in another comment, if preventing tangential arguments about the nature of truth is the intended point of the essay, I can certainly get behind the intent, if not the execution.
It would probably help to have a link to specific people/books who have inspired specific sections, to see how much they were strawmanned.
I remember talking with a few people who refused to be reasonable about anything, defending it with more and more meta nonsense, typically up to “how do you even know there is such a thing as reality or truth?” Using LW lingo, you can evaluate maps by comparing them with the territory, but what if someone’s map contains, in large text, “there is no such thing as territory”? Even if you show them that your map fits the territory better, they will point out that matching the territory counts as an improvement only according to your map, not according to their map, so you didn’t really demonstrate anything beyond your maps being different, which both of you already knew.
Reading the article reminded me of some of their techniques. Yeah, that doesn’t really mean I disproved them. But it was nice to know I wasn’t the only one who finds these discussion techniques irritating.
Yes, it would, but I am no expert in the area, hence my question here.
Well, mine does, but I am quite happy to get by with a sequence of ever-more predictive (you would call them accurate) maps. One can certainly avoid relying on the map/territory metaphysics and still behave at least as rationally as someone who does. However, I agree that
is generally a copout in reply to an argument someone is willing but unable to counter and thus holds no value when used for this purpose only. Presumably Mark in the story is one of those, though grotesquely strawmanned:
The real-life equivalents are more like: “You know who else disagreed with religion? Stalin did!” or “You know who else said there are differences between people? Hitler did!” This is supposed to somehow prove religion and disprove evolution.
The naive version (with added parentheses):
(‘Snow is white’ is true) if and only if (snow is white)
I really wanted to help, because you’re helping me with the free will thing, but I could only manage to skim the essay. I take it that the naive view of truth is supposed to be the disquotational or deflationary view. This is to say that the assertion that ‘snow is white’ is true is identical content-wise to the assertion that snow is white.
To say that something is true is just to assert that thing, and asserting it is sufficient to say that it’s true. In other words, we can for most purposes just do without the word ‘true’ (though things are more complicated for ‘false’).
One useful distinction is between asserting a proposition and explaining its meaning. The meaning of “snow is white” can be discussed apart from the question of whether it’s true, so saying that it’s true serves to indicate that we are discussing its truth and not (just) its meaning.
As far as I can tell, Eliezer is arguing for the correspondence theory of truth.
The naive view is that “‘true’ is true.”
Trying to parse… There is some universal, objective and intuitive concept that is labeled as “true”?
I was reading Eliezer’s cartoon proof of Löb’s theorem the other day and I didn’t get it. My assumption was that in order to understand it, I would need a decent background in mathematical logic, e.g. actually know what Peano Arithmetic is as opposed to abstracting it away as a talking head that tells us things. (I know vector calculus, linear algebra, programming, and basic logic, but that’s about as far as I go.) If Löb’s theorem is something that I should be able to understand the proof of given that background, I’d be interested to know that.
In order to understand it, I am currently reading Forever Undecided: A Puzzle Guide to Gödel. The book features a whole chapter about Löb’s theorem.
The book does not have any prerequisites. It starts out with plain English logic puzzles that you need to solve (detailed solutions at the end of each chapter), and later introduces you to formal logic by translating those puzzles into propositional logic.
I have not yet finished the book, so I can’t tell if it fits the purpose of understanding Löb’s theorem. But what I can already tell is that it is really engaging and fascinating. Highly recommended!
ETA: To give a taste of Raymond M. Smullyan’s style, check out his ‘World’s shortest explanation of Gödel’s theorem’:
We have some sort of machine that prints out statements in some sort of language. It needn’t be a statement-printing machine exactly; it could be some sort of technique for taking statements and deciding if they are true. But let’s think of it as a machine that prints out statements.
In particular, some of the statements that the machine might (or might not) print look like these:
P*x (which means that the machine will print x)
NP*x (which means that the machine will never print x)
PR*x (which means that the machine will print xx)
NPR*x (which means that the machine will never print xx)
For example, NPR*FOO means that the machine will never print FOOFOO. NP*FOOFOO means the same thing. So far, so good.
Now, let’s consider the statement NPR*NPR*. This statement asserts that the machine will never print NPR*NPR*.
Either the machine prints NPR*NPR*, or it never prints NPR*NPR*.
If the machine prints NPR*NPR*, it has printed a false statement. But if the machine never prints NPR*NPR*, then NPR*NPR* is a true statement that the machine never prints.
So either the machine sometimes prints false statements, or there are true statements that it never prints.
So any machine that prints only true statements must fail to print some true statements.
Or conversely, any machine that prints every possible true statement must print some false statements too.
The problem I find with all pop-level proofs of Gödel’s theorems and similar material, including this one, is that they gloss over a key component: how to make a machine that talks about itself. After the part quoted above, a blogger (not Smullyan) does go on to say:
No explanation of this essential part of the proof is given. Unless you do that part, there’s nothing in the supposed proof to limit it to systems that include arithmetic.
A few years ago, I tried to write a friendly introduction to this technical part.
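The heart of that technical part is self-reference by quining: a system rich enough to encode and manipulate its own descriptions can build a statement that is, in effect, about itself. A throwaway illustration of the trick in code (an ordinary Python quine, not anything from the linked introduction):

```python
# A minimal Python quine: the program builds its own source text and
# prints it. Goedel's diagonal construction plays the analogous trick
# inside arithmetic, producing a sentence that refers to (a number
# encoding) itself.
s = 's = %r\nprint(s %% s)'
print(s % s)
```

Getting from “a program that mentions its own text” to “an arithmetic sentence that mentions its own provability” is exactly the part that needs Gödel numbering, which is why the argument only goes through for systems that can encode enough arithmetic.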
Thank you for sharing that little snippet, because I’m going to have to read Forever Undecided now. Those last four lines were the photon my brain has waited for.
If you already understand Gödel’s first and second incompleteness theorems, then you can find a much simpler proof of Löb’s theorem in this pdf, pages 6-7.
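For reference, the statement itself, and the standard route from the second incompleteness theorem (which may or may not be the exact argument in that pdf):

$$\text{If } \mathrm{PA} \vdash \Box P \rightarrow P, \text{ then } \mathrm{PA} \vdash P; \qquad \text{equivalently,} \quad \mathrm{PA} \vdash \Box(\Box P \rightarrow P) \rightarrow \Box P,$$

where $\Box P$ abbreviates “P is provable in PA”. Sketch: suppose PA proves $\Box P \rightarrow P$. Contraposing, PA + ¬P proves $\neg\Box P$, and $\neg\Box P$ is (provably) equivalent to the consistency of PA + ¬P. So PA + ¬P proves its own consistency, hence by Gödel’s second incompleteness theorem it is inconsistent, hence PA proves P.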
There are three ways to answer the free will/determinism question: I) yes, they’re incompatible, but we have free will, II) yes, they’re incompatible, and we don’t have free will, III) they’re not incompatible.
I’ve often heard EY’s free will solution referred to as a form of (III), compatibilism. If this is the case, then I don’t think I understand his argument. So far as I can tell, EY’s solution is this:
1) free will is incompatible with determinism / the natural world is relevantly deterministic // we therefore do not have free will.
2) here is an error theory explaining why we sometimes think we do.
3) moral responsibility is a matter of convention (but, I take it, not therefore unimportant).
This is fine, if that’s the answer. But it’s not a compatibilist answer. Am I missing something? I’m comparing EY’s answer here to something like the philosopher Donald Davidson’s paper Mental Events, which is a more traditionally compatibilist view.
One thing I read Eliezer as saying, in Dissolving the Question, is that the phenomenology of free will is more interesting than the metaphysics:
(This comment is not a full answer to that “homework assignment.”)
In other words, it is a fact that humans do reasonably reliably possess the intuition, “I have free will.” We do have that intuition; our having it is something to be explained. And it is a fact that when we examine the processes that we are made of — physics — we do not (contra Penrose) see anywhere for free will to sneak in. Brains use the same atoms that billiard balls and computers do.
(I don’t know if you are a coder. A “stack trace” is a snapshot of what is going on, at a particular moment, at every level of abstraction in a computer program. Stack traces are often seen when a program crashes, to let the programmer follow the trail of the code bug or bad data that led to the crash. We might obtain a stack trace of consciousness through introspective techniques such as meditation — I’m not there yet via that particular method, but I think I can follow the arguments.)
I take Eliezer to be (heavily) influenced by Daniel Dennett. Dennett in Elbow Room, Freedom Evolves, etc. holds that what we want out of “free will” is that we create choices to influence the future; that we can take reasoned steps to avoid predicted bad outcomes; that we could have done otherwise if we thought or believed differently. This is just as incompatible with indeterminism (wherein our seeming choices are the results of quantum indeterminacy in the neurons, as in Penrose) as with a sort of greedy mechanical determinism where our choices are produced by our bodies without conscious reflection. I take Dennett as implying that our choices are produced by our bodies, but that conscious reflection is our name for the mechanism by which they are produced.
(As Eliezer points out in the Anti-Zombie posts, consciousness does have an effect on the world, notably that we talk about consciousness: we discuss our thoughts, plans, fears, dreams.)
Eliezer diagrams this in Thou Art Physics: the determinist claims that the world’s future is caused by physics operating on the past, and not by me making choices. But to the computationally minded materialist, “me” is the name of the place within physics where my choices are calculated, and “me” certainly does have quite a bit of control over the future.
I am not convinced that a materialist determinist like Sam Harris or Democritus would be convinced. The fact that I draw a line around some part of physics and call it “me” doesn’t mean I control what goes on in that boundary, after all.
(Computation is vital here, because computation and (selective) correlation are never free. In order for a computation to take place, it has to take place somewhere. In order for some outputs (say, my movement towards the cookie jar) to correlate with some inputs (my visual signals about the cookie jar), that correlation has to be processed somewhere. Plot my worldline, and I am in orbit around the cookie jar with some very complex equation modeling my path; but where that equation is actually computed in order to guide my feet, is inside my brain.)
But the reason that determinism worries freshman philosophy students and novice LWers is that it seems to imply fatalism — that the choices we make don’t matter, because the universe is scripted in advance. This compatibilist view, though, seems to say that the choices we make do matter, because they are part of how the universe calculates what the future will bring.
Fatalism says we can’t change the future, so we may as well just sit on the couch playing video games. Compatibilism says that we are the means of changing the future.
Point of order—a stack trace is not a dump of everything that’s going on, just the function call stack. It’s essentially “How did I get to here from the start of the program”.
A dump of everything would be a core dump, named after “core memory”—a very, very old memory technology.
Point of order—Your comment is not a point of order. A point of order is an interjection about process in parliamentary process. Your comment was a clarification about terminology, which does not have the precedence of a point of order.
[This is meant to be silly, not harsh; but if you want to make fussy terminological points on LW, I will do likewise...]
Isn’t your comment then also not a point of order?
Muphry’s law FTW!
Thanks, that’s very helpful.
I guess this is my sticking point. After all, a billiard ball is a necessary link in the causal chain as well, and no less a computational nexus (albeit a much simpler one), but we don’t think that we should attribute to the ball whatever sort of authorship we wish to talk about with reference to free will. If we end up showing that we have a property (e.g. ‘being a necessary causal link’) that’s true of everything then we’ve just changed the topic and we’re no longer talking about free will. Whatever we mean by free will, we certainly mean something human beings (allegedly) have, and rocks don’t. So this does just strike me as straightforward anti-free-will determinism.
That may be right, and it may just be worth pointing out that determinism doesn’t imply fatalism. But in that light the intuitive grounds for fatalism seem much more interesting than the intuitive grounds for the belief in free will. I’m not entirely sure we’re naturally apt to think we have free will in any case: I don’t think anyone before the Romans ever mentioned it, and it’s not like people back then didn’t have worked out (if false) metaphysical and ethical theories.
Actually, the ancient Egyptian concept of Maat seems to include free will in some sense, as a “responsibility to choose Good”, according to this excerpt. But yeah, it was not separate from ethics.
That’s really interesting, thanks for posting it. It’s an obscure sort of notion, but I agree it’s got some family resemblance to the idea of free will. I guess I was thinking mostly of the absence of the idea of free will from Greek philosophy.
I took a course on ancient and medieval ethics as an undergraduate. We spent a lot of time on free will, talking about Stoic versus Epicurean views, and then how they show up in Cicero and in Thomas. My impression (as a non-expert) is that Aristotle doesn’t have a term that equates to “free will”, but that other Greek writers very much do.
You’re right, of course, that many of those philosophers wrote in Greek. I suppose I was thinking of them as hellenistic or latin, and thinking of Greek philosophers as Plato, Aristotle, and their contemporaries. But I was speaking imprecisely.
That is because the billiard ball doesn’t have sufficient inner complexity and processes. I think the necessary complexity is the computational ability to a) model parts of the future world state, b) base behavior on that, and c) model the modelling of this. The problem arises when your model of your model goes from intuition (the sensation of free will) to symbolic form, which allows detection of the logical inconsistencies at some higher modelling level.
Actually little is needed to ascribe agency to ‘balls’. Just look at https://www.youtube.com/watch?v=sZBKer6PMtM and tell me what inner processes you infer about the ‘ball’ due to its complex interactions.
I agree that your (a)-(c) are necessary (and maybe sufficient) conditions on having free will.
What do you mean by this?
To my knowledge, I’ve never had this sensation, so I don’t know what to say about it. So far as I understand what is meant by free will, it’s not the sort of thing of which one could have a sensation.
Further to the other subthread, I suppose what most people mean when they talk about the sensation of free will is imagining multiple possible worlds and feeling control over which one will become actual before it does. Do you not have this?
I wouldn’t call that a sensation or a feeling, but yes. I do think I act freely, and I can recall times when I’ve acted freely. If I don’t have free will, then I’m wrong about all that.
I don’t think that’s EY’s solution—I don’t think his discussion of Free Will has anything to do with moral responsibility being a matter of convention.
From what I recall, the argument is something more like this: When people talk of “Free will”, it’s not clear what exactly they are referring to. If you try to pin down a more precise meaning that matches people’s intuitions, you get something like “the subjective sensation of evaluating different available courses of action one might take”—and that is compatible with determinism (you can run decision algorithms in a perfectly deterministic binary world e.g. a simulated tile-based game world).
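A toy illustration of that last point (my own example, not from the post): a completely deterministic program that nevertheless “evaluates different available courses of action” before acting.

```python
# A fully deterministic agent in a one-dimensional tile world. Nothing
# here is random, yet the algorithm genuinely considers several possible
# actions and picks among them, which is all the subjective sense of
# weighing options needs to be compatible with.

def choose_move(position, goal):
    options = {"left": position - 1, "stay": position, "right": position + 1}
    # Score each available action by how close it would leave us to the goal.
    scores = {move: -abs(new_pos - goal) for move, new_pos in options.items()}
    return max(scores, key=scores.get)

print(choose_move(position=3, goal=7))  # "right", deterministically, every time
```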
Does that make sense?
Yes, though this...
...seems obviously false to me, not least because it’s a category error. This sounds like a description of the sensation of having free will, not free will. I don’t think anything can be identical to the sensation of itself, but even if so, that’s not what’s going on with free will.
And that’s not what people mean by free will, even if it might be all there is to free will. I think most people concerned with the question would say that having such a sensation is neither necessary nor sufficient for having free will, if this means something like ‘being the ultimate author of your actions’ or ‘being able to evaluate your options’ or something like that.
There’s a difference between being confused about what you mean, and being confused about what something is. The libertarian about free will might be confused about what free will is (i.e. thinking it a metaphysical property rather than an illusion brought about by a certain perspective), but she’s not therefore confused about what she means by free will. And if you clear up her confusion by showing her that the impression that she’s free is merely an impression, then she’ll reasonably conclude that ‘it turns out I don’t have free will after all, but merely the sensation of having it.’
Agreed. Philosophy is difficult. A major part of the difficulty is understanding the questions. A common failure mode of amateurs is to solve the wrong problem.
I would like to know how you understand free will. But since philosophical definitions are generally useless, I would appreciate it if you could give a few examples of not having free will. I asked this question on this forum once before, and some people struggled to describe the sensation of not having free will, and those who did describe it gave very different descriptions. I am quite sure that your examples of lack of free will would clash with those of others, indicating that the notion is too poorly defined to be discussed productively.
This is a general problem with intuitively understood notions; people use the same term for overlapping but different concepts and/or qualia. “Existence” is one of the worst. People use it in different and contradictory ways, often in the same sentence (this is easy to detect by asking “what does it mean for something to not exist?”).
Anyway, just wondering about 3 examples which you personally think describe not having free will.
Alright, that sounds like a good idea. I don’t think there can be a sensation of having or not having free will, any more than there can be a sensation of drawing a conclusion from premises. In a loose sense, there might be an ‘experience’ of these things, in the sense that I might remember having done so. But if drawing an inference produces in me some sensation or other, that’s totally accidental. Same goes, I think, for acting freely.
1) I throw a rock down a mountain and kill Tom. The rock does not have free will.
2) You throw me down a mountain, and I land on Tom, killing him. I had free will, but my will isn’t part of what killed Tom, any more than is the will of Suzie who wasn’t involved.
3) When you threw me down the mountain, you acted thus under duress. You still acted freely, though we might not want to say that you were morally responsible.
4) If you were being mind-controlled by a sinister alien, then you’re in no different a situation than I was in (2): you didn’t act at all. There are tricky but unimportant borderline cases to be found in the territory of influence by, say, drug addiction.
Here’s another intuition of mine:
I throw a large rock down a steep mountain. At the bottom of this mountain, the rock strikes Tom and kills him. Carl, brother of Tom, sets out to find out why Tom died. He would and should not be satisfied to learn that a rock struck Tom. Carl would and should be satisfied to learn that I threw that rock. He should not seek any further, upon learning this. If, counterfactually, I did not throw the rock and it fell by chance, then Carl should be satisfied to learn this.
If asked why his brother died, Carl should answer ‘Hen killed him by throwing a rock down a hill’. If, counterfactually, I did not throw the rock but it fell on its own, then Carl can rightly answer ‘A rock fell on him’, or pick arbitrarily from any one of the string of antecedent causes that precipitated Tom’s death, or just say ‘by chance’ or ‘by accident’. The specialness of myself in this story is due to the fact that I have free will, and that I threw the rock freely. The rock, the texture of the mountain, my parents and ancestors, and the great many necessary antecedents to Tom’s death are all computational nexuses in what led to Tom’s death, but I’m the only one that’s special in this way (so being a necessary computational nexus isn’t sufficient, but is perhaps necessary, to being the free cause of something).
Huh. I thought the whole starting point of the debate was that people do have “a sensation of having … free will” and argue what this free will thing is.
Example 1: sure, rocks are probably not interesting objects to ascribe free will to, though a good start. I am more interested in humans not having free will.
Example 2: this is not an example of NOT having free will
Example 3: this is not an example of NOT having free will, as you explicitly state.
Example 4: “mind controlled by a sinister alien” is, as I understand it, where you feel you have free will, but “in fact” you do not. This seems identical to “mind controlled by indifferent laws of physics” and implies that free will is a pure sensation, since everything is controlled by the laws of physics in a non-dualistic picture. Or am I missing something in your logic? Or does it only work if you are mind-controlled by something that has free will, and then there are turtles all the way down?
Seems critically important to me, as described above.
The story about Carl, Tom and you is not an example of any of the agents NOT having free will, so it does not help.
I’d never seen the problem put this way before reading the free will sequence, so that wasn’t my impression. But I can hardly claim to have a representative sample of free will discussions under my belt.
Then maybe I didn’t understand your request. Could you clarify what you’re looking for? In any case, here’s a non-exhaustive list of things I think don’t have free will: rocks and things like them, plants, most and probably all non-human animals (I’m not really sure), human beings who are very young (i.e. prelinguistic), human beings with severe cognitive impairment (human beings in regard to whom we would correctly say that they don’t have things like beliefs, reasons, justifications etc.)
Here’s a non-exhaustive list of things that I think do have free will: all human beings regardless of circumstances, excepting those on the first list.
So I’m not sure if I can give you an example of what you’re looking for. Do you mean ‘give me a situation in which someone who is normally apt to have free will doesn’t?’ If so, then my answer is ‘I can’t, I don’t think that’s possible.’ I’m not sure about that, of course, given the magnitude of a ‘that’s impossible’ claim, so please offer any counterexamples you can think of.
I also don’t think I’m being at all radical in my understanding of free will here. We don’t always act freely, of course. Sometimes stuff just happens to us, and I take it we can tell the difference between hurling ourselves at Tom, and being thrown at Tom by someone else. So maybe there’s something like an experience of acting freely, though I certainly don’t think that it’s a sensation.
No, I don’t think the feelings of the agent are relevant. The feeling of having free will, if there is any such thing, is neither necessary nor sufficient for having free will.
I did not want to prime you with my own examples, but since you say
I will try. Here are some examples where you might feel that you do not have free will, which is an easier question to answer than objectively not having free will, since all you have to do is ask someone what they feel and think. Note that different people are likely to give different answers.
The voices in your head tell you to stab someone and you are compelled to do it. Subcases: a) you don’t want to do it, but do it anyway, b) you don’t care either way and just do it, c) you don’t understand what wanting or not wanting even means when the voices in your head make you do things. I’m sure there are more alternatives.
You see your arm rising and punching someone, without consciously deciding to do so. Again, various subcases are possible.
You instinctively yelp when startled, before having time to decide whether you should.
Ah, I see. It seems to me that all those examples are like my example (2), where someone has free will but is not presently exercising it (however much it may appear that they are). I agree that, on the face of it, those all seem to me to be examples of not exercising free will. One could be in those situations while having free will all the same. I could, for example, watch my arm punch someone, and yet be freely writing a note to my mother with the other arm.
I am unclear about the difference between not having and not exercising free will. Are you saying that going with the default choice (whatever it might mean) is not exercising free will? Or the inability to see choices is not exercising free will? Or seeing choices and wanting to choose but being unable to do so is “not exercising”?
No, nothing so complicated. Say I meet Tom on the street. Tom is angry with me, so he pushes me over and I smash a pigeon. Did I smash the pigeon? Yes, in a way. But my smashing a pigeon wasn’t an action on my part. It wasn’t an exercise of my will. Yet I have free will the whole time. Tom’s pushing me over can’t do anything about that.
This isn’t quite a parallel case, but my smashing the pigeon isn’t an exercise of my mathematical knowledge either, but the fact that I’m not exercising my mathematical knowledge doesn’t mean I don’t have it. Does that make sense? I guess I always understood ‘having free will’ as something like having a capacity to act, while ‘exercising free will’ means acting. But I don’t have any reflective depth there, it’s just an assumption or maybe just a semantic prejudice.
I think this is his conclusion:
This sounds pretty compatibilist to me. EY gives a definition of free will that is manifestly compatible with determinism. Elsewhere in that post he argues that different definitions of free will are nonsensical and are generated by misleading intuitions.
But as the quote demonstrates, and as discussed in a different post, EY is less interested in providing a definition for free will and then asserting that people do or do not possess free will, and more interested in explaining in detail where all the intuitions about free will come from, and therefore why people talk about free will. He suggests that if you can explain what caused you to ask the question “do we have free will?” in the first place, you may not need to even bother to answer the question.
True, and EY seems to be taking up Isaiah Berlin’s line about this: suggesting that the problem of free will is a confusion because ‘freedom’ is about like not being imprisoned, and that has nothing to do with natural law one way or the other. I absolutely grant that EY’s definition of free will given in the quote is compatible with natural determinism. I think everyone would grant that, but it’s a way of saying that the sense of free will thought to conflict with determinism is not coherent enough to take seriously.
So I don’t think that line makes him a compatibilist, because I don’t think that’s the notion of free will under discussion. It’s consistent with us having free will in EY’s sense that all our actions are necessitated by natural law (or whatever), and I take it to be typical of compatibilism that one try to make natural law consistent with the idea that actions are non-lawful, or if lawful, nevertheless free. Maybe free will in the relevant sense is a silly idea in the first place, but we don’t get to just change the topic and pretend we’ve addressed the question.
And he does a very good job of that, but this work shouldn’t be confused with something one might call a ‘solution’ (which is how the sequence is titled), and it’s not a compatibilist answer (just because it’s not an attempt at an answer at all).
I’m not saying EY’s thoughts on free will are bad, or even wrong. I’m just saying ‘It seems to me that EY is not a compatibilist about free will, on the basis of what he wrote in the free will sequence’.
What exactly is the notion of free will that is under discussion? Or equivalently, can you explain what a “true” compatibilist position might look like? You cited this paper as an example of a “traditionally compatibilist view,” but I’m afraid I didn’t get much from it. I found it too dense to extract any meaning in the time I was willing to spend reading it, and it seemed to make some assertions that, as I interpreted them, were straightforwardly false.
I’d find a simple explanation of a “traditional compatibilist” position very helpful.
Well, I suppose I picked a form of compatibilism I find appealing and called it ‘traditional’. It’s not really traditional so much as slightly old, and related to a very old compatibilist position described by Kant. But there are lots of compatibilist accounts, and I do think EY’s probably counts as compatibilist if one thinks, say, Hobbes is a compatibilist (where freedom means simply ‘doing what you want without impediment’).
A simple explanation of a version of compatibilism:
So, suppose you take free will to be the ability to choose between alternatives, such that an action is only freely willed if you could have done otherwise. The thought is that since the physical universe is a fully determined, timeless mathematical object, it involves no ‘forking paths’. Now imagine a scenario like this, courtesy of the philosopher who came up with this argument:
The thought is, Jones is responsible for shooting Smith, he did so freely, he was morally responsible, and in every way one could wish for, he satisfied the notion of ‘free will’. Yet there was no ‘fork in the road’ for Jones, and he couldn’t have chosen to do otherwise. Hence, whatever kind of freedom we’re talking about when we talk about ‘free will’ has nothing to do with being able to do otherwise. This sort of freedom is wholly compatible with a universe in which there are no ‘forking paths’.
Having thought about it some more… Eliezer (and Scott Aaronson in The Ghost in the Quantum Turing Machine) agrees that free will is independent of determinism (since being forced to act randomly does not mean that you choose freely), so that’s reasonably compatibilist. Here is a quote from the above paper:
The introduction to the paper is also quite illuminating on the subject:
Thanks, that is very illuminating. I think with this in mind I can refine what I’m trying to talk about a bit more. So let’s similarly distinguish between freedom and free will.
By ‘freedom’ let’s stipulate that we mean something like political freedom. So one has political freedom if one is not prohibited by law from doing things one ought to be able to do, like speaking one’s mind. Likewise, freedom in general means not being constrained by thugs, or one’s spouse, or whatever.
Let’s take up Aaronson’s understanding of ‘free will’: first, your actions are determined by you as opposed to arbitrarily pursued (they are willed), and second, your actions are determined by you alone (they are freely willed).
I don’t think Aaronson’s point about skepticism is a good one. I don’t think the skeptic could always deny that the decision was really yours, so long as we agreed on what ‘yours’ means. They could deny that an action was yours on the basis of, say, its appearance to a third party, but we shouldn’t worry about that. I also don’t think past confusion or disagreement about free will is a good reason to get discouraged and abandon the idea.
So maybe this will be helpful: in order for an action to be freely willed in the primary case, it must follow from reasons. Reasons are beliefs held true by the agent and related to one another by inferences. To give a simple example, one freely wills to eat a cookie when something like the following obtains:
1) I like cookies, and I eat them whenever I can. 2) Here is a cookie! 3) [eating it]
It follows from (1) and (2) that I should eat a cookie, and (3) is the eating of a cookie, so (3) follows from (1) and (2). Anything capable of acting such that their action has this kind of rational background (i.e. (1) and (2) and the inference connecting them) has free will. One acts freely, or exercises free will, when one acts on the basis of such a rational background. If we cannot correctly impute to the agent such a rational background for a given action, the action is not freely willed. I’m taking pains to describe free will both as precisely as I can, and in such a way that I say nothing radical or idiosyncratic, but I may be failing.
I grant, of course, that freedom is consistent with determination by natural law. The question is, is free will similarly consistent? I myself am a compatibilist, and I think free will as I’ve described it is consistent with a purely naturalistic understanding of the world. But I don’t see how EY is arguing for compatibilism.
Pick a favorite ice cream flavour. Now tell me it (it’s chocolate, great) and let’s make a “theory of ice cream preferences”. It reads “Hen favours chocolate above others”. I go to a third person (let’s call him Dave) and tell a story about how I got a divine mission to make a fact on the paper come true. I wave my magic wand and afterwards Dave checks that indeed the mission was accomplished. A natural law is just a description of how things happen; the law itself is not the cause of its truth. If you ask “Did hen or the law of ice cream picking reduce the options to chocolate?” the answer is going to be yes for both, but on the law’s side it is mostly figurative, not literal. All natural law is descriptive and not imperative.
It is not as though people first ponder what they want to do and then pass it by the censor of the “universe” to see whether they are allowed to do it or not. People simply just do things. I have also seen some hold that anything that is determined cannot be the result of a free choice. This seems silly when you try to pick an ice cream flavour, as if making “Hen prefers chocolate” true would rob you of your agency now that it has become “determined”. It would also mean that any such choice would be impossible to make if it were to remain truly “free”. Things will happen a certain way, not 0 ways or 2 ways but 1 way, and that is separate from our ability to describe it; whatever happens is the way that happens, and a statement describing that would be a true natural law.
or the alternative version: “1) free will requires determinism / the natural world is relevantly deterministic / we can exercise free will 2) Here is a theory of why people form funny and incorrect beliefs while dealing with their ability to make choices 3) Don’t let error sources cloud your understanding, and use your brain to clear any fogginess up”
This is very interesting.
So, are you saying that the natural world (ourselves included) doesn’t ‘obey’ any sort of law, but that natural law is just a more or less consistent generalization about what does happen?
So, let me ask you a question: would you say there’s any such thing as a physical impossibility that is not also a logical impossibility?
I hope this isn’t the structure of EY’s point, since then I think the sequence has nothing of substance to say.
Yes, the natural world doesn’t obey any laws. If electrons suddenly started to be positively charged you couldn’t call them naughty (or they would just fall under the jurisdiction of another description, maybe that of positrons). But apparently they are not that fiddly, though they totally could be. You can’t use the laws to compel events; an attorney is of no use. If something that contradicts a law happens, the law is just proven false.
If physical impossibility means inconsistent with how we view the world’s mechanics now, then sure, there are things like the flyby anomaly: events that are forbidden by natural law but still happen anyway. That is not a logical contradiction. However, if something else is meant, I am afraid I don’t grasp it. That somehow you could not list any mechanics but it still would make logical sense? When “mechanics” is understood broadly, I have trouble imagining what that could be.
So it seems to me you’re denying that the world is in any sense deterministic, and so it’s perfectly possible for human beings to be anomalous agents, just because everything is potentially anomalous. Is that right?
On a charitable reading, that’s not what ze’s saying.
From the standpoint of a person making discoveries, it is known from many observations that Bob the Particle will always Wag. Thus, “Bob Wags” is stated as a Natural Law, and assumed true in all calculations, and said with force of conviction, and if some math implies that Bob didn’t Wag, the first thing to look for is errors in the math.
However, still from the same standpoint, if some day we discover in some experiment that Bob didn’t Wag, and despite looking and looking they can’t find any errors in the math (or the experiment, etc.), then they have to conclude that maybe “Bob Wags” is not fully true. Maybe then they’ll discover that in this experiment, it just so happens that Julie the Particle was Hopping. Thus, our hypothetical discoverer rewrites the “law” as: “Bob Wags unless Julie Hops”
Maybe, in some “ultimate” computation of the universe, the “True” rule is that “Bob Wags unless someone else Leers, and no one Leers when Julie Hops”. How do we know? How will we know when we’ve discovered the “true” rules? Right now, we don’t know. As far as we know, we’ll never know any “true” rules.
But it all boils down to: The universe has been following one set of (unknown) rules since the beginning of time and forever ’till the end of time (ha! there’s probably no such thing as “time” in those rules, mind you!), and maybe those rules are such that Bob will Wink when we make John Laugh, and then we’ll invent turbines and build computers and discuss the nature of natural laws on internet forums. And maybe in our “natural laws” it’s impossible for Bob to Wink, and we think turbines work because we make Julie Hop and have Cody Scribble when Bob doesn’t Wag to stop him. And some day, we’ll stumble on some case where Cody Scribbles, Bob doesn’t Wag, but Bob doesn’t Wink either, and we’ll figure out that, oh no!, the natural laws changed and now turbines function on Bob Winks instead of Cody Scribbles, and we have to rethink everything!
The universe doesn’t care. Bob was Winking all along, and we just assumed it was the Cody Scribbles because we didn’t know about Annie. And never there was a case where Bob Wagged and Winked at the same time, or where Bob failed to Wag when Julie Hopped. We just thought the wrong things.
And if in the future we’ll discover other such cases, it’s only because the universe has been doing those things all along, but we just don’t see them yet.
And it’s even possible that in the future Bob will marry Julie and then never again Wink… but all that means is that the rules were in fact “Bob Winks when Annie Nods unless Bob is Married to Julie”, rather than “Bob Winks when Annie Nods”, and yet our scientists will cry “The laws of physics have changed!” while everyone else panics about our precious turbines no longer working all of a sudden.
Hmm, if I understand you correctly, you’re saying that the universe does obey natural laws (not necessarily the ones we think of as laws, of course) in the sense that if we were to understand the universe completely, we would see that there are physical impossibilities (that aren’t logical impossibilities).
Maybe that is what Slider was saying, and it’s certainly implied by “You can’t use the laws to compel events, an attourney is of no use. If something that contradicts with a law happens the law is just proven false.” But forgive me if I misunderstood, because it’s quite difficult to disentangle the issue of whether or not the universe is lawful from the (for our purposes irrelevant) question of whether and how we know those laws, or what we ought to do when something we’ve called a law appears to be violated.
I have trouble because you are using language where law comes first and happening comes second, whereas I think happenings come first and law comes second. I was also answering a question different from whether a law is an accurate description of the events. I was answering a question about which one depends on the other, with law being secondary.
My imagination is also failing to picture what it would even mean for the universe not to be lawful when “law” is taken broadly and can contain arbitrarily many details. Often the question is posed in the context of simple laws. But when you ask in principle, such things as “brute force laws” that simply list all world events need to be considered too. The universe comes to a certain state and then doesn’t know how to reach a next state and so never leaves that state? For each state transition there would be a brute force law that would be correct. I can’t imagine how the world could be anything without that way of being establishing a character for it.
So, let’s get back to a more basic question, and I apologize for how pedantic this will sound. I really don’t intend it that way. I just need to know something about your approach here. Anyway: do you think we have any reason to believe that an apple when dropped will, ceteris paribus, fall to the ground?
Yes, but with a big but.
We don’t know what “all else equal” means. In particular it encompasses unknown and not-understood phenomena. And because we can’t replicate what we don’t know (and for various other reasons), each apple-dropping we do could in principle depend on some arcane detail of the universe. We could also very easily be driven to a situation where our beliefs about solidity would be more relevant and override the understanding of falling (such as having the apple on a table). Also the mental structures used in that assessment would be suspect. It is in fact more accurate to say that the ground accelerates into the apple and the apple is the one staying stationary. We could also question whether talk of apples and grounds makes sense. Also there are known risks such as false vacuums happening, so that the dropping doesn’t have time to take place. You could also drop an apple in orbit and it would not fall to the ground.
But even if the apple just stayed in midair, that would be lawful. It doesn’t, but it very well could have. There is no reaction the apple can make that would count as “too chaotic” to be a lawful response (or we are not using lawful in that sense here (and indeed the theoretical notion of chaos applies to systems determined to great precision)).
Some laws are straight out of the question, as they simply are not the case. However, among those that are consistent with our data there is not a clear winner. And it will always be underdetermined.
Assume that we know everything there is to know about the world, but nothing about the future. Everything we’ve observed (and we’ve observed everything so far) tells us that in cases like this one, the apple always falls to the ground. Do we have any reason at all to believe that it will fall to the ground this time? In other words, do we have any reason at all to think that the future will resemble the past? If so, what?
Yes we know the character of the world and so know that the apple will fall.
Knowing who you are and what (and why) you do doesn’t affect what you could have done. The idea that free will is somehow against determinism often seems to boil down to treating only things whose workings we don’t know as “free”. That is as if knowing a thing would affect it directly. The thing that makes things tick and our description of how things tick are two separate things. The Force doesn’t have the constitution of a formula. In order to exercise will you would need to have some action be correlated with your will state. If there is no such correlation your will is powerless. A natural law doesn’t come (at least directly (you could get killed for holding a heliocentric worldview)) to interfere with those correlations. Thus there is nothing will-limiting about knowing how things work.
Okay, so the world has a character. Let’s take all the facts about the character of the world together; this is what I’m calling ‘natural laws’. The world obeys natural law in the sense that the world obeys its own character: the character of the world determines how things go. Does that sound right to you?
Yes, it sounds right. I tried to reread the thread to see whether there is more than terminological confusion going on. To me it’s not obvious that there is an opposition between will and determinism, and I am left guessing what kind of silliness is employed to get to that end result. It seems like a “one and only one can win” situation is constructed, but I can describe the same situation so that both win.
I was saying that being told your character (correctly) is not dangerous or limiting. It means that you have a character, and it’s harder to pretend that you could do anything whatsoever. However, the alternative would be to not have any character, and that isn’t omnipotence; it would be nilpotence. For some purposes you can forget what the black box contains, but to claim that fundamentally the black box doesn’t work in any way? A more common situation is that you don’t know how it works, or that it must work in some exotic way.
You could also say that it isn’t a case of the character of not-you making the character of you nilpotent or unnecessary. It’s a question of the character of everything overlapping with the character of you (which it rather obviously needs to do).
If you were anomalous, you would have to be anomalous in some way, and then that way would be a law, so no.
Well, by ‘anomalous’ I just mean ‘doesn’t obey any law’. I think maybe this was a poor choice of words. At any rate, in the great grandparent you said
I’m not sure what you want to say now.
This was meant to say that laws obey the natural world rather than the other way around, in response to: “So, are you saying that the natural world (ourselves included) don’t ‘obey’ any sort of law, but that natural law is just a more or less consistent generalization about what does happen?”
Maybe someone already mentioned it below, but UDT provides a formal sense in which to speak of the consequences of possible actions in a deterministic universe. I think it resolves the free will “paradox” to the extent it is meaningful at all.
Can you explain how having a formal way in which to speak of the consequences of possible actions in a deterministic universe resolves the free will problem?
Well, it depends on how you define “the free will problem”. The problem I’m talking about is the ability to assign (moral) values to possible actions: if we can only speak of one possible action, then those values don’t exist (or are defined only for one action, in which case they’re useless anyway).
I’ll take a shot at it:
1) On the one hand, the natural world of which we are a part is governed by laws, in the sense that any causal relations within the natural world obey law-like (if probabilistic) principles. Any effect is the necessary result of some sufficiently rich set of antecedent causes plus the laws that govern the relations between them. Human beings are natural objects, subject to the same physical laws as everything else. Further, our minds are likewise the product of law-like causal relationships.
2) On the other hand, human thought and action do not obey law-like principles, except normatively. Nothing we do or think is the result of necessity.
(1) and (2) seem to be inconsistent. Either one of the two is false, or they merely appear inconsistent.
That’s the problem of free will as I understand it.
As far as I understand things, 2 isn’t true. We feel ourselves making decisions, but that is how the process feels from the inside. From the outside, it’s all done by all those atoms and particles inside our skulls, bouncing off each other in strict accordance with the laws of nature like so many tiny billiard balls, and so everything we do and think is the result of necessity. All the ambivalence we feel – should I choose lemon or mango ice cream? – also consists of patterns in the bouncing of the balls, all on their paths which have been determined by the ‘laws’ of nature since the Big Bang. Even though I feel myself making a decision – lemon ice cream! –, what I will decide has been determined by all the ball-bouncing that has been going on in the universe before the moment I decide. I could not possibly choose anything else.
How would you characterize your thoughts about free will then? Is it a mere illusion, or is there something genuine in it?
Determinism is the outside view; free will is what it feels like from the inside. Right now I’m typing this comment, and it certainly feels to me like I am deciding what to say, i.e., I feel I have free will. Taking the outside view, what I’m writing has been inevitable since the Big Bang, i.e., it has been determined.
That’s not quite an answer to my question.
Oh, sorry! I misread your question. You’re asking if I think free will is an illusion. I guess you could say that yes, I think it doesn’t really exist, because we make decisions and take actions because of our thoughts and feelings, which are ultimately ‘just’ processes within our brains, which are subject to the laws of physics. Like I said, from the moment of the Big Bang it has been inevitable that I would come to write this comment. It’s mind-boggling, really. Also mind-boggling is the amount of time I’ve already spent writing and thinking about and rewriting (and rere[...]rewriting) this comment; that’s why it is so late.
To me, (2) seems obviously false. You cannot predict what you’re going to do before you decide what you’re going to do. Therefore from an inside view, it seems to be unpredictable. But from an outside view, it is perfectly predictable.
I didn’t say anything about predictability though. To my mind, prediction is not relevant to free will either way.
I am happy to contribute explanations of causal matters people are confused about.
I still haven’t found a readable meta-overview of causation. What I would love to be able to read is a 3-10 pages article that answers these questions: what is causation, why our intuitive feeling that “A causes B” is straightforward to understand is naive (some examples), why nevertheless “A causes B” is fundamental and should be studied, what disciplines are interested in answering that question, what are the main approaches (short descriptions with simple lucid examples), which of them are orthogonal/in conflict/cooperate with each other, example of how a rigorous definition of causality is useful in some other problem, major challenges in the field.
Before I’m able to digest such a summary (or ultimately construct it in my own head from other longer sources if I’m unable to find it), I remain confused by just about every theoretical discussion of causation—without at least a vague understanding of what’s known, what’s unknown, what’s important and what’s mainstream everything sounds a little sectarian.
1) Do you understand the standard story about the thermodynamic arrow of time? Wikipedia:
2) Do you understand the standard story about the smoking/tar/cancer example in Pearl’s theory of causality? If not, here’s a good explanation.
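(In case it helps to have the punchline written out: the smoking/tar/cancer example is the standard illustration of the front-door adjustment, which with my own labels S for smoking, T for tar and C for cancer reads

P(C \mid \mathrm{do}(S = s)) = \sum_{t} P(t \mid s) \sum_{s'} P(C \mid s', t)\, P(s')

i.e. the causal effect of smoking on cancer can be recovered from purely observational data by routing through the tar variable, even though the genetic common cause is unobserved.)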
For anything more advanced than that, Ilya is probably your best bet :-)
1) yes 2) no, and I’ll read through Nielsen’s post, thanks. I’ve been postponing the task of actually reading Pearl’s book.
I find the Socratic approach useful for bridging gaps, do you?
I sense there may be a contradiction between a decision theory that aims to be timeless and the mandate to ignore sunk costs because they’re in the past. But I fear I may be terribly misunderstanding both concepts.
Yes, that might be a genuine contradiction, and ignoring sunk costs might be wrong. Can you try to come up with a simple decision problem that puts the two into conflict?
I don’t see this contradiction. In a timeless decision theory, the diagram and parameters are not the same when X is in control of resource A (at “time” T) and when X is not in control of resource A (at time T+1).
The “timeless” part of the decision theory doesn’t mean that it ignores the effects of time and past decisions. Rather, it refers to a more technical (and definitely more confusing) abstraction about predictions, and subtly hints at the (also technical) concept of symmetry in physics.
Mainly, the point is to deflect naive reasoning in problems involving predictions or similar “time-defying” situations. The classic example is newcomblike problems, specifically Newcomb’s Problem. In these situations, acting as if your current decision were a partial cause of the past prediction, and thus of whether or not Omega/The Predictor put a reward in a box, leads to better subjective chances of finding a reward in said box. The “timeless” aspect here is that one phenomenon (the decision you make) almost looks like a cause of another (the prediction of your decision) that happened “in the past”.
In fact, however, they have a common prior cause: the state of the universe and, particularly, of the brain / processor / information of the entity making the decision, prior to the prediction. Treating it as, and calling it, “timeless” helps keep this from turning into a debate about free will and determinism.
In newcomblike problems, an event B happens where Omega predicts whether A1 or A2 will happen, based on whether C1 or C2 is true (two possible states of the brain of the player, or outcomes of a simulation). Then either A1 or A2 happens, based on whether C1 or C2 is true, as predicted by Omega. Since the player doesn’t have the same means as Omega of knowing C or B, he must decide as if A caused C, which caused B, which could be roughly described as a decision causing the result of a prediction of that decision in the past.
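To make that concrete, here’s a toy expected-value calculation for Newcomb’s Problem, written from the point of view that treats the decision as if it determined the prediction. The 99% accuracy figure and the usual $1,000,000 / $1,000 payoffs are just illustrative assumptions:

# Toy Newcomb calculation: treat the predictor's accuracy as the chance
# that the opaque box matches your actual choice, then compare payoffs.
accuracy = 0.99                 # assumed predictor accuracy (illustrative)
big, small = 1_000_000, 1_000   # usual contents of the opaque and transparent boxes

ev_one_box = accuracy * big                # opaque box is full iff one-boxing was predicted
ev_two_box = small + (1 - accuracy) * big  # opaque box is full only if the predictor erred

print(ev_one_box)  # about 990,000
print(ev_two_box)  # about 11,000: "deciding as if A caused C caused B" favors one-boxing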
So, back to the timeless vs. sunk costs “contradiction”: in a sunk-costs situation, there is no Omega, there is no C, there is no prediction (B). At the moment of decision, the state of the game in the abstract is something more like: “Decision A caused Resource B to go from 5 to 3; 1 B can be paid to obtain 2 utilons by making decision C1; 2 B can be paid to obtain 5 utilons by making decision C2.” There are no predictions or fancy delusions of affecting the events that caused the current state. A caused B(5->3) caused (NOW) caused C. C has no causal effect on (NOW), which has no causal effect on B, which has no causal effect on A. No amount of removing the timestamps and pretending that your future decision will change how it was predicted is going to change the (NOW) state.
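And a tiny sketch of the arithmetic in that toy state, just to underline that only the forward-looking part enters the decision (the option names and numbers are the ones above; I’m assuming leftover B has no further use):

# Evaluate options from (NOW) only; the 2 B already spent never appears.
resources_now = 3                       # B went from 5 to 3; the spent 2 B is sunk
options = {"C1": (1, 2), "C2": (2, 5)}  # option -> (cost in B, utilons gained)

def value(option):
    cost, utilons = options[option]
    return utilons if cost <= resources_now else float("-inf")  # unaffordable options are out

best = max(options, key=value)
print(best)  # "C2": 5 utilons for 2 B beats 2 utilons for 1 B, regardless of what was already spent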
I could go on at length and depth, but let’s see how much of this makes sense first (i.e. how much you understand and/or where I mis-explained).
I’d be happy to answer questions about decision theory.
Hi, can you explain EDT to me (by email)? :)
As far as I can reconstruct EDT’s algorithm, it goes something like this (with a toy numerical version after the list):
1) I know that smoking is correlated with lung cancer.
2) I’ve read in a medical journal that smoking and lung cancer have a common cause, some kind of genetic lesion. I don’t know if I have that lesion.
3) I’d like to smoke now, but I’m not sure if that’s the best decision.
4) My friend, a causal decision theorist, told me that smoking or not smoking cannot affect the lesion that I already have or don’t. But I don’t completely buy that reasoning. I prefer to use something else, which I will call “evidential decision theory”.
5) To figure out the best action to take, first I will counterfactually imagine myself as an automaton whose actions are chosen randomly, taking into account the lesion that I have or don’t, using the frequencies observed in the world. So an automaton with the lesion will have a higher probability of smoking and a higher probability of cancer.
6) Next, I will figure out what the automaton’s actions say about its utility, using ordinary conditional probabilities and expected values. It looks like the utility of automatons that smoke is lower than the utility of those that don’t, because the former ones are more likely to get cancer.
7) Now I will remember that I’m not an automaton, and choose to avoid smoking based on the above reasoning!
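To make steps 5–7 concrete, here’s the toy numerical version mentioned above, with made-up frequencies and utilities (a 50% lesion rate, smoking itself harmless, and so on; none of the numbers are canonical):

# Toy "naive EDT" calculation for the smoking lesion problem.
# Only the lesion causes cancer here; smoking correlates with cancer via the lesion.
p_lesion = 0.5
p_smoke_given = {True: 0.8, False: 0.2}    # P(smoke | lesion present / absent)
p_cancer_given = {True: 0.6, False: 0.05}  # P(cancer | lesion present / absent)
u_smoke, u_cancer = 10, -100               # assumed utilities of smoking and of cancer

def expected_utility(smoke):
    # Weight each lesion state by how likely an automaton acting this way is to be in it (step 6).
    joint = {lesion: (p_lesion if lesion else 1 - p_lesion)
                     * (p_smoke_given[lesion] if smoke else 1 - p_smoke_given[lesion])
             for lesion in (True, False)}
    p_lesion_given_action = joint[True] / sum(joint.values())
    p_cancer = (p_lesion_given_action * p_cancer_given[True]
                + (1 - p_lesion_given_action) * p_cancer_given[False])
    return (u_smoke if smoke else 0) + p_cancer * u_cancer

print(expected_utility(True))   # about -39: smoking is evidence of the lesion...
print(expected_utility(False))  # about -16: ...so naive EDT refuses to smoke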
Does that make sense?
The problem with this line of reasoning is that the desire to smoke is correlated with smoking, and therefore with the genetic lesion. Since an EDT agent is assumed to perform Bayesian updates, it should update its probability of having the lesion upon observing that it has a desire to smoke.
How much it should update depends on its prior.
If, according to its prior, the desire to smoke largely screens off the correlation between the lesion and smoking, then the agent will choose to smoke.
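To tie this back to the toy numbers in the sketch above: if the lesion influences the decision only through the desire to smoke, then once you condition on having noticed the desire, the act of smoking itself is no further evidence about the lesion, and the evidential calculation flips. A sketch with assumed numbers:

# Toy tickle defense: the lesion works only through the desire to smoke, and the
# desire has already been observed, so it screens off the lesion from the decision.
p_lesion_given_desire = 0.8                # assumed: the desire alone already tells you this much
p_cancer_given = {True: 0.6, False: 0.05}  # same made-up numbers as before
u_smoke, u_cancer = 10, -100

# Given the desire, P(lesion) is the same whether or not you then smoke,
# so the cancer term is identical for both actions and only the +10 differs.
p_cancer = (p_lesion_given_desire * p_cancer_given[True]
            + (1 - p_lesion_given_desire) * p_cancer_given[False])
print(u_smoke + p_cancer * u_cancer)  # smoke: about -39
print(0 + p_cancer * u_cancer)        # don't smoke: about -49, so EDT-with-tickle smokes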
Sorry, are you saying that EDT is wrong, or that my explanation of EDT is wrong? If it’s the former, I agree. If it’s the latter, can you give a different explanation? Note that most of the literature agrees that EDT doesn’t smoke in the smoking lesion problem, so any alternative explanation should probably give the same result.
The latter. The objection that I described is known as the “tickle defense” of EDT.
Keep in mind that EDT is defined formally, and informal scenarios typically have implicit assumptions of probabilistic conditional independence which affect the result.
By making these assumptions explicit, it is possible to have EDT smoke or not smoke in the smoking lesion problem, and two-box or one-box in Newcomb’s problem.
In fact, the smoking lesion problem and Newcomb’s problem are two instances of the same type of decision problem, but their presentations may yield different implicit assumptions: in the smoking lesion problem virtually everybody makes assumptions such that smoking is intuitively the optimal choice, while in Newcomb’s problem there is no consensus about the optimal choice.
OK, thanks. Though if that’s indeed the “proper” version of EDT, then I no longer understand the conflict between EDT and CDT. Do you know any problem where EDT+tickle disagrees with CDT?
CDT essentially always chooses two-box/smoke in Newcomb-like problems; in EDT, the choice depends on the specific formalization of the problem.
Thanks, this mostly agrees with my understanding of “naive EDT.” Are you aware of serious efforts to steelman EDT against confounding issues? Smoking lesion is the simplest example, but there are many more complicated ones.
I haven’t seen any good attempts. If someone else was asking, I’d refer them to you, but since it’s you who’s asking, I’ll just say that I don’t know :-)
I have heard a claim that UDT is a kind of “sane precomputed EDT” (?). Why are “you” (they?) basing UDT on EDT? Is this because you are using the level of abstraction where causality somehow goes away, like it goes away if you look at the universal wave function (???). Maybe I just don’t understand UDT? Can you explain UDT? :)
I am trying very, very hard to be charitable to the EDT camp, because I am sure there are very smart people in that camp (Savage? Although I think he was aware of confounding issues and tried to rule them out before licensing an action. The trouble is you cannot do it with just conditional independence; that way lie dragons). This is why I keep asking about EDT.
I’ll try to explain UDT by dividing it into “simple UDT” and “general UDT”. These are some terms I just came up with, and I’ll link to my own posts as examples, so please don’t take my comment as some kind of official position.
“Simple UDT” assumes that you have a set of possible histories of a decision problem, and you know the locations of all instances of yourself within these histories. It’s basically a reformulation of a certain kind of single-player games that are already well known in game theory literature. For more details, see this post. If you try to work through the problems listed in that post, there’s a good chance that the very first one (Absent-Minded Driver) will give you a feeling of how “simple UDT” works. I think it’s the complete and correct solution to the kind of problems where it’s applicable, and doesn’t need much more research.
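For the Absent-Minded Driver in particular, “play the best strategy in the single-player game” just means picking the randomized strategy with the best expected payoff at the planning stage. A minimal sketch, using the usual payoffs from the literature (0 for exiting at the first intersection, 4 for exiting at the second, 1 for driving past both):

# Absent-Minded Driver: choose the probability p of continuing at an intersection.
# "Simple UDT" only ever evaluates whole strategies, never "where am I now?" probabilities.
def expected_payoff(p):
    exit_first = (1 - p) * 0       # exit at the first intersection
    exit_second = p * (1 - p) * 4  # continue once, then exit
    drive_past = p * p * 1         # continue both times
    return exit_first + exit_second + drive_past

best_p = max((i / 1000 for i in range(1001)), key=expected_payoff)
print(best_p, expected_payoff(best_p))  # about p = 2/3, expected payoff 4/3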
“General UDT” assumes that the decision problem is given to you in some form that doesn’t explicitly point out all instances of yourself, e.g. an initial state of a huge cellular automaton, or a huge computer program that computes a universe, or even a prior over all possible universes. The idea is to reduce the problem to “simple UDT” by searching for instances of yourself within the decision problem, using various mathematical techniques. See this post and this post for examples. Unlike “simple UDT”, “general UDT” has many unsolved problems. Most of these problems deal with logical uncertainty and bounded reasoning, like the problem described in this post.
Does that help?
ETA: I notice that the description of “simple UDT” is pretty underwhelming. If you simplify it to “we should model the entire decision problem as a single-player game and play the best strategy in that game”, you might say it’s trivial and wonder what all the fuss is about. Maybe it’s easier to understand by comparing it to other approaches. If you ask someone who doesn’t know UDT to solve Absent-Minded Driver or Psy-Kosh’s problem, they might get confused by things like “my subjective probability of being at such-and-such node”, which are part of standard Bayesian rationality (Savage’s theorem) but excluded from “simple UDT” by design. Or if you give them Counterfactual Mugging, they might get confused by Bayesian updating, which is also excluded from UDT by design.
Thinking about this.
It seems to me that talking about EDT, causality and universal wavefunctions is overcomplicating things a little. Let me just describe a problem that could motivate the creation of UDT, and you tell me if it makes sense to you.
Consider cellular automata. There’s no general concept of causality for CA because some of them are reversible and can be computed in either direction. But you can still build a computer inside a CA and write a program for it. The program will output instructions for some robot arms inside the CA to optimize some utility function on the CA’s states. Let’s also assume that the initial state of the CA can contain multiple computers running the program, with different architectures etc. A complete description of the initial state will be given to the program at startup, so there’s no uncertainty anywhere in the setup.
Now the question is, what’s the most general way to write such programs, for different cellular automata and utility functions? It seems to me that if you try to answer that question, you’ll first stumble on the idea of giving the program a quined description of itself, so it can find instances of itself inside the CA. Then you’ll get the idea of using something like “logical consequences” of different possible outputs, because physical consequences aren’t available. Then you’ll notice that provability in a formal theory is one possible way to formalize “logical consequences”, though it has many problems. And eventually you’ll come up with a version of UDT which might look something like this, or possibly this if you’re more concerned with provable optimality than computability.
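For what it’s worth, here’s a deliberately crude sketch of the brute-force version of that idea, for a world program small enough to simulate exhaustively. Everything in it (the function names, the toy world, the use of simulation in place of proof search) is made up for illustration; the real proposals linked above use provability or bounded logical reasoning instead:

# Brute-force "general UDT" sketch: the agent is handed the world program (which embeds
# the agent's own decision via quining) and picks the output whose simulated consequence
# has the highest utility. Simulation stands in for "logical consequences" here.
def udt_decision(world, utility, possible_outputs):
    return max(possible_outputs, key=lambda output: utility(world(agent_output=output)))

# Hypothetical toy world: a Newcomb-like setup where the "prediction" is computed by
# forcing the same output, so the agent's choice and the prediction necessarily match.
def toy_world(agent_output):
    prediction = agent_output  # the predictor runs the same code the agent does
    opaque_box = 1_000_000 if prediction == "one-box" else 0
    transparent_box = 1_000
    return opaque_box if agent_output == "one-box" else opaque_box + transparent_box

print(udt_decision(toy_world, lambda payoff: payoff, ["one-box", "two-box"]))  # one-box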
What are the best arguments for/against some of MIRI’s core positions?
We should worry about tool AI.
-Tool AI and oracle AI are different. Oracles are agents in a box. Tools are not agents, so they can’t take actions in the world or optimize an unfriendly utility function any more than Google Maps optimizes a utility function. Why not just tell the AI to figure out physics/math/CS?
If it is an agent, why doesn’t intelligence imply benevolence?
-Emotions (like happiness/sadness) are vague concepts in the same way that objects are fuzzy concepts (think of invariant representations of faces). So, if an agent is intelligent enough to recognize fuzzy objects, shouldn’t it also accurately recognize fuzzy emotions (and realize when it’s doing something stupid, like making ‘happy’ paperclips)?
Your first point was discussed in detail here. Your second point was discussed in many places on LW, most recently here, I think.
Thanks! I’d already read the first link, and remember thinking that it needed to be argued better. Mainly, I still think people conflate tools with agents in a box. It seems obvious (in principle) that you could build an AI that doesn’t do anything but Math/CS/Physics, and doesn’t even know humans exist.
I’m planning on writing up my disagreements more formally. But first, I’m waiting on getting a copy of Nick Bostrom’s new book, so that I can be certain that I’m critiquing the strongest arguments.
I hadn’t seen the second link. I’ll definitely have to give it a more thorough read-through later.