I think it depends on how much you’re willing to ask counterfactuals to do.
In the paper Anthropic Decision Theory for Self-Locating Agents, Stuart Armstrong says “ADT is nothing but the anthropic version of the far more general Updateless Decision Theory and Functional Decision Theory”—suggesting that he agrees with the idea that a proposed solution to counterfactual reasoning gives a proposed solution to anthropic reasoning. The overall approach of that paper is to side-step the issue of assigning anthropic probabilities, instead addressing the question of how to make decisions in cases where anthropic questions arise. I suppose this might be said either to “solve anthropics” or to “side-step anthropics”, and this choice would determine whether one took Stuart’s view to answer “yes” or “no” to your question.
Stuart mentions in that paper that agents making decisions via CDT+SIA tend to behave the same as agents making decisions via EDT+SSA. This can be seen formally in Jessica Taylor’s post about CDT+SIA in memoryless cartesian environments, and Caspar Oesterheld’s comment about the parallel for EDT+SSA. The post discusses the close connection to pure UDT (with no special anthropic reasoning). Specifically, CDT+SIA (and EDT+SSA) are consistent with the optimality notion of UDT, but don’t imply it (UDT may do better, according to its own notion of optimality). This is because UDT (specifically, UDT 1.1) looks for the best solution globally, whereas CDT+SIA can have self-coordination problems (like hunting rabbit in a game of stag hunt with identical copies of itself).
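To make the self-coordination problem concrete, here is a minimal sketch (a toy construction of mine, not something from Stuart’s paper or Jessica’s post): it checks which joint policies in a two-copy stag hunt are stable under unilateral, CDT+SIA-style deviation checks, versus which one a global, UDT 1.1-style optimization over joint policies would pick.

```python
# Toy stag hunt between two identical copies of an agent.
# Payoffs: stag pays 4 only if both hunt stag, otherwise 0; rabbit always pays 3.
from itertools import product

ACTIONS = ["stag", "rabbit"]

def payoff(me, other):
    if me == "stag":
        return 4 if other == "stag" else 0
    return 3

def total_payoff(joint):
    # Global (UDT 1.1-style) objective: sum of payoffs over both copies.
    a, b = joint
    return payoff(a, b) + payoff(b, a)

def locally_stable(joint):
    # Local (CDT+SIA-style) check: no copy gains by deviating unilaterally,
    # holding the other copy's action fixed.
    a, b = joint
    return (payoff(a, b) >= max(payoff(x, b) for x in ACTIONS)
            and payoff(b, a) >= max(payoff(x, a) for x in ACTIONS))

for joint in product(ACTIONS, repeat=2):
    print(joint, "total:", total_payoff(joint), "locally stable:", locally_stable(joint))
```

Both (stag, stag) and (rabbit, rabbit) pass the local stability check, but only (stag, stag) maximizes the total; that is the sense in which the local perspective can end up hunting rabbit while the global perspective does not.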
You could see this as giving a relationship between two different notions of counterfactual, with anthropic reasoning mediating the connection.
CDT and EDT are two different ways of reasoning about the consequences of actions. Both of them are “updateful”: they make use of all information available in estimating the consequences of actions. We can also think of them as “local”: they make decisions from the situated perspective of an information state, whereas UDT makes decisions from a “global” perspective considering all possible information states.
I would claim that global counterfactuals have an easier job than local ones, if we buy the connection between the two suggested here. Consider the transparent Newcomb problem: you’re offered a very large pile of money if and only if you’re the sort of agent who takes most, but not all, of the pile. It is easy to say from an updateless (global) perspective that you should be the sort of agent who takes most of the money. It is more difficult to face the large pile (an updateful/local perspective) and reason that it is best to take most-but-not-all; your counterfactuals have to say that taking all the money doesn’t mean you get all the money. The idea is that you have to be skeptical of whether you’re in a simulation; ie, your counterfactuals have to do anthropic reasoning.
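To spell out the transparent Newcomb point with a toy calculation (the numbers, and the assumption that the prediction comes from one simulated run of you seeing a full pile, are mine, purely for illustration):

```python
# The real agent faces a full pile only if the simulated agent, who also
# sees a full pile, takes most-but-not-all.
PILE = 1_000_000
LEFT_BEHIND = 1_000  # what "take most" leaves on the table

def real_world_payoff(policy):
    # Money the real agent actually ends up with, given the policy this
    # computation follows whenever it sees a full pile (the simulation runs
    # the same policy).
    if policy == "take_most":
        return PILE - LEFT_BEHIND  # prediction favorable, real pile is full
    return 0                        # prediction unfavorable, real pile is empty

def face_value(policy):
    # What a local agent computes if it takes the visible pile at face value
    # and ignores the possibility that it is the simulated run.
    return PILE if policy == "take_all" else PILE - LEFT_BEHIND

for policy in ["take_most", "take_all"]:
    print(policy, "real:", real_world_payoff(policy), "face value:", face_value(policy))
```

Taking all looks better at face value (1,000,000 vs 999,000) but actually yields nothing; the local agent can only reach the right answer by giving weight to the possibility that the full pile in front of it exists only inside the prediction.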
In other words: you could factor the whole problem of logical decision theory in two different ways.
Option 1:
Find a good logically updateless perspective, providing the ‘global’ view from which we can make decisions.
Find a notion of logical counterfactual which combines with the above to yield decisions.
Option 2:
Find an updateful but skeptical perspective, which takes (logical) observations into account, but also accounts for the possibility that it is in a simulation and being fooled about those observations.
Find a notion of counterfactual which works with the above to make good decisions.
Also, somehow solve the coordination problems (which otherwise make option 1 look superior).
With option 1, you side-step anthropic reasoning. With option 2, you have to tackle it explicitly. So, you could say that in option 1, you solve anthropic reasoning for free if you solve counterfactual reasoning; in option 2, it’s quite the opposite: you might solve counterfactual reasoning by solving anthropic reasoning.
I’m more optimistic about option 2, recently. I used to think that maybe we could settle for the most basic possible notion of logical counterfactual, ie, evidential conditionals, if combined with logical updatelessness. However, a good logically updateless perspective has proved quite elusive so far.
Anyway, this answers one of my key questions: whether it is worth working on anthropics or not. I put some time into reading about it (hopefully I get time to pick up Bostrom’s book again at some point), but I got discouraged when I started wondering if the work on logical counterfactuals would make this all irrelevant. Thanks for clarifying this. Anyway, why do you think the second approach is more promising?
“The idea is that you have to be skeptical of whether you’re in a simulation”—I’m not a big fan of that framing, though I suppose it’s okay if you’re clear that it is an analogy. Firstly, I think it is cleaner to separate issues about whether simulations have consciousness or not from questions of decision theory, given that functionalism is quite a controversial philosophical assumption (even though it might be taken for granted at MIRI). Secondly, it seems as though you might be able to perfectly predict someone from high-level properties without simulating them sufficiently to instantiate a consciousness. Thirdly, there isn’t necessarily a one-to-one relationship between “real” world runs and simulations. We only need to simulate an agent once in order to predict the result of any number of identical runs. So if the predictor only ever makes one prediction, but there are a million clones all playing Newcomb’s, the chance that someone in that subjective situation is the single simulation is vanishingly small.
So insofar as we talk about “you could be in a simulation”, I’d prefer to see this as a pretense, a trick, or an analogy.
(Definitely not reporting MIRI consensus here, just my own views:) I find it appealing to collapse the analogy and consider the DT considerations to be really touching on the anthropic considerations. It isn’t just functionalism with respect to questions about other brains (such as their consciousness); it’s also what one might call cognitive functionalism—ie functionalism with respect to the map as opposed to the territory (my mind considering questions such as consciousness). What I mean is: if the decision-theoretic questions were isomorphic to the anthropic questions, serving the same sort of role in decision-making, then, if I were to construct a mind thinking about one or the other and ask it what it is thinking, there wouldn’t be any questions which would differentiate anthropic reasoning from the analogous decision-theoretic reasoning. This would seem like a quite strong argument in favor of discarding the distinction.
I’m not saying that’s the situation (we would need to agree, individually, on separate settled solutions to both anthropics and decision theory in order to compare them side by side in that way). I’m saying that things seem to point in that direction.
It seems rather analogous to thinking that logic and mathematics are distinct (logical knowledge encoding tautology only, mathematical knowledge encoding a priori analytic knowledge, which one could consider distinct… I’m just throwing out philosophy words here to try to bolster the plausibility of this hypothetical view) -- and then discovering that within the realm of what you considered to be pure logic, there’s a structure which is isomorphic to the natural numbers, with all reasoning about the natural numbers being explainable as purely logical reasoning. It would be possible to insist on maintaining the distinction between the mathematical numbers and the logical structure which is analogous to them, referring to the first as analytic a priori knowledge and the second as tautology. However, one naturally begins to question the mathematical/logical distinction which was previously made. Was the notion of “logical” too broad? (Is higher-order logic really a part of mathematics?) Is number theory really a part of logic, rather than mathematics proper? What other branches of mathematics can be seen as structures within logic? Perhaps all mathematics is tautology, as Wittgenstein had it?
This position certainly has some counterintuitive consequences, which should be controversial. From a decision-theoretic perspective, it is practical to regard any means of predicting you which has equivalent predictive power as equally “an instance of you”, and hence equally conscious: a physics-style attempt to simulate you, or a logic-style attempt to reason about what you would do.
As for the question of simulating a person once to predict them a hundred times, the math all works out nicely if you look at Jessica Taylor’s post on the memoryless cartesian setting. A subjectively small chance of being the one simulation when a million meat copies are playing Newcomb’s will suffice for decision-theoretic purposes. How exactly everything works out depends on the details of formalizing the problem in the memoryless cartesian setting, but the theorem guarantees that everything balances. (I find this fact surprising and somewhat counterintuitive, but working out some examples myself helped me.)
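Here is the kind of example I mean, as a back-of-the-envelope sketch rather than the formal memoryless cartesian setup (the specific payoffs, and the assumption that utility is total money across the real copies, are mine): one simulation determines the box contents for a million real copies, and SIA puts only a one-in-a-million-and-one weight on being the simulation.

```python
N = 1_000_000                 # real copies playing Newcomb's problem
BIG, SMALL = 1_000_000, 1_000

def copy_payout(prediction, action):
    # One real copy's winnings: the opaque box holds BIG iff the simulation
    # was predicted to one-box; two-boxing always adds SMALL.
    box = BIG if prediction == "one_box" else 0
    return box + (SMALL if action == "two_box" else 0)

def gain_from_deviating(policy, deviation):
    # CDT+SIA: expected causal gain of deviating at my own location only,
    # averaging over which of the N+1 subjectively identical locations I am.
    p_sim = 1 / (N + 1)
    # As the simulation, my action changes the prediction for all N real
    # copies, who still follow the original policy.
    gain_if_sim = N * (copy_payout(deviation, policy) - copy_payout(policy, policy))
    # As a real copy, the prediction is already fixed; only my action changes.
    gain_if_real = copy_payout(policy, deviation) - copy_payout(policy, policy)
    return p_sim * gain_if_sim + (1 - p_sim) * gain_if_real

print("deviate from one-boxing:", round(gain_from_deviating("one_box", "two_box")))
print("deviate from two-boxing:", round(gain_from_deviating("two_box", "one_box")))
```

Deviating from “everyone one-boxes” comes out strongly negative and deviating from “everyone two-boxes” strongly positive, so one-boxing is the stable policy even though the subjective chance of being the simulation is only about one in a million; that tiny anthropic weight is doing all the work, which is the balancing I found counterintuitive.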
I’m confused. This comment is saying that there isn’t a strict divide between decision theory and anthropics, but I don’t see how that has any relevance to the point that I raised in the comment it is responding to (that a perfect predictor need not utilise a simulation that is conscious in any sense of the word).
Maybe I’m confused about the relevance of your original comment to my answer. I interpreted
Firstly, I think it is cleaner to separate issues about whether simulations have consciousness or not from questions of decision theory
as being about the relationship I outline between anthropics and decision theory—ie, anthropic reasoning may want to take consciousness into account (while you might think it plausible you’re in a physics simulation, it is plausible to hold that you can’t be living in a more general type of model, one which predicts you by reasoning about you rather than simulating you, if that model is not detailed enough to support consciousness), whereas decision theory only takes logical control into account (so the relevant question is not whether a model is detailed enough to be conscious, but rather, whether it is detailed enough to create a logical dependence on your behavior).
I took
“The idea is that you have to be skeptical of whether you’re in a simulation”—I’m not a big fan of that framing,
as an objection to the connection I drew between thinking you might be in a simulation (ie, the anthropic question) and decision theory. Maybe you were objecting to the connection between thinking you’re in a simulation and anthropics? If so, the claimed connection to decision theory is still relevant. If you buy the decision theory connection, it seems hard not to buy the connection between thinking you’re in a simulation and anthropics.
I took
it seems as though you might be able to perfectly predict someone from high-level properties without simulating them sufficiently to instantiate a consciousness
to be an attempt to drive a wedge between anthropics and decision theory, by saying that a prediction might introduce a logical correlation without introducing an anthropic question of whether you might be ‘living in’ the prediction. To which my response was, I may want to bite the bullet on that one for the elegance of treating anthropic questions as decision-theoretic in nature.
I took
there isn’t necessarily a one-to-one relationship between “real” world runs and simulations. We only need to simulate an agent once in order to predict the result of any number of identical runs.
to be an attempt to drive a wedge between decision-theoretic and anthropic cases by pointing out that we would need to assign a small anthropic probability to being the simulation if it is only run once, to which I responded by saying that the math will work out the same on the decision theory side according to Jessica Taylor’s theorem.
My interpretation now is that you were never objecting to the connection between decision theory and anthropics, but rather to the way I was talking about anthropics. If so, my response is that the way I was talking about anthropics is essentially forced by the connection to decision theory.
For the first point, I meant that in order to consider this purely as a decision theory problem without creating a dependency on a particular theory of consciousness, you would ideally want a general theory that can deal with any criterion of consciousness (including just being handed a list of entities that count as conscious).
Regarding the second, when you update your decision algorithm, you have to update everything subjunctively dependent on you regardless of whether they are agents or not, but that is distinct from “you could be that object”.
On the third, biting the bullet isn’t necessary to turn this into a decision theory problem as I mention in my response to the first point. But further, elegance alone doesn’t seem to be a good reason to accept a theory. I feel I might be misunderstanding your reasoning for biting this bullet.
I haven’t had time to read Jessica Taylor’s theorem yet, so I have no comment on the fourth.
Second point: the motto is something like “anything which is dependent on you must have you inside its computation”. Something which depends on you because it is causally downstream of you contains you in its computation in the sense that you have to be calculated in the course of calculating it because you’re in its past. The claim is that this observation generalizes.
This seems like a motte and bailey. There’s a weak sense of “must have you inside its computation” which you’ve defined here and a strong sense as in “should be treated as containing a consciousness”.
Well, in any case, the claim I’m raising for consideration is that these two may turn out to be the same. The argument for the claim is the simplicity of merging the decision theory phenomenon with the anthropic phenomenon.
I note that I’m still overall confused about what the miscommunication was. Your response now seems to fit my earlier interpretation.
First point: I disagree about how to consider things as pure decision theory problems. Taking as input a list of conscious entities seems like a rather large point against a decision theory, since it makes it dependent on a theory of consciousness. If you want to be independent of questions like that, far better to consider decision theory on its own (thinking only in terms of logical control, counterfactuals, etc), and remain agnostic on the question of a connection between anthropics and decision theory.
In my analogy to mathematics, it could be that there’s a lot of philosophical baggage on the logic side and also a lot of philosophical baggage on the mathematical side. Claiming that all of math is tautology could create a lot of friction between these two sets of baggage, meaning one has to bite a lot of bullets which other people wouldn’t consider biting. This can be a good thing: you’re allowing more evidence to flow, pinning down your views on both sides more strongly. In addition to simplicity, that’s related to a theory doing more to stick its neck out, making bolder predictions. To me, when this sort of thing happens, the objections to adopting the simpler view have to be actually quite strong.
I suppose that addresses your third point to an extent. I could probably give some reasons besides simplicity, but it seems to me that simplicity is a major consideration here, perhaps my true reason. I suspect we don’t actually disagree that much about whether simplicity should be a major consideration (unless you disagree about the weight of Occam’s razor, which would surprise me). I suspect we disagree about the cost of biting this particular bullet.
decision theory only takes logical control into account (so the relevant question is not whether a model is detailed enough to be conscious, but rather, whether it is detailed enough to create a logical dependence on your behavior).
Which I interpreted to be you talking about avoiding the issue of consciousness by acting as though any process logically dependent on you automatically “could be you” for the purpose of anthropics. I’ll call this the Reductive Approach.
However, when I said:
Firstly, I think it is cleaner to separate issues about whether simulations have consciousness or not from questions of decision theory
I was thinking about separating these issues, not by using the Reductive Approach, but by using what I’ll call the Abstracting Approach. In this approach, you construct a theory of anthropics that is just handed a criterion of which beings are conscious, and it is expected to be able to handle any such criterion.
Taking as input a list of conscious entities seems like a rather large point against a decision theory, since it makes it dependent on a theory of consciousness
Part of the confusion here is that we are using the word “depends” in different ways. When I said that the Abstracting Approach avoided creating a dependency on a theory of consciousness, I meant that if you follow this approach, you end up with a decision theory which can have any theory of consciousness just substituted in. It doesn’t depend on these theories, as if you discover your theory of consciousness is wrong, you just throw in a new one and everything works.
When you talk about “depends” and say that this is a disadvantage, you mean that in order to obtain a complete theory of anthropics, you need to select a theory of consciousness to be combined with your decision theory. I think that this is actually unfair, because in the Reductive Approach, you do implicitly select a theory of consciousness, which I’ll call Naive Functionalism. I’m not using this name to be pejorative; it’s the best descriptor I can think of for the version of functionalism which you are using, which ignores any concerns that high-level predictors might not deserve to be labelled as a consciousness.
With the Abstracting Approach I still maintain the option of assuming Naive Functionalism, in which case it collapses down to the Reductive Approach. So given these assumptions, both approaches end up being equally simple. In contrast, given any other theory of consciousness, the Reductive Approach complains that you are outside its assumptions, while the Abstracting Approach works just fine. The mistake here was attempting to compare the simplicity of two different theories directly without adjusting for them having different scopes.
“I suspect we don’t actually disagree that much about whether simplicity should be a major consideration”—I’m not objecting to simplicity as a consideration. My argument is that Occam’s razor is about accepting the simplest theory that is consistent with the situation. In my mind it seems like you are allowing simplicity to let you ignore the fact that your theory is inconsistent with the situation, which is not how I believe Occam’s razor is supposed to work. So it’s not just about the cost, but about whether this is even a sensible way of reasoning.
I agree that we are using “depends” in different ways. I’ll try to avoid that language. I don’t think I was confusing the two different notions when I wrote my reply; I thought, and still think, that taking the abstraction approach wrt consciousness is in itself a serious point against a decision theory. I don’t think the abstraction approach is always bad—I think there’s something specific about consciousness which makes it a bad idea.
Actually, that’s too strong. I think taking the abstraction approach wrt consciousness is satisfactory if you’re not trying to solve the problem of logical counterfactuals or related issues. There’s something I find specifically worrying here.
I think part of it is, I can’t imagine what else would settle the question. Accepting the connection to decision theory lets me pin down what should count as an anthropic instance (to the extent that I can pin down counterfactuals). Without this connection, we seem to risk keeping the matter afloat forever.
Making a theory of counterfactuals take an arbitrary theory of consciousness as an argument seems to cement this free-floating idea of consciousness, as an arbitrary property which a lump of matter can freely have or not have. My intuition that decision theory has to take a stance here is connected to an intuition that a decision theory needs to depend on certain ‘sensible’ aspects of a situation, and is not allowed to depend on ‘absurd’ aspects. For example, the table being wood vs metal should be an inessential detail of the 5&10 problem.
This isn’t meant to be an argument, only an articulation of my position. Indeed, my notion of “essential” vs “inessential” details is overtly functionalist (eg, replacing carbon with silicon should not matter if the high-level picture of the situation is untouched).
Still, I think our disagreement is not so large. I agree with you that the question is far from obvious. I find my view on anthropics actually fairly plausible, but far from determined.
When you talk about “depends” and say that this is a disadvantage, you mean that in order to obtain a complete theory of anthropics, you need to select a theory of consciousness to be combined with your decision theory. I think that this is actually unfair, because in the Reductive Approach, you do implicitly select a theory of consciousness, which I’ll call Naive Functionalism. I’m not using this name to be pejorative; it’s the best descriptor I can think of for the version of functionalism which you are using, which ignores any concerns that high-level predictors might not deserve to be labelled as a consciousness.
“Naive” seems fine here; I’d agree that the position I’m describing is of a “the most naive view here turns out to be true” flavor (so long as we don’t think of “naive” as “man-on-the-street”/”folk wisdom”).
I don’t think it is unfair of me to select a theory of consciousness here while accusing you of requiring one. My whole point is that it is simpler to select the theory of consciousness which requires no extra ontology beyond what decision theory already needs for other reasons. It is less simple if we use some extra stuff in addition. It is true that I’ve also selected a theory of consciousness, but the way I’ve done so doesn’t incur an extra complexity penalty, whereas you might, if you end up going with something other than what I do.
My argument is that Occam’s razor is about accepting the simplest theory that is consistent with the situation. In my mind it seems like you are allowing simplicity to let you ignore the fact that your theory is inconsistent with the situation, which is not how I believe Occam’s razor is supposed to work. So it’s not just about the cost, but about whether this is even a sensible way of reasoning.
We agree that Occam’s razor is about accepting the simplest theory that is consistent with the situation. We disagree about whether the theory is inconsistent with the situation.
What is the claimed inconsistency? So far my perception of your argument has been that you insist we could make a distinction. When you described your abstraction approach, you said that we could well choose naive functionalism as our theory of consciousness.
Making a theory of counterfactuals take an arbitrary theory of consciousness as an argument seems to cement this free-floating idea of consciousness, as an arbitrary property which a lump of matter can freely have or not have
The argument that you’re making isn’t that the Abstraction Approach is wrong, it’s that by supporting other theories of consciousness, it increases the chance that people will mistakenly fail to choose Naive Functionalism. Wrong theories do tend to attract a certain number of people believing in them, but I would like to think that the best theory is likely to win out over time on Less Wrong.
And there’s a cost to this. If we remove the assumption of a particular theory of consciousness, then more people will be able to embrace the theories of anthropics that are produced. And partial agreement is generally better than none.
My whole point is that it is simpler to select the theory of consciousness which requires no extra ontology beyond what decision theory already needs for other reasons
This is an argument for Naive Functionalism vs other theories of consciousness. It isn’t an argument for the Abstracting Approach over the Reductive approach. The Abstracting Approach is more complicated, but it also seeks to do more. In order to fairly compare them, you have to compare both on the same domain. And given the assumption of Naive Functionalism, the Abstracting Approach reduces to the Reductive Approach.
What is the claimed inconsistency?
I provided reasons why I believe that Naive Functionalism is implausible in an earlier comment. I’ll admit that inconsistency is too strong of a word. My point is just that you need an independent reason to bite the bullet other than simplicity. Like simplicity combined with reasons why the bullets sound worse than they actually are.
When you described your abstraction approach, you said that we could well choose naive functionalism as our theory of consciousness.
Yes. It works with any theory of consciousness, even clearly absurd ones.
The argument that you’re making isn’t that the Abstraction Approach is wrong, it’s that by supporting other theories of consciousness, it increases the chance that people will mistakenly fail to choose Naive Functionalism. Wrong theories do tend to attract a certain number of people believing in them, but I would like to think that the best theory is likely to win out over time on Less Wrong.
(I note that I flagged this part as not being an argument, but rather an attempt to articulate a hazy intuition—I’m trying to engage with you less as an attempt to convince, more to explain how I see the situation.)
I don’t think that’s quite the argument I want to make. The problem isn’t that it gives people the option of making the wrong choice. The problem is that it introduces freedom in a suspicious place.
Here’s a programming analogy:
Both of us are thinking about how to write a decision theory library. We have a variety of confusions about this, such as what functionality a decision theory library actually needs to support, what interface it needs to present to other things, etc. Currently, we are having a disagreement about whether it should call an external library for ‘consciousness’ vs implement its own behavior. You are saying that we don’t want to commit to implementing consciousness a particular way, because we may find that we have to change that later. So, we need to write the library in a way such that we can easily swap consciousness libraries.
When I imagine trying to write the code, I don’t see how I’m going to call the ‘consciousness’ library while solving all the other problems I need to solve. It’s not that I want to write my own ‘consciousness’ functionality. It’s that I don’t think ‘consciousness’ is an abstraction that’s going to play well with the sort of things I need to do. So when I’m trying to resolve other confusions (about the interface, data types I will need, functionality which I may want to implement, etc) I don’t want to have to think about calling arbitrary consciousness libraries. I want to think about the data structures and manipulations which feel natural to the problem being solved. If this ends up generating some behaviors which look like a call to the ‘naive functionalism’ library, this makes me think the people who wrote that library maybe were on to something, but it doesn’t make me any more inclined to re-write my code in a way which can call ‘consciousness’ libraries.
If another programmer sketches a design for a decision theory library which can call a given consciousness library, I’m going to be a bit skeptical and ask for more detail about how it gets called and how the rest of the library is factored such that it isn’t just doing a bunch of work in two different ways or something like that.
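In code, the contrast I have in mind looks roughly like this (all names here are made up; it is only the analogy, not anyone’s actual design):

```python
from typing import Callable, List

class AbstractingDecisionTheory:
    # The "swappable consciousness library" design: which models count as
    # anthropic instances is supplied from outside as an arbitrary criterion.
    def __init__(self, is_conscious: Callable[[object], bool]):
        self.is_conscious = is_conscious

    def anthropic_instances(self, models: List[object]) -> List[object]:
        return [m for m in models if self.is_conscious(m)]

class ReductiveDecisionTheory:
    # The design I find natural: instance-hood is read off from machinery the
    # decision theory already needs (logical dependence on my policy), so
    # there is no separate consciousness module to call.
    def __init__(self, depends_on_my_policy: Callable[[object], bool]):
        self.depends_on_my_policy = depends_on_my_policy

    def anthropic_instances(self, models: List[object]) -> List[object]:
        return [m for m in models if self.depends_on_my_policy(m)]
```

Plugging “logical dependence on my policy” in as the consciousness criterion makes the first design collapse into the second, which is roughly where our disagreement sits: whether leaving that parameter open is a virtue or a suspicious degree of freedom.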
Actually, I’m confused about how we got here. It seems like you were objecting to the (reductive-as-opposed-to-merely-analogical version of the) connection I’m drawing between decision theory and anthropics. But then we started discussing the question of whether a (logical) decision theory should be agnostic about consciousness vs take a position. This seems to be a related but separate question; if you reject (or hold off on deciding) the connection between decision theory and anthropics, a decision theory may or may not have to take a position on consciousness for other reasons. It’s also not entirely clear that you have to take a particular position on consciousness if you buy the dt-anthropics connection. I’ve actually been ignoring the question of ‘consciousness’ in itself, and instead mentally substituting it with ‘anthropic instance-ness’. I’m not sure what I would want to say about consciousness proper; it’s a very complicated topic.
This is an argument for Naive Functionalism vs other theories of consciousness. It isn’t an argument for the Abstracting Approach over the Reductive approach. The Abstracting Approach is more complicated, but it also seeks to do more. In order to fairly compare them, you have to compare both on the same domain. And given the assumption of Naive Functionalism, the Abstracting Approach reduces to the Reductive Approach.
An argument in favor of naive functionalism makes applying the abstraction approach less appealing, since it suggests the abstraction is only opening the doors to worse theories. I might be missing something about what you’re saying here, but I think you are not only arguing that you can abstract without losing anything (because the agnosticism can later be resolved to naive functionalism), but that you strongly prefer to abstract in this case.
But, I agree that that’s not the primary disagreement between us. I’m fine with being agnostic about naive functionalism; I think of myself as agnostic, merely finding it appealing. Primarily I’m reacting to the abstraction approach, because I think it is better in this case for a theory of logical counterfactuals to take a stand on anthropics. The fact that I’m uncertain about naive functionalism is tied to the fact that I’m uncertain about counterfactuals; the structure of my uncertainty is such that I expect information about one to provide information about the other. You want to maintain agnosticism about consciousness, and as a result, you don’t want to tie those beliefs together in that way. From my perspective, it seems better to maintain that agnosticism (if desired) by remaining agnostic about the specific connection between anthropics and decision theory which I outlined, rather than by trying to do decision theory in a way which is agnostic about anthropics in general.
Both of us are thinking about how to write a decision theory library.
That makes your position a lot clearer. I admit that the Abstraction Approach makes things more complicated, and that this might affect what you can accomplish either theoretically or practically compared to using the Reductive Approach, so I could see some value in exploring this path. For Stuart Armstrong’s paper in particular, the Abstraction Approach wouldn’t really add much in the way of complications and it would make it much clearer what was going on. But maybe there are other things you are looking into where it wouldn’t be anywhere near this easy. In any case, I’d prefer people to use the Abstraction Approach in the cases where it is easy to do so.
An argument in favor of naive functionalism makes applying the abstraction approach less appealing
True, and I can imagine a level of likelihood below which adopting the Abstraction Approach would be adding needless complexity and mostly be a waste of time.
I think it is worth making a distinction between complexity in the practical sense and complexity in the hypothetical sense. In the practical sense, using the Abstraction Approach with Naive Functionalism is more complex than the Reductive Approach. In the hypothetical sense, they are equally complex in terms of explaining how anthropics works given Naive Functionalism, as we haven’t postulated anything additional within this particular domain (you may say that we’ve postulated consciousness, but within this assumption it’s just a renaming of a term, rather than the introduction of an extra entity). I believe that Occam’s razor should be concerned with the latter type of complexity, which is why I wouldn’t consider it a good argument for the Reductive Approach.
But that you strongly prefer to abstract in this case
I’m very negative on Naive Functionalism. I’ve still got some skepticism about functionalism itself (property dualism isn’t implausible in my mind), but if I had to choose between Functionalist theories, that certainly isn’t what I’d pick.
I’m very negative on Naive Functionalism. I’ve still got some skepticism about functionalism itself (property dualism isn’t implausible in my mind), but if I had to choose between Functionalist theories, that certainly isn’t what I’d pick.
I’m trying to think more about why I feel this outcome is a somewhat plausible one. The thing I’m generating is a feeling that this is ‘how these things go’—that the sign that you’re on the right track is when all the concepts start fitting together like legos.
I guess I also find it kind of curious that you aren’t more compelled by the argument I made early on, namely, that we should collapse apparently distinct notions if we can’t give any cognitive difference between them. I think I later rounded down this argument to Occam’s razor, but there’s a different point to be made: if we’re talking about the cognitive role played by something, rather than just the definition (as is the case in decision theory), and we can’t find a difference in cognitive role (even if we generally make a distinction when making definitions), it seems hard to sustain the distinction. Taking another example related to anthropics, it seems hard to sustain a distinction between ‘probability that I’m an instance’ and ‘degree I care about each instance’ (what’s been called a ‘caring measure’, I think), when all the calculations come out the same either way, even generating something which looks like a Bayesian update of the caring measure. Initially it seems like there’s a big difference, because it’s a question of modeling something as a belief or a value; but, unless some substantive difference in the actual computations presents itself, it seems the distinction isn’t real. A robot built to think with true anthropic uncertainty vs caring measures is literally running equivalent code either way; it’s effectively only a difference in code comment.
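As a tiny worked version of that last point (toy numbers and names, just to show the arithmetic): read the weights below either as “probability that I’m this instance” or as a caring measure over instances, and both the decision and the update come out identical.

```python
# Weights over subjectively indistinguishable instances: interpret as
# anthropic probabilities OR as a caring measure -- the arithmetic is the same.
weights = {"copy_A": 0.5, "copy_B": 0.3, "copy_C": 0.2}
sees_red = {"copy_A": True, "copy_B": True, "copy_C": False}

# Utility each action yields for each instance (arbitrary toy values).
utility = {
    "act_1": {"copy_A": 10, "copy_B": 0, "copy_C": 5},
    "act_2": {"copy_A": 4,  "copy_B": 6, "copy_C": 20},
}

def best_action(w):
    scores = {a: sum(w[i] * u[i] for i in w) for a, u in utility.items()}
    return max(scores, key=scores.get), scores

print(best_action(weights))  # decision before any observation

# After observing "red": a Bayesian update of the probabilities, or
# equivalently a renormalization of the caring measure over the instances
# consistent with the observation -- the same operation under either label.
consistent = {i: w for i, w in weights.items() if sees_red[i]}
total = sum(consistent.values())
updated = {i: w / total for i, w in consistent.items()}
print(best_action(updated))  # decision after the "update"
```

Nothing in the computation distinguishes the two readings; the difference really is only in the comments.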
“Namely, that we should collapse apparently distinct notions if we can’t give any cognitive difference between them”—I don’t necessarily agree that being subjunctively linked to you (such that it gives the same result) is the same as being cognitively identical, so this argument doesn’t get off the ground for me. If you adopt a functionalist theory, it seems quite plausible that the degree of complexity is important too (although perhaps you’d say that isn’t pure functionalism?)
It might be helpful to relate this to the argument I made in Logical Counterfactuals and the Cooperation Game. The point I make there is that which processes are subjunctively linked to you is more a matter of your state of knowledge than of anything about the intrinsic properties of the object itself. So if you adopt the position that things that are subjunctively linked to you are cognitively, and hence consciously, the same, you end up with a highly relativistic viewpoint.
I’m curious, how much do people at MIRI lean towards naive functionalism? I’m mainly asking because I’m trying to figure out whether there’s a need to write a post arguing against this.
I haven’t heard anyone else express the extremely naive view we’re talking about that I recall, and I probably have some specific decision-theory-related beliefs that make it particularly appealing to me, but I don’t think it’s out of the ballpark of other people’s views so to speak.
The point I make there is that which processes are subjunctively linked to you is more a matter of your state of knowledge than of anything about the intrinsic properties of the object itself.
I (probably) agree with this point, and it doesn’t seem like much of an argument against the whole position to me—coming from a Bayesian background, it makes sense to be subjectivist about a lot of things, and link them to your state of knowledge. I’m curious how you would complete the argument—OK, subjunctive statements are linked to subjective states of knowledge. Where does that speak against the naive functionalist position?
“OK, subjunctive statements are linked to subjective states of knowledge. Where does that speak against the naive functionalist position?”—Actually, what I said about relativism isn’t necessarily true. You could assert that any process that is subjunctively linked to what is generally accepted to be a consciousness from any possible reference frame is cognitively identical and hence experiences the same consciousness. But that would include a ridiculous number of things.
By telling you that a box will give the same output as you, we can subjunctively link it to you, even if it is only either a dumb box that immediately outputs true or a dumb box that immediately outputs false. Further, there is no reason why we can’t subjunctively link someone else facing a completely different situation to the same black box, since the box doesn’t actually need to receive the same input as you to be subjunctively linked (this idea is new, I didn’t actually realise that before). So the box would be having the experiences of two people at the same time. This feels like a worse bullet than the one you already want to bite.
The box itself isn’t necessarily thought of as possessing an instance of my consciousness. The bullet I want to bite is the weaker claim that anything subjunctively linked to me has me somewhere in its computation (including its past). In the same way that a transcript of a conversation I had contains me in its computation (I had to speak a word in order for it to end up in the text) but isn’t itself conscious, a box which very reliably has the same output as me must be related to me somehow.
I anticipate that your response is going to be “but what if it is only a little correlated with you?”, to which I would reply “how do we set up the situation?” and probably make a bunch of “you can’t reliably put me into that epistemic state” type objections. In other words, I don’t expect you to be able to make a situation where I both assent to the subjective subjunctive dependence and will want to deny that the box has me somewhere in its computation.
For example, the easiest way to make the correlation weak is for the predictor who tells me the box has the same output as me to be only moderately good. There are two possibilities. (1) I can already predict what the predictor will think I’ll do, which screens off its prediction from my action, so no subjective correlation; (2) I can’t predict confidently what the predictor will say, which means the predictor has information about my action which I lack; then, even if the predictor is poor, it must have a significant tie to me; for example, it might have observed me making similar decisions in the past. So there are copies of me behind the correlation.
“The bullet I want to bite is the weaker claim that anything subjunctively linked to me has me somewhere in its computation (including its past)”—That doesn’t describe this example. You are subjunctively linked to the dumb boxes, but they don’t have you in their past. The thing that has you in its past is the predictor.
I disagree, and I thought my objection was adequately explained. But I think my response will be more concrete/understandable/applicable if you first answer: how do you propose to reliably put an agent into the described situation?
The details of how you set up the scenario may be important to the analysis of the error in the agent’s reasoning. For example, if the agent just thinks the predictor is accurate for no reason, it could be that the agent just has a bad prior (the predictor doesn’t really reliably tell the truth about the agent’s actions being correlated with the box). To that case, I could respond that of course we can construct cases we intuitively disagree with by giving the agent a set of beliefs which we intuitively disagree with. (This is similar to my reason for rejecting the typical smoking lesion setup as a case against EDT! The beliefs given to the EDT agent in smoking lesion are inconsistent with the problem setup.)
I’m not suggesting that you were implying that, I’m just saying it to illustrate why it might be important for you to say more about the setup.
“How do you propose to reliably put an agent into the described situation?”—Why do we have to be able to reliably put an agent in that situation? Isn’t it enough that an agent may end up in that situation?
But in terms of how the agent can know the predictor is accurate, perhaps the agent gets to examine the predictor’s source code after it has run, and it’s implemented in hardware rather than software so that the agent knows it wasn’t modified?
But I don’t know why you’re asking so I don’t know if this answers the relevant difficulty.
“How do you propose to reliably put an agent into the described situation?”—Why do we have to be able to reliably put an agent in that situation? Isn’t it enough that an agent may end up in that situation?
For example, we can describe how to put an agent into the counterfactual mugging scenario as normally described (where Omega asks for $10 and gives nothing in return), but critically for our analysis, one can only reliably do so by creating a significant chance that the agent ends up in the other branch (where Omega gives the agent a large sum if and only if Omega would have received the asked-for $10 in the other branch). If this were not the case, the argument for giving the $10 would seem weaker.
But in terms of how the agent can know the predictor is accurate, perhaps the agent gets to examine its source code after it has run and its implemented in hardware rather than software so that the agent knows that it wasn’t modified?
I’m asking for more detail about how the predictor is constructed such that the predictor can accurately point out that the agent has the same output as the box. Similarly to how counterfactual mugging would be less compelling if we had to rely on the agent happening to have the stated subjunctive dependencies rather than being able to describe a scenario in which it seems very reasonable for the agent to have those subjunctive dependencies, your example would be less compelling if the box just happens to contain a slip of paper with our exact actions, and the predictor just happens to guess this correctly, and we just happen to trust the predictor correctly. Then I would agree that something has gone wrong, but all that has gone wrong is that the agent had a poor picture of the world (one which is subjunctively incorrect from our perspective, even though it made correct predictions).
On the other hand, if the predictor runs a simulation of us, and then purposefully chose a box whose output is identical to ours, then the situation seems perfectly sensible: “the box” that’s correlated with our output subjectively is a box which is chosen differently in cases where our output is different; and, the choice-of-box contains a copy of us. So the example works: there is a copy of us somewhere in the computation which correlates with us.
I’ve read it now. I think you could already have guessed that I agree with the ‘subjective’ point and disagree with the ‘meaningless to consider the case where you have full knowledge’ point.
“‘The box’ that’s correlated with our output subjectively is a box which is chosen differently in cases where our output is different; and, the choice-of-box contains a copy of us. So the example works”—that’s a good point, and if you examine the source code, you’ll know it was choosing between two boxes. Maybe we need an extra layer of indirection. There’s a Truth Tester who can verify that the Predictor is accurate by examining its source code, and you only get to examine the Truth Tester’s code, so you never end up seeing the code within the predictor that handles the case where the box doesn’t have the same output as you. As far as you are subjectively concerned, that doesn’t happen.
Ok, so you find yourself in this situation where the Truth Tester has verified that the Predictor is accurate, and you’ve verified that the Truth Tester is accurate, and the Predictor tells you that the direction you’re about to turn your head has a perfect correspondence to the orbit of some particular asteroid. Lacking the orbit information yourself, you now have a subjective link between your next action and the asteroid’s path.
This case does appear to present some difficulty for me.
I think this case isn’t actually so different from the previous case, because although you don’t know the source code of the Predictor, you might reasonably suspect that the Predictor picks out an asteroid after predicting you (or, selects the equation relating your head movement to the asteroid orbit after picking out the asteroid). We might suspect this precisely because it is implausible that the asteroid is actually mirroring our computation in a more significant sense. So using a Truth Teller intermediary increases the uncertainty of the situation, but increased uncertainty is compatible with the same resolution.
What your revision does do, though, is highlight how the counterfactual expectation has to differ from the evidential conditional. We may think “the Predictor would have selected a different asteroid (or different equation) if its computation of our action had turned out different”, but, we now know the asteroid (and the equation); so, our evidential expectation is clearly that the asteroid has a different orbit depending on our choice of action. Yet, it seems like the sensible counterfactual expectation given the situation is … hm.
Actually, now I don’t think it’s quite that the evidential and counterfactual expectation come apart. Since you don’t know what you actually do yet, there’s no reason for you to tie any particular asteroid to any particular action. So, it’s not that in your state of uncertainty choice of action covaries with choice of asteroid (via some particular mapping). Rather, you suspect that there is such a mapping, whatever that means.
In any case, this difficulty was already present without the Truth Teller serving as intermediary: the Predictor’s choice of box is already known, so even though it is sensible to think of the chosen box as what counterfactually varies based on choice of action, on-the-spot what makes sense (evidentially) is to anticipate the same box having different contents.
So, the question is: what’s my naive functionalist position supposed to be? What sense of “varies with” is supposed to necessitate the presence of a copy of me in the (logico-)causal ancestry of an event?
It occurs to me that although I have made clear that I (1) favor naive functionalism and (2) am far from certain of it, I haven’t actually made clear that I further (3) know of no situation where I think the agent has a good picture of the world and where the agent’s picture leads it to conclude that there’s a logical correlation with its action which can’t be accounted for by a logical cause (ie something like a copy of the agent somewhere in the computation of the correlated thing). IE, if there are outright counterexamples to naive functionalism, I think they’re actually tricky to state, and I have at least considered a few cases—your attempted counterexample comes as no surprise to me and I suspect you’ll have to try significantly harder.
My uncertainty is, instead, in the large ambiguity of concepts like “instance of an agent” and “logical cause”.
I provided reasons why I believe that Naive Functionalism is implausible in an earlier comment. I’ll admit that inconsistency is too strong of a word. My point is just that you need an independent reason to bite the bullet other than simplicity. Like simplicity combined with reasons why the bullets sound worse than they actually are.
Ah, I had taken you to be asserting possibilities and a desire to keep those possibilities open rather than held views and a desire for theories to conform to those views.
Maybe something about my view which I should emphasize is that since it doesn’t nail down any particular notion of counterfactual dependence, it doesn’t actually directly bite bullets on specific examples. In a given case where it may seem initially like you want counterfactual dependence but you don’t want anthropic instances to live, you’re free to either change views on one or the other. It could be that a big chunk of our differing intuitions lies in this. I suspect you’ve been thinking of me as wanting to open up the set of anthropic instances much wider than you would want. But, my view is equally amenable to narrowing down the scope of counterfactual dependence, instead. I suspect I’m much more open to narrowing down counterfactual dependence than you might think.
I suspect you’ve been thinking of me as wanting to open up the set of anthropic instances much wider than you would want. But, my view is equally amenable to narrowing down the scope of counterfactual dependence, instead. I suspect I’m much more open to narrowing down counterfactual dependence than you might think.
Oh, I completely missed this. That said, I would be highly surprised if these notions were to coincide since they seem like different types. Something for me to think about.
I think it depends on how much you’re willing to ask counterfactuals to do.
In the paper Anthropic Decision Theory for Self-Locating Agents, Stuart Armstrong says “ADT is nothing but the anthropic version of the far more general Updateless Decision Theory and Functional Decision Theory”—suggesting that he agrees with the idea that a proposed solution to counterfactual reasoning gives a proposed solution to anthropic reasoning. The overall approach of that paper is to side-step the issue of assigning anthropic probabilities, instead addressing the question of how to make decisions in cases where anthropic questions arise. I suppose this might either be said to “solves anthropics” or “side-steps anthropics”, and this choice would determine whether one took Stuart’s view to answer “yes” or “no” to your question.
Stuart mentions in that paper that agents making decisions via CDT+SIA tend to behave the same as agents making decisions via EDT+SSA. This can be seen formally in Jessica Taylor’s post about CDT+SIA in memoryless cartesian environments, and Caspar Oesterheld’s comment about the parallel for EDT+SSA. The post discusses the close connection to pure UDT (with no special anthropic reasoning). Specifically, CDT+SIA (and EDT+SSA) are consistent with the optimality notion of UDT, but don’t imply it (UDT may do better, according to its own notion of optimality). This is because UDT (specifically, UDT 1.1) looks for the best solution globally, whereas CDT+SIA can have self-coordination problems (like hunting rabbit in a game of stag hunt with identical copies of itself).
You could see this as giving a relationship between two different notions of counterfactual, with anthropic reasoning mediating the connection.
CDT and EDT are two different ways of reasoning about the consequences of actions. Both of them are “updateful”: they make use of all information available in estimating the consequences of actions. We can also think of them as “local”: they make decisions from the situated perspective of an information state, whereas UDT makes decisions from a “global” perspective considering all possible information states.
I would claim that global counterfactuals have an easier job than local ones, if we buy the connection between the two suggested here. Consider the transparent Newcomb problem: you’re offered a very large pile of money if and only if you’re the sort of agent who takes most, but not all, of the pile. It is easy to say from an updateless (global) perspective that you should be the sort of agent who takes most of the money. It is more difficult to face the large pile (an updateful/local perspective) and reason that it is best to take most-but-not-all; your counterfactuals have to say that taking all the money doesn’t mean you get all the money. The idea is that you have to be skeptical of whether you’re in a simulation; ie, your counterfactuals have to do anthropic reasoning.
In other words: you could factor the whole problem of logical decision theory in two different ways.
Option 1:
Find a good logically updateless perspective, providing the ‘global’ view from which we can make decisions.
Find a notion of logical counterfactual which combines with the above to yield decisions.
Option 2:
Find an updateful but skeptical perspective, which takes (logical) observations into account, but also accounts for the possibility that it is in a simulation and being fooled about those observations.
Find a notion of counterfactual which works with the above to make good decisions.
Also, somehow solve the coordination problems (which otherwise make option 1 look superior).
With option 1, you side-step anthropic reasoning. With option 2, you have to tackle it explicitly. So, you could say that in option 1, you solve anthropic reasoning for free if you solve counterfactual reasoning; in option 2, it’s quite the opposite: you might solve counterfactual reasoning by solving anthropic reasoning.
I’m more optimistic about option 2, recently. I used to think that maybe we could settle for the most basic possible notion of logical counterfactual, ie, evidential conditionals, if combined with logical updatelessness. However, a good logically updateless perspective has proved quite elusive so far.
Anyway, this answers one of key questions: whether it is worth working on anthropics or not. I put some time into reading about it (hopefully I get time to pick up Bostrom’s book again at some point), but I got discouraged when I started wondering if the work on logical counterfactuals would make this all irrelevant. Thanks for clarifying this. Anyway, why do you think the second approach is more promising?
My thoughts on that are described further here.
“The idea is that you have to be skeptical of whether you’re in a simulation”—I’m not a big fan of that framing, though I suppose it’s okay if you’re clear that it is an analogy. Firstly, I think it is cleaner to seperate issues about whether simulations have consciousness or not from questions of decision theory given that functionalism is quite a controversial philosophical assumption (even though it might be taken for granted at MIRI). Secondly, it seems as though that you might be able to perfectly predict someone from high level properties without simulating them sufficiently to instantiate a consciousness. Thirdly, there isn’t necessarily a one-to-one relationship between “real” world runs and simulations. We only need to simulate an agent once in order to predict the result of any number of identical runs. So if the predictor only ever makes one prediction, but there’s a million clones all playing Newcomb’s the chance that someone in that subjective situation is the single simulation is vanishingly small.
So in so far as we talk about, “you could be in a simulation”, I’d prefer to see this as a pretense or a trick or analogy.
(Definitely not reporting MIRI consensus here, just my own views:) I find it appealing to collapse the analogy and consider the DT considerations to be really touching on the anthropic considerations. It isn’t just functionalism with respect to questions about other brains (such as their consciousness); it’s also what one might call cognitive functionalism—ie functionalism with respect to the map as opposed to the territory (my mind considering questions such as consciousness). What I mean is: if the decision-theoretic questions were isomorphic to the anthropic questions, serving the same sort of role in decision-making, then if I were to construct a mind thinking about one or the other, and ask it about what it is thinking, then there wouldn’t be any questions which would differentiate anthropic reasoning from the analogous decision-theoretic reasoning. This would seem like a quite strong argument in favor of discarding the distinction.
I’m not saying that’s the situation (we would need to agree, individually, on separate settled solutions to both anthropics and decision theory in order to compare them side by side in that way). I’m saying that things seem to point in that direction.
It seems rather analogous to thinking that logic and mathematics are distinct (logical knowledge encoding tautology only, mathematical knowledge encoding a priori analytic knowledge, which one could consider distinct… I’m just throwing out philosophy words here to try to bolster the plausibility of this hypothetical view) -- and then discovering that within the realm of what you considered to be pure logic, there’s a structure which is isomorphic to the natural numbers, with all reasoning about the natural numbers being explainable as purely logical reasoning. It would be possible to insist on maintaining the distinction between the mathematical numbers and the logical structure which is analogous to them, referring to the first as analytic a priori knowledge and the second as tautology. However, one naturally begins to question the mathematical/logical distinction which was previously made. Was the notion of “logical” too broad? (Is higher-order logic really a part of mathematics?) Is number theory really a part of logic, rather than mathematics proper? What other branches of mathematics can be seen as structures within logic? Perhaps all mathematics is tautology, as Wittgenstein had it?
This position certainly has some counterintuitive consequences, which should be controversial. From a decision-theoretic perspective, it is practical to regard any means of predicting you which has equivalent predictive power as equally “an instance of you”, and hence equally conscious: a physics-style attempt to simulate you, or a logic-style attempt to reason about what you would do.
As for the question of simulating a person once to predict them a hundred times, the math all works out nicely if you look at Jessica Taylor’s post on the memoryless cartesian setting. A subjectively small chance of being the one simulation when a million meat copies are playing Newcomb’s will suffice for decision-theoretic purposes. How exactly everything works out depends on the details of formalizing the problem in the memoryless cartesian setting, but the theorem guarantees that everything balances. (I find this fact surprising and somewhat counterintuitive, but working out some examples myself helped me.)
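To make the “balancing” concrete, here’s a toy version of the kind of example I worked through (my own formalization for illustration, not exactly the setup of that post): one simulation, a million real players, and world utility counted as the total money the real players win.

```python
# Toy version of the "one simulation, a million meat copies" Newcomb case.
# The assumptions are mine: the single simulation's answer fixes box B for
# every real player, and world utility is the total money the real players win.
N = 1_000_000            # real players
BIG, SMALL = 1_000_000, 1_000

def cdt_sia_gain_from_two_boxing(n=N):
    """SIA-weighted causal gain from deviating to two-boxing, holding the
    rest of the (one-boxing) policy fixed."""
    p_sim = 1 / (n + 1)          # vanishingly small chance I'm the simulation
    p_real = n / (n + 1)
    gain_as_real = SMALL         # box contents already fixed; I just grab the extra $1k
    gain_as_sim = -n * BIG       # my deviation empties box B for all n real players
    return p_real * gain_as_real + p_sim * gain_as_sim

print(cdt_sia_gain_from_two_boxing())   # large and negative: don't deviate
```

The tiny probability of being the simulation gets multiplied by a causal effect that is n times larger, so the factors of n cancel and the one-boxing policy stays stable; that’s the sense in which everything balances.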
I’m confused. This comment is saying that there isn’t a strict divide between decision theory and anthropics, but I don’t see how that has any relevance to the point that I raised in the comment it is responding to (that a perfect predictor need not utilise a simulation that is conscious in any sense of the word).
Maybe I’m confused about the relevance of your original comment to my answer. I interpreted
as being about the relationship I outline between anthropics and decision theory—ie, anthropic reasoning may want to take consciousness into account (while you might think it plausible that you’re in a physics simulation, it is plausible to hold that you can’t be “living in” a more general type of model, one which predicts you by reasoning about you rather than simulating you, if that model is not detailed enough to support consciousness), whereas decision theory only takes logical control into account (so the relevant question is not whether a model is detailed enough to be conscious, but rather whether it is detailed enough to create a logical dependence on your behavior).
I took
as an objection to the connection I drew between thinking you might be in a simulation (ie, the anthropic question) and decision theory. Maybe you were objecting to the connection between thinking you’re in a simulation and anthropics? If so, the claimed connection to decision theory is still relevant. If you buy the decision theory connection, it seems hard not to buy the connection between thinking you’re in a simulation and anthropics.
I took
to be an attempt to drive a wedge between anthropics and decision theory, by saying that a prediction might introduce a logical correlation without introducing an anthropic question of whether you might be ‘living in’ the prediction. To which my response was, I may want to bite the bullet on that one for the elegance of treating anthropic questions as decision-theoretic in nature.
I took
to be an attempt to drive a wedge between decision-theoretic and anthropic cases by the fact that we need to assign a small anthropic probability to being the simulation if it is only run once, to which I responded by saying that the math will work out the same on the decision theory side according to Jessica Taylor’s theorem.
My interpretation now is that you were never objecting to the connection between decision theory and anthropics, but rather to the way I was talking about anthropics. If so, my response is that the way I was talking about anthropics is essentially forced by the connection to decision theory.
For the first point, I meant that in order to consider this purely as a decision theory problem, without creating a dependency on a particular theory of consciousness, you would ideally want a general theory that can deal with any criterion of consciousness (including just being handed a list of entities that count as conscious).
Regarding the second, when you update your decision algorithm, you have to update everything subjunctively dependent on you, regardless of whether they are agents or not; but that is distinct from “you could be that object”.
On the third, biting the bullet isn’t necessary to turn this into a decision theory problem, as I mentioned in my response to the first point. Further, elegance alone doesn’t seem to be a good reason to accept a theory. I feel I might be misunderstanding your reasoning for biting this bullet.
I haven’t had time to read Jessica Taylor’s theorem yet, so I have no comment on the fourth.
Second point: the motto is something like “anything which is dependent on you must have you inside its computation”. Something which depends on you because it is causally downstream of you contains you in its computation in the sense that you have to be calculated in the course of calculating it because you’re in its past. The claim is that this observation generalizes.
This seems like a motte and bailey. There’s a weak sense of “must have you inside its computation”, which you’ve defined here, and a strong sense, as in “should be treated as containing a consciousness”.
Well, in any case, the claim I’m raising for consideration is that these two may turn out to be the same. The argument for the claim is the simplicity of merging the decision theory phenomenon with the anthropic phenomenon.
I note that I’m still overall confused about what the miscommunication was. Your response now seems to fit my earlier interpretation.
First point: I disagree about how to consider things as pure decision theory problems. Taking as input a list of conscious entities seems like a rather large point against a decision theory, since it makes it dependent on a theory of consciousness. If you want to be independent of questions like that, far better to consider decision theory on its own (thinking only in terms of logical control, counterfactuals, etc), and remain agnostic on the question of a connection between anthropics and decision theory.
In my analogy to mathematics, it could be that there’s a lot of philosophical baggage on the logic side and also a lot of philosophical baggage on the mathematical side. Claiming that all of math is tautology could create a lot of friction between these two sets of baggage, meaning one has to bite a lot of bullets which other people wouldn’t consider biting. This can be a good thing: you’re allowing more evidence to flow, pinning down your views on both sides more strongly. In addition to simplicity, that’s related to a theory doing more to stick its neck out, making bolder predictions. To me, when this sort of thing happens, the objections to adopting the simpler view have to be actually quite strong.
I suppose that addresses your third point to an extent. I could probably give some reasons besides simplicity, but it seems to me that simplicity is a major consideration here, perhaps my true reason. I suspect we don’t actually disagree that much about whether simplicity should be a major consideration (unless you disagree about the weight of Occam’s razor, which would surprise me). I suspect we disagree about the cost of biting this particular bullet.
You wrote:
Which I interpreted to be you talking about avoiding the issue of consciousness by acting as though any process logically dependent on you automatically “could be you” for the purpose of anthropics. I’ll call this the Reductive Approach.
However, when I said:
I was thinking about separating these issues, not by using the Reductive Approach, but by using what I’ll call the Abstracting Approach. In this approach, you construct a theory of anthropics that is just handed a criterion of which beings are conscious, and it is expected to be able to handle any such criterion.
Part of the confusion here is that we are using the word “depends” in different ways. When I said that the Abstracting Approach avoided creating a dependency on a theory of consciousness, I meant that if you follow this approach, you end up with a decision theory which can have any theory of consciousness substituted in. It doesn’t depend on these theories: if you discover your theory of consciousness is wrong, you just throw in a new one and everything still works.
When you talk about “depends” and say that this is a disadvantage, you mean that in order to obtain a complete theory of anthropics, you need to select a theory of consciousness to be combined with your decision theory. I think that this is actually unfair, because in the Reductive Approach you do implicitly select a theory of consciousness, which I’ll call Naive Functionalism. I’m not using this name to be pejorative; it’s the best descriptor I can think of for the version of functionalism which you are using, the one that ignores any concerns that high-level predictors might not deserve to be labelled as a consciousness.
With the Abstracting Approach I still maintain the option of assuming Naive Functionalism, in which case it collapses down to the Reductive Approach. So given these assumptions, both approaches end up being equally simple. In contrast, given any other theory of consciousness, the Reductive Approach complains that you are outside its assumptions, while the Abstracting Approach works just fine. The mistake here was attempting to compare the simplicity of two different theories directly without adjusting for them having different scopes.
“I suspect we don’t actually disagree that much about whether simplicity should be a major consideration”—I’m not objecting to simplicity as a consideration. My argument is that Occam’s razor is about accepting the simplest theory that is consistent with the situation. In my mind, it seems like you are allowing simplicity to let you ignore the fact that your theory is inconsistent with the situation, which is not how I believe Occam’s razor is supposed to work. So it’s not just about the cost, but about whether this is even a sensible way of reasoning.
I agree that we are using “depends” in different ways. I’ll try to avoid that language. I don’t think I was confusing the two different notions when I wrote my reply; I thought, and still think, that taking the abstraction approach wrt consciousness is in itself a serious point against a decision theory. I don’t think the abstraction approach is always bad—I think there’s something specific about consciousness which makes it a bad idea.
Actually, that’s too strong. I think taking the abstraction approach wrt consciousness is satisfactory if you’re not trying to solve the problem of logical counterfactuals or related issues. There’s something I find specifically worrying here.
I think part of it is, I can’t imagine what else would settle the question. Accepting the connection to decision theory lets me pin down what should count as an anthropic instance (to the extent that I can pin down counterfactuals). Without this connection, we seem to risk keeping the matter afloat forever.
Making a theory of counterfactuals take an arbitrary theory of consciousness as an argument seems to cement this free-floating idea of consciousness, as an arbitrary property which a lump of matter can freely have or not have. My intuition that decision theory has to take a stance here is connected to an intuition that a decision theory needs to depend on certain ‘sensible’ aspects of a situation, and is not allowed to depend on ‘absurd’ aspects. For example, the table being wood vs metal should be an inessential detail of the 5&10 problem.
This isn’t meant to be an argument, only an articulation of my position. Indeed, my notion of “essential” vs “inessential” details is overtly functionalist (eg, replacing carbon with silicon should not matter if the high-level picture of the situation is untouched).
Still, I think our disagreement is not so large. I agree with you that the question is far from obvious. I find my view on anthropics actually fairly plausible, but far from determined.
“Naive” seems fine here; I’d agree that the position I’m describing is of a “the most naive view here turns out to be true” flavor (so long as we don’t think of “naive” as “man-on-the-street”/”folk wisdom”).
I don’t think it is unfair of me to select a theory of consciousness here while accusing you of requiring one. My whole point is that it is simpler to select the theory of consciousness which requires no extra ontology beyond what decision theory already needs for other reasons. It is less simple if we use some extra stuff in addition. It is true that I’ve also selected a theory of consciousness, but the way I’ve done so doesn’t incur an extra complexity penalty, whereas you might, if you end up going with something other than what I do.
We agree that Occam’s razor is about accepting the simplest theory that is consistent with the situation. We disagree about whether the theory is inconsistent with the situation.
What is the claimed inconsistency? So far my perception of your argument has been that you insist we could make a distinction. When you described your abstraction approach, you said that we could well choose naive functionalism as our theory of consciousness.
The argument that you’re making isn’t that the Abstraction Approach is wrong; it’s that by supporting other theories of consciousness, it increases the chance that people will mistakenly fail to choose Naive Functionalism. Wrong theories do tend to attract a certain number of people believing in them, but I would like to think that the best theory is likely to win out over time on Less Wrong.
And there’s a cost to this. If we remove the assumption of a particular theory of consciousness, then more people will be able to embrace the theories of anthropics that are produced. And partial agreement is generally better than none.
This is an argument for Naive Functionalism vs other theories of consciousness. It isn’t an argument for the Abstracting Approach over the Reductive approach. The Abstracting Approach is more complicated, but it also seeks to do more. In order to fairly compare them, you have to compare both on the same domain. And given the assumption of Naive Functionalism, the Abstracting Approach reduces to the Reductive Approach.
I provided reasons why I believe that Naive Functionalism is implausible in an earlier comment. I’ll admit that inconsistency is too strong of a word. My point is just that you need an independent reason to bite the bullet other than simplicity. Like simplicity combined with reasons why the bullets sound worse than they actually are.
Yes. It works with any theory of consciousness, even clearly absurd ones.
(I note that I flagged this part as not being an argument, but rather an attempt to articulate a hazy intuition—I’m trying to engage with you less as an attempt to convince, more to explain how I see the situation.)
I don’t think that’s quite the argument I want to make. The problem isn’t that it gives people the option of making the wrong choice. The problem is that it introduces freedom in a suspicious place.
Here’s a programming analogy:
Both of us are thinking about how to write a decision theory library. We have a variety of confusions about this, such as what functionality a decision theory library actually needs to support, what interface it needs to present to other things, etc. Currently, we are having a disagreement about whether it should call an external ‘consciousness’ library vs implement its own behavior. You are saying that we don’t want to commit to implementing consciousness a particular way, because we may find that we have to change that later. So, we need to write the library in a way such that we can easily swap consciousness libraries.
When I imagine trying to write the code, I don’t see how I’m going to call the ‘consciousness’ library while solving all the other problems I need to solve. It’s not that I want to write my own ‘consciousness’ functionality. It’s that I don’t think ‘consciousness’ is an abstraction that’s going to play well with the sort of things I need to do. So when I’m trying to resolve other confusions (about the interface, data types I will need, functionality which I may want to implement, etc) I don’t want to have to think about calling arbitrary consciousness libraries. I want to think about the data structures and manipulations which feel natural to the problem being solved. If this ends up generating some behaviors which look like a call to the ‘naive functionalism’ library, this makes me think the people who wrote that library maybe were on to something, but it doesn’t make me any more inclined to re-write my code in a way which can call ‘consciousness’ libraries.
If another programmer sketches a design for a decision theory library which can call a given consciousness library, I’m going to be a bit skeptical and ask for more detail about how it gets called and how the rest of the library is factored such that it isn’t just doing a bunch of work in two different ways or something like that.
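To make the shape of the two interfaces concrete, here is a deliberately crude sketch (the names and structure are mine and purely illustrative; neither of us has actually written such a library):

```python
# Illustrative sketch only. "processes" are things in the world model that
# may or may not logically depend on the agent.
from typing import Callable, Iterable, List

def instances_reductive(processes: Iterable[object],
                        logically_depends_on_me: Callable[[object], bool]) -> List[object]:
    """Reductive interface: anything logically dependent on me counts as
    (containing) an instance of me; no consciousness parameter anywhere."""
    return [p for p in processes if logically_depends_on_me(p)]

def instances_abstracting(processes: Iterable[object],
                          logically_depends_on_me: Callable[[object], bool],
                          is_conscious: Callable[[object], bool]) -> List[object]:
    """Abstracting interface: the caller supplies an arbitrary consciousness
    criterion, and only conscious dependents count as anthropic instances."""
    return [p for p in processes if logically_depends_on_me(p) and is_conscious(p)]
```

The disagreement is roughly over whether the extra `is_conscious` argument is a virtue (it can be swapped out as theories change) or a design smell (nothing else in the library knows what to do with it).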
Actually, I’m confused about how we got here. It seems like you were objecting to the (reductive-as-opposed-to-merely-analogical version of the) connection I’m drawing between decision theory and anthropics. But then we started discussing the question of whether a (logical) decision theory should be agnostic about consciousness vs take a position. This seems to be a related but separate question; if you reject (or hold off on deciding) the connection between decision theory and anthropics, a decision theory may or may not have to take a position on consciousness for other reasons. It’s also not entirely clear that you have to take a particular position on consciousness if you buy the dt-anthropics connection. I’ve actually been ignoring the question of ‘consciousness’ in itself, and instead mentally substituting it with ‘anthropic instance-ness’. I’m not sure what I would want to say about consciousness proper; it’s a very complicated topic.
An argument in favor of naive functionalism makes applying the abstraction approach less appealing, since it suggests the abstraction is only opening the doors to worse theories. I might be missing something about what you’re saying here, but I think you are not only arguing that you can abstract without losing anything (because the agnosticism can later be resolved to naive functionalism), but that you strongly prefer to abstract in this case.
But, I agree that that’s not the primary disagreement between us. I’m fine with being agnostic about naive functionalism; I think of myself as agnostic, merely finding it appealing. Primarily I’m reacting to the abstraction approach, because I think it is better in this case for a theory of logical counterfactuals to take a stand on anthropics. The fact that I’m uncertain about naive functionalism is tied to the fact that I’m uncertain about counterfactuals; the structure of my uncertainty is such that I expect information about one to provide information about the other. You want to maintain agnosticism about consciousness, and as a result, you don’t want to tie those beliefs together in that way. From my perspective, it seems better to maintain that agnosticism (if desired) by remaining agnostic about the specific connection between anthropics and decision theory which I outlined, rather than by trying to do decision theory in a way which is agnostic about anthropics in general.
That makes your position a lot clearer. I admit that the Abstraction Approach makes things more complicated, and that this might affect what you can accomplish, either theoretically or practically, compared to the Reductive Approach, so I could see some value in exploring that path. For Stuart Armstrong’s paper in particular, the Abstraction Approach wouldn’t really add much in the way of complications, and it would make it much clearer what was going on. Maybe there are other things you are looking into where it wouldn’t be anywhere near this easy, but in any case, I’d prefer people to use the Abstraction Approach in the cases where it is easy to do so.
True, and I can imagine a level of likelihood below which adopting the Abstraction Approach would be adding needless complexity and mostly be a waste of time.
I think it is worth making a distinction between complexity in the practical sense and complexity in the hypothetical sense. In the practical sense, using the Abstraction Approach with Naive Functionalism is more complex than the Reductive Approach. In the hypothetical sense, they are equally complex in terms of explaining how anthropics works given Naive Functionalism, as we haven’t postulated anything additional within this particular domain (you may say that we’ve postulated consciousness, but under this assumption it’s just a renaming of a term, rather than the introduction of an extra entity). I believe that Occam’s Razor should be concerned with the latter type of complexity, which is why I wouldn’t consider it a good argument for the Reductive Approach.
I’m very negative on Naive Functionalism. I’ve still got some skepticism about functionalism itself (property dualism isn’t implausible in my mind), but if I had to choose between Functionalist theories, that certainly isn’t what I’d pick.
I’m trying to think more about why I feel this outcome is a somewhat plausible one. The thing I’m generating is a feeling that this is ‘how these things go’—that the sign that you’re on the right track is when all the concepts start fitting together like legos.
I guess I also find it kind of curious that you aren’t more compelled by the argument I made early on, namely, that we should collapse apparently distinct notions if we can’t give any cognitive difference between them. I think I later rounded this argument down to Occam’s razor, but there’s a different point to be made: if we’re talking about the cognitive role played by something, rather than just the definition (as is the case in decision theory), and we can’t find a difference in cognitive role (even if we generally make a distinction when making definitions), it seems hard to sustain the distinction. Taking another example related to anthropics, it seems hard to sustain a distinction between ‘probability that I’m an instance’ and ‘degree to which I care about each instance’ (what’s been called a ‘caring measure’, I think), when all the calculations come out the same either way, even generating something which looks like a Bayesian update of the caring measure. Initially it seems like there’s a big difference, because it’s a question of modeling something as a belief or a value; but unless some substantive difference in the actual computations presents itself, it seems the distinction isn’t real. A robot built to think with true anthropic uncertainty vs caring measures is literally running equivalent code either way; it’s effectively only a difference in code comment.
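Here is a minimal sketch of what I mean by “only a difference in code comment” (my own toy setup, nothing more): the same weighted sum gets read as anthropic probability under one interpretation and as a caring measure under the other, and conditioning on an observation is the same renormalization either way.

```python
# weights[i]: read either as "probability that I am instance i" or as
# "how much I care about instance i" -- the code is identical either way.
def best_action(weights, utility, consistent_with_obs, actions):
    # Keep only the instances consistent with what has been observed, then
    # renormalize: a "Bayesian update" under one reading, a rescaled caring
    # measure under the other.
    live = {i: w for i, w in weights.items() if consistent_with_obs(i)}
    total = sum(live.values())
    posterior = {i: w / total for i, w in live.items()}
    # Weighted-sum choice of action; nothing downstream can tell which
    # interpretation the weights were "meant" to have.
    return max(actions, key=lambda a: sum(p * utility(i, a) for i, p in posterior.items()))
```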
“Namely, that we should collapse apparently distinct notions if we can’t give any cognitive difference between them”—I don’t necessarily agree that being subjunctively linked to you (such that it gives the same result) is the same as being cognitively identical, so this argument doesn’t get off the ground for me. If I adopt a functionalist theory, it seems quite plausible that the degree of complexity is important too (although perhaps you’d say that isn’t pure functionalism?).
It might be helpful to relate this to the argument I made in Logical Counterfactuals and the Cooperation Game. The point I make there is that which processes are subjunctively linked to you is more a matter of your state of knowledge than of the intrinsic properties of the objects themselves. So if you adopt the position that things that are subjunctively linked to you are cognitively, and hence consciously, the same, you end up with a highly relativistic viewpoint.
I’m curious, how much do people at MIRI lean towards naive functionalism? I’m mainly asking because I’m trying to figure out whether there’s a need to write a post arguing against this.
I haven’t heard anyone else express the extremely naive view we’re talking about that I recall, and I probably have some specific decision-theory-related beliefs that make it particularly appealing to me, but I don’t think it’s out of the ballpark of other people’s views so to speak.
I (probably) agree with this point, and it doesn’t seem like much of an argument against the whole position to me—coming from a Bayesian background, it makes sense to be subjectivist about a lot of things, and link them to your state of knowledge. I’m curious how you would complete the argument—OK, subjunctive statements are linked to subjective states of knowledge. Where does that speak against the naive functionalist position?
“OK, subjunctive statements are linked to subjective states of knowledge. Where does that speak against the naive functionalist position?”—Actually, what I said about relativism isn’t necessarily true. You could assert that any process that is subjunctively linked to what is generally accepted to be a consciousness from any possible reference frame is cognitively identical and hence experiences the same consciousness. But that would include a ridiculous number of things.
By telling you that a box will give the same output as you, we can subjunctively link it to you, even if it is only either a dumb box that immediately outputs true or a dumb box that immediately outputs false. Further, there is no reason why we can’t subjunctively link someone else, facing a completely different situation, to the same black box, since the box doesn’t actually need to receive the same input as you to be subjunctively linked (this idea is new; I didn’t actually realise it before). So the box would be having the experiences of two people at the same time. This feels like a worse bullet than the one you already want to bite.
The box itself isn’t necessarily thought of as possessing an instance of my consciousness. The bullet I want to bite is the weaker claim that anything subjunctively linked to me has me somewhere in its computation (including its past). In the same way that a transcript of a conversation I had contains me in its computation (I had to speak a word in order for it to end up in the text) but isn’t itself conscious, a box which very reliably has the same output as me must be related to me somehow.
I anticipate that your response is going to be “but what if it is only a little correlated with you?”, to which I would reply “how do we set up the situation?” and probably make a bunch of “you can’t reliably put me into that epistemic state” type objections. In other words, I don’t expect you to be able to make a situation where I both assent to the subjective subjunctive dependence and will want to deny that the box has me somewhere in its computation.
For example, the easiest way to make the correlation weak is for the predictor who tells me the box has the same output as me to be only moderately good. There are two possibilities. (1) I can already predict what the predictor will think I’ll do, which screens off its prediction from my action, so there is no subjective correlation. (2) I can’t confidently predict what the predictor will say, which means the predictor has information about my action which I lack; then, even if the predictor is poor, it must have a significant tie to me; for example, it might have observed me making similar decisions in the past. So there are copies of me behind the correlation.
“The bullet I want to bite is the weaker claim that anything subjunctively linked to me has me somewhere in its computation (including its past)”—That doesn’t describe this example. You are subjunctively linked to the dumb boxes, but they don’t have you in their past. The thing that has you in its past is the predictor.
I disagree, and I thought my objection was adequately explained. But I think my response will be more concrete/understandable/applicable if you first answer: how do you propose to reliably put an agent into the described situation?
The details of how you set up the scenario may be important to the analysis of the error in the agent’s reasoning. For example, if the agent just thinks the predictor is accurate for no reason, it could be that the agent simply has a bad prior (the predictor doesn’t really reliably tell the truth about the agent’s actions being correlated with the box). To that, I could respond that of course we can construct cases we intuitively disagree with by giving the agent a set of beliefs which we intuitively disagree with. (This is similar to my reason for rejecting the typical smoking lesion setup as a case against EDT! The beliefs given to the EDT agent in the smoking lesion problem are inconsistent with the problem setup.)
I’m not suggesting that you were implying that, I’m just saying it to illustrate why it might be important for you to say more about the setup.
“How do you propose to reliably put an agent into the described situation?”—Why do we have to be able to reliably put an agent in that situation? Isn’t it enough that an agent may end up in that situation?
But in terms of how the agent can know the predictor is accurate: perhaps the agent gets to examine the predictor’s source code after it has run, and the predictor is implemented in hardware rather than software, so that the agent knows it wasn’t modified?
But I don’t know why you’re asking so I don’t know if this answers the relevant difficulty.
(Also, just wanted to check whether you’ve read the formal problem description in Logical Counterfactuals and the Co-operation Game)
For example, we can describe how to put an agent into the counterfactual mugging scenario as normally described (where Omega asks for $10 and gives nothing in return), but critically for our analysis, one can only reliably do so by creating a significant chance that the agent ends up in the other branch (where Omega gives the agent a large sum if and only if Omega would have received the asked-for $10 in the other branch). If this were not the case, the argument for giving the $10 would seem weaker.
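For concreteness, the usual expected-value comparison behind that argument looks something like this (standard toy payoffs, which I’m assuming here; the exact numbers don’t matter):

```python
# Value of committing, before the coin flip, to a policy for the tails branch.
P_HEADS = 0.5
ASK, REWARD = 10, 10_000

def policy_value(pay_when_asked, p_heads=P_HEADS):
    tails = -ASK if pay_when_asked else 0       # Omega asks for $10
    heads = REWARD if pay_when_asked else 0     # paid iff the policy would have paid
    return (1 - p_heads) * tails + p_heads * heads

print(policy_value(True), policy_value(False))  # 4995.0 vs 0.0
# As p_heads -> 0 (ie, the scenario is set up with no real chance of the
# rewarding branch), the case for paying the $10 evaporates.
```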
I’m asking for more detail about how the predictor is constructed such that the predictor can accurately point out that the agent has the same output as the box. Similarly to how counterfactual mugging would be less compelling if we had to rely on the agent happening to have the stated subjunctive dependencies rather than being able to describe a scenario in which it seems very reasonable for the agent to have those subjunctive dependencies, your example would be less compelling if the box just happens to contain a slip of paper with our exact actions, and the predictor just happens to guess this correctly, and we just happen to trust the predictor correctly. Then I would agree that something has gone wrong, but all that has gone wrong is that the agent had a poor picture of the world (one which is subjunctively incorrect from our perspective, even though it made correct predictions).
On the other hand, if the predictor runs a simulation of us and then purposefully chooses a box whose output is identical to ours, then the situation seems perfectly sensible: “the box” that’s correlated with our output subjectively is a box which is chosen differently in cases where our output is different; and the choice-of-box contains a copy of us. So the example works: there is a copy of us somewhere in the computation which correlates with us.
I’ve read it now. I think you could already have guessed that I agree with the ‘subjective’ point and disagree with the ‘meaningless to consider the case where you have full knowledge’ point.
“‘The box’ that’s correlated with our output subjectively is a box which is chosen differently in cases where our output is different; and, the choice-of-box contains a copy of us. So the example works”—that’s a good point, and if you examine the source code, you’ll know it was choosing between two boxes. Maybe we need an extra layer of indirection. Suppose there’s a Truth Tester who can verify that the Predictor is accurate by examining its source code, and you only get to examine the Truth Tester’s code, so you never end up seeing the code within the Predictor that handles the case where the box doesn’t have the same output as you. As far as you are subjectively concerned, that case doesn’t happen.
Ok, so you find yourself in this situation where the Truth Tester has verified that the Predictor is accurate, and you’ve verified that the Truth Tester is accurate, and the Predictor tells you that the direction you’re about to turn your head has a perfect correspondence to the orbit of some particular asteroid. Lacking the orbit information yourself, you now have a subjective link between your next action and the asteroid’s path.
This case does appear to present some difficulty for me.
I think this case isn’t actually so different from the previous case, because although you don’t know the source code of the Predictor, you might reasonably suspect that the Predictor picks out an asteroid after predicting you (or selects the equation relating your head movement to the asteroid’s orbit after picking out the asteroid). We might suspect this precisely because it is implausible that the asteroid is actually mirroring our computation in any more significant sense. So using a Truth Tester intermediary increases the uncertainty of the situation, but increased uncertainty is compatible with the same resolution.
What your revision does do, though, is highlight how the counterfactual expectation has to differ from the evidential conditional. We may think “the Predictor would have selected a different asteroid (or different equation) if its computation of our action had turned out different”, but, we now know the asteroid (and the equation); so, our evidential expectation is clearly that the asteroid has a different orbit depending on our choice of action. Yet, it seems like the sensible counterfactual expectation given the situation is … hm.
Actually, now I don’t think it’s quite that the evidential and counterfactual expectation come apart. Since you don’t know what you actually do yet, there’s no reason for you to tie any particular asteroid to any particular action. So, it’s not that in your state of uncertainty choice of action covaries with choice of asteroid (via some particular mapping). Rather, you suspect that there is such a mapping, whatever that means.
In any case, this difficulty was already present without the Truth Tester serving as intermediary: the Predictor’s choice of box is already known, so even though it is sensible to think of the chosen box as what counterfactually varies based on the choice of action, on the spot what makes sense (evidentially) is to anticipate the same box having different contents.
So, the question is: what’s my naive functionalist position supposed to be? What sense of “varies with” is supposed to necessitate the presence of a copy of me in the (logico-)causal ancestry of an event?
It occurs to me that although I have made clear that I (1) favor naive functionalism and (2) am far from certain of it, I haven’t actually made clear that I further (3) know of no situation where I think the agent has a good picture of the world and where the agent’s picture leads it to conclude that there’s a logical correlation with its action which can’t be accounted for by a logical cause (ie something like a copy of the agent somewhere in the computation of the correlated thing). IE, if there are outright counterexamples to naive functionalism, I think they’re actually tricky to state, and I have at least considered a few cases—your attempted counterexample comes as no surprise to me and I suspect you’ll have to try significantly harder.
My uncertainty is, instead, in the large ambiguity of concepts like “instance of an agent” and “logical cause”.
Ah, I had taken you to be asserting possibilities and a desire to keep those possibilities open rather than held views and a desire for theories to conform to those views.
Maybe something about my view which I should emphasize is that since it doesn’t nail down any particular notion of counterfactual dependence, it doesn’t actually directly bite bullets on specific examples. In a given case where it may seem initially like you want counterfactual dependence but you don’t want anthropic instances to live, you’re free to either change views on one or the other. It could be that a big chunk of our differing intuitions lies in this. I suspect you’ve been thinking of me as wanting to open up the set of anthropic instances much wider than you would want. But, my view is equally amenable to narrowing down the scope of counterfactual dependence, instead. I suspect I’m much more open to narrowing down counterfactual dependence than you might think.
Oh, I completely missed this. That said, I would be highly surprised if these notions were to coincide since they seem like different types. Something for me to think about.