Both of us are thinking about how to write a decision theory library.
That makes your position a lot clearer. I admit that the Abstraction Approach makes things more complicated and that this might affect what you can accomplish either theoretically or practically by using the Reductive Approach, so I could see some value in exploring this path. For Stuart Armstrong’s paper in particular, the Abstraction Approach wouldn’t really add much in the way of complications and it would make it much clearer what was going on. But maybe there are other things you are looking into where it wouldn’t be anywhere near this easy. But in any case, I’d prefer people to use the Abstraction Approach in the cases where it is easy to do so.
An argument in favor of naive functionalism makes applying the abstraction approach less appealing
True, and I can imagine a level of likelihood below which adopting the Abstraction Approach would be adding needless complexity and mostly be a waste of time.
I think it is worth making a distinction between complexity in the practical sense and complexity in the hypothetical sense. In the practical sense, using the Abstraction Approach with Naive Functionalism is more complex than the Reductive Approach. In the hypothetical sense, they are equally complex in terms of explaining how anthropics works given Naive Functionalism, as we haven’t postulated anything additional within this particular domain (you may say that we’ve postulated consciousness, but within this assumption it’s just a renaming of a term, rather than the introduction of an extra entity). I believe that Occam’s Razor should be concerned with the latter type of complexity, which is why I wouldn’t consider it a good argument for the Reductive Approach.
But that you strongly prefer to abstract in this case
I’m very negative on Naive Functionalism. I’ve still got some skepticism about functionalism itself (property dualism isn’t implausible in my mind), but if I had to choose between Functionalist theories, that certainly isn’t what I’d pick.
I’m trying to think more about why I feel this outcome is a somewhat plausible one. The thing I’m generating is a feeling that this is ‘how these things go’—that the sign that you’re on the right track is when all the concepts start fitting together like legos.
I guess I also find it kind of curious that you aren’t more compelled by the argument I made early on, namely, that we should collapse apparently distinct notions if we can’t give any cognitive difference between them. I think I later rounded down this argument to Occam’s razor, but there’s a different point to be made: if we’re talking about the cognitive role played by something, rather than just the definition (as is the case in decision theory), and we can’t find a difference in cognitive role (even if we generally make a distinction when making definitions), it seems hard to sustain the distinction. Taking another example related to anthropics, it seems hard to sustain a distinction between ‘probability that I’m an instance’ and ‘degree I care about each instance’ (what’s been called a ‘caring measure’ I think), when all the calculations come out the same either way, even generating something which looks like a Bayesian update of the caring measure. Initially it seems like there’s a big difference, because it’s a question of modeling something as a belief or a value; but, unless some substantive difference in the actual computations presents itself, it seems the distinction isn’t real. A robot built to think with true anthropic uncertainty vs caring measures is literally running equivalent code either way; it’s effectively only a difference in code comment.
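To make the ‘same code, different comment’ point concrete, here is a minimal sketch (all names hypothetical, not from any actual decision theory library): the two functions below differ only in how their weight parameter is glossed, and compute exactly the same thing.

```python
# Toy sketch: an agent weighing outcomes across two possible instances of
# itself. Read the weights as anthropic probabilities in one function and
# as a "caring measure" in the other; the computation is identical.

def expected_value_anthropic(weights, payoffs):
    # weights[i] = probability that I am instance i
    return sum(w * u for w, u in zip(weights, payoffs))

def expected_value_caring(weights, payoffs):
    # weights[i] = degree to which I care about instance i
    return sum(w * u for w, u in zip(weights, payoffs))

weights = [0.75, 0.25]   # exactly representable, for a clean comparison
payoffs = [8.0, -4.0]
assert expected_value_anthropic(weights, payoffs) == expected_value_caring(weights, payoffs) == 5.0
```

Only the comments distinguish the two functions; any Bayesian-style update applied to the weights would likewise look the same under either reading.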
“Namely, that we should collapse apparently distinct notions if we can’t give any cognitive difference between them”—I don’t necessarily agree that being subjunctively linked to you (such that it gives the same result) is the same as being cognitively identical, so this argument doesn’t get off the ground for me. If we adopt a functionalist theory, it seems quite plausible that the degree of complexity is important too (although perhaps you’d say that isn’t pure functionalism?)
It might be helpful to relate this to the argument I made in Logical Counterfactuals and the Cooperation Game. The point I make there is that whether processes are subjunctively linked to you is more a matter of your state of knowledge than of anything about the intrinsic properties of the object itself. So if you adopt the position that things that are subjunctively linked to you are cognitively and hence consciously the same, you end up with a highly relativistic viewpoint.
I’m curious, how much do people at MIRI lean towards naive functionalism? I’m mainly asking because I’m trying to figure out whether there’s a need to write a post arguing against this.
I haven’t heard anyone else express the extremely naive view we’re talking about that I recall, and I probably have some specific decision-theory-related beliefs that make it particularly appealing to me, but I don’t think it’s out of the ballpark of other people’s views so to speak.
The point I make there is that whether processes are subjunctively linked to you is more a matter of your state of knowledge than of anything about the intrinsic properties of the object itself.
I (probably) agree with this point, and it doesn’t seem like much of an argument against the whole position to me—coming from a Bayesian background, it makes sense to be subjectivist about a lot of things, and link them to your state of knowledge. I’m curious how you would complete the argument—OK, subjunctive statements are linked to subjective states of knowledge. Where does that speak against the naive functionalist position?
“OK, subjunctive statements are linked to subjective states of knowledge. Where does that speak against the naive functionalist position?”—Actually, what I said about relativism isn’t necessarily true. You could assert that any process that is subjunctively linked to what is generally accepted to be a consciousness from any possible reference frame is cognitively identical and hence experiences the same consciousness. But that would include a ridiculous number of things.
By telling you that a box will give the same output as you, we can subjunctively link it to you, even if it is only either a dumb box that immediately outputs true or a dumb box that immediately outputs false. Further, there is no reason why we can’t subjunctively link someone else facing a completely different situation to the same black box, since the box doesn’t actually need to receive the same input as you to be subjunctively linked (this idea is new, I didn’t actually realise that before). So the box would be having the experiences of two people at the same time. This feels like a worse bullet than the one you already want to bite.
The box itself isn’t necessarily thought of as possessing an instance of my consciousness. The bullet I want to bite is the weaker claim that anything subjunctively linked to me has me somewhere in its computation (including its past). In the same way that a transcript of a conversation I had contains me in its computation (I had to speak a word in order for it to end up in the text) but isn’t itself conscious, a box which very reliably has the same output as me must be related to me somehow.
I anticipate that your response is going to be “but what if it is only a little correlated with you?”, to which I would reply “how do we set up the situation?” and probably make a bunch of “you can’t reliably put me into that epistemic state” type objections. In other words, I don’t expect you to be able to make a situation where I both assent to the subjective subjunctive dependence and will want to deny that the box has me somewhere in its computation.
For example, the easiest way to make the correlation weak is for the predictor who tells me the box has the same output as me to be only moderately good. There are several possibilities. (1) I can already predict what the predictor will think I’ll do, which screens off its prediction from my action, so no subjective correlation; (2) I can’t predict confidently what the predictor will say, which means the predictor has information about my action which I lack; then, even if the predictor is poor, it must have a significant tie to me; for example, it might have observed me making similar decisions in the past. So there are copies of me behind the correlation.
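The screening-off claim in (1) can be illustrated with a toy joint distribution (numbers and setup purely hypothetical): the box simply echoes the predictor’s guess, so once that guess is known, the agent’s action carries no further information about the box.

```python
from itertools import product

# Toy model: a predictor guesses my action with 80% accuracy, and the box's
# output is just the predictor's guess. Joint distribution over (action, guess):
joint = {(a, g): 0.5 * (0.8 if a == g else 0.2)
         for a, g in product(["left", "right"], repeat=2)}

def p_box_left(action=None, guess=None):
    """P(box outputs 'left'), conditioned on whatever is known."""
    def match(a, g):
        return ((action is None or a == action) and
                (guess is None or g == guess))
    den = sum(pr for (a, g), pr in joint.items() if match(a, g))
    num = sum(pr for (a, g), pr in joint.items() if match(a, g) and g == "left")
    return num / den

# Without knowing the guess, the box covaries with my action...
assert p_box_left(action="left") > p_box_left(action="right")
# ...but once the guess is known, my action is screened off from the box.
assert p_box_left(action="left", guess="left") == p_box_left(action="right", guess="left") == 1.0
```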
“The bullet I want to bite is the weaker claim that anything subjunctively linked to me has me somewhere in its computation (including its past)”—That doesn’t describe this example. You are subjunctively linked to the dumb boxes, but they don’t have you in their past. The thing that has you in its past is the predictor.
I disagree, and I thought my objection was adequately explained. But I think my response will be more concrete/understandable/applicable if you first answer: how do you propose to reliably put an agent into the described situation?
The details of how you set up the scenario may be important to the analysis of the error in the agent’s reasoning. For example, if the agent just thinks the predictor is accurate for no reason, it could be that the agent just has a bad prior (the predictor doesn’t really reliably tell the truth about the agent’s actions being correlated with the box). To that, I could respond that of course we can construct cases we intuitively disagree with by giving the agent a set of beliefs which we intuitively disagree with. (This is similar to my reason for rejecting the typical smoking lesion setup as a case against EDT! The beliefs given to the EDT agent in smoking lesion are inconsistent with the problem setup.)
I’m not suggesting that you were implying that, I’m just saying it to illustrate why it might be important for you to say more about the setup.
“How do you propose to reliably put an agent into the described situation?”—Why do we have to be able to reliably put an agent in that situation? Isn’t it enough that an agent may end up in that situation?
But in terms of how the agent can know the predictor is accurate, perhaps the agent gets to examine the predictor’s source code after it has run, and it’s implemented in hardware rather than software, so that the agent knows that it wasn’t modified?
But I don’t know why you’re asking so I don’t know if this answers the relevant difficulty.
(Also, I just wanted to check whether you’ve read the formal problem description in Logical Counterfactuals and the Co-operation Game.)
“How do you propose to reliably put an agent into the described situation?”—Why do we have to be able to reliably put an agent in that situation? Isn’t it enough that an agent may end up in that situation?
For example, we can describe how to put an agent into the counterfactual mugging scenario as normally described (where Omega asks for $10 and gives nothing in return), but critically for our analysis, one can only reliably do so by creating a significant chance that the agent ends up in the other branch (where Omega gives the agent a large sum if and only if Omega would have received the asked-for $10 in the other branch). If this were not the case, the argument for giving the $10 would seem weaker.
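To spell out why the reachability of the other branch matters, here is a toy expected-value calculation (payoff numbers purely illustrative): evaluated before the coin flip, the paying policy wins, but if the rewarding branch had probability zero, paying would simply lose $10.

```python
# Toy counterfactual mugging: before a fair coin flip, Omega will either
# (tails) ask the agent for $10, or (heads) pay a prize if and only if the
# agent's policy would have paid up on tails. Illustrative numbers only.

def policy_value(pays_up, p_heads=0.5, prize=10_000, cost=10):
    heads_branch = prize if pays_up else 0   # rewarded only for a paying policy
    tails_branch = -cost if pays_up else 0   # the $10 actually handed over
    return p_heads * heads_branch + (1 - p_heads) * tails_branch

assert policy_value(True) > policy_value(False)                            # 4995.0 > 0.0
assert policy_value(True, p_heads=0.0) < policy_value(False, p_heads=0.0)  # -10.0 < 0.0
```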
But in terms of how the agent can know the predictor is accurate, perhaps the agent gets to examine the predictor’s source code after it has run, and it’s implemented in hardware rather than software, so that the agent knows that it wasn’t modified?
I’m asking for more detail about how the predictor is constructed such that the predictor can accurately point out that the agent has the same output as the box. Similarly to how counterfactual mugging would be less compelling if we had to rely on the agent happening to have the stated subjunctive dependencies rather than being able to describe a scenario in which it seems very reasonable for the agent to have those subjunctive dependencies, your example would be less compelling if the box just happens to contain a slip of paper with our exact actions, and the predictor just happens to guess this correctly, and we just happen to trust the predictor correctly. Then I would agree that something has gone wrong, but all that has gone wrong is that the agent had a poor picture of the world (one which is subjunctively incorrect from our perspective, even though it made correct predictions).
On the other hand, if the predictor runs a simulation of us, and then purposefully chose a box whose output is identical to ours, then the situation seems perfectly sensible: “the box” that’s correlated with our output subjectively is a box which is chosen differently in cases where our output is different; and, the choice-of-box contains a copy of us. So the example works: there is a copy of us somewhere in the computation which correlates with us.
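A minimal sketch of that predictor (hypothetical code, not anyone’s actual proposal): the copy of the agent lives in the simulation step that selects the box, not in the box itself.

```python
# The two "dumb boxes" know nothing about the agent...
def box_true():
    return True

def box_false():
    return False

# ...but the predictor simulates the agent and hands over whichever box
# matches, so the agent appears in the computation that *chose* the box.
def predictor(agent):
    simulated_output = agent()   # the copy of "us" lives here
    return box_true if simulated_output else box_false

def agent():
    return True   # stand-in for our actual deliberation

box = predictor(agent)
assert box() == agent()   # the subjective correlation, explained causally
```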
I’ve read it now. I think you could already have guessed that I agree with the ‘subjective’ point and disagree with the ‘meaningless to consider the case where you have full knowledge’ point.
“‘The box’ that’s correlated with our output subjectively is a box which is chosen differently in cases where our output is different; and, the choice-of-box contains a copy of us. So the example works”—that’s a good point, and if you examine the source code, you’ll know it was choosing between two boxes. Maybe we need an extra layer of indirection. There’s a Truth Tester who can verify that the Predictor is accurate by examining its source code, and you only get to examine the Truth Tester’s code, so you never end up seeing the code within the Predictor that handles the case where the box doesn’t have the same output as you. As far as you are subjectively concerned, that doesn’t happen.
Ok, so you find yourself in this situation where the Truth Tester has verified that the Predictor is accurate, and you’ve verified that the Truth Tester is accurate, and the Predictor tells you that the direction you’re about to turn your head has a perfect correspondence to the orbit of some particular asteroid. Lacking the orbit information yourself, you now have a subjective link between your next action and the asteroid’s path.
This case does appear to present some difficulty for me.
I think this case isn’t actually so different from the previous case, because although you don’t know the source code of the Predictor, you might reasonably suspect that the Predictor picks out an asteroid after predicting you (or, selects the equation relating your head movement to the asteroid orbit after picking out the asteroid). We might suspect this precisely because it is implausible that the asteroid is actually mirroring our computation in a more significant sense. So using a Truth Teller intermediary increases the uncertainty of the situation, but increased uncertainty is compatible with the same resolution.
What your revision does do, though, is highlight how the counterfactual expectation has to differ from the evidential conditional. We may think “the Predictor would have selected a different asteroid (or different equation) if its computation of our action had turned out different”, but, we now know the asteroid (and the equation); so, our evidential expectation is clearly that the asteroid has a different orbit depending on our choice of action. Yet, it seems like the sensible counterfactual expectation given the situation is … hm.
Actually, now I don’t think it’s quite that the evidential and counterfactual expectation come apart. Since you don’t know what you actually do yet, there’s no reason for you to tie any particular asteroid to any particular action. So, it’s not that in your state of uncertainty choice of action covaries with choice of asteroid (via some particular mapping). Rather, you suspect that there is such a mapping, whatever that means.
In any case, this difficulty was already present without the Truth Teller serving as intermediary: the Predictor’s choice of box is already known, so even though it is sensible to think of the chosen box as what counterfactually varies based on choice of action, on-the-spot what makes sense (evidentially) is to anticipate the same box having different contents.
So, the question is: what’s my naive functionalist position supposed to be? What sense of “varies with” is supposed to necessitate the presence of a copy of me in the (logico-)causal ancestry of an event?
It occurs to me that although I have made clear that I (1) favor naive functionalism and (2) am far from certain of it, I haven’t actually made clear that I further (3) know of no situation where I think the agent has a good picture of the world and where the agent’s picture leads it to conclude that there’s a logical correlation with its action which can’t be accounted for by a logical cause (i.e., something like a copy of the agent somewhere in the computation of the correlated thing). That is, if there are outright counterexamples to naive functionalism, I think they’re actually tricky to state, and I have at least considered a few cases—your attempted counterexample comes as no surprise to me and I suspect you’ll have to try significantly harder.
My uncertainty is, instead, in the large ambiguity of concepts like “instance of an agent” and “logical cause”.