You are assuming some relationship between agency and free will that has not been spelt out. Also, an entirely woo-free notion of agency is a ubiquitous topic on this site, as has been pointed out to you before.
I must have missed it, or it didn’t make sense to me...
In my view, agency is sort of like life: hard to define in itself, but its results are fairly obvious. Life tends to spread to fill all possible niches. Agents tend to steer the world toward certain states. But this only shows that you don’t need top-down causation to notice an agent (setting aside the hypothetical objection that “noticing” is itself a top-down process; the point is that “is this an agent?” is a fairly objective and non-agenty decision problem).
How can you affect lower levels by doing things on higher levels? By “doing things on a higher level”, what you are really doing is changing the lower level so that it appears a certain way on a higher level.
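To make this concrete, here is a minimal sketch in Python of the kind of woo-free picture I have in mind; the thermostat setup, the 20-degree target, and the function names are purely my own illustration, not a formal definition of agency:

```python
# A purely mechanical, low-level update rule. Nothing here "wants" anything.
def low_level_step(temperature: float) -> float:
    """One dumb update: nudge the temperature slightly toward 20."""
    return temperature + 0.1 * (20.0 - temperature)


# An outside-view, non-agenty test: does repeatedly applying `step`
# shrink the distance to `target`? No top-down causation is consulted.
def looks_like_an_agent(step, state: float, target: float, n: int = 100) -> bool:
    start_gap = abs(state - target)
    for _ in range(n):
        state = step(state)
    return abs(state - target) < start_gap


print(looks_like_an_agent(low_level_step, state=5.0, target=20.0))  # True
```

Described at a high level, this system “steers the room toward 20 degrees”; described at a low level, there is only the update rule. The high-level action is nothing over and above the low-level changes.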
If what you say is correct, we should expect MIRI to claim that two atom-level identical computers could nonetheless differ in Agency. I strongly predict the opposite—MIRI’s viewpoint is reductionist and physicalist, to the best of my knowledge.
I am not denying that the more complex an animal is, the more agency it appears to possess. On the other hand, the more we know about an agent, the less agency it appears to possess. What the ancients thought of as agent-driven behavior, we now see as natural phenomena with no free will or decision making involved. We still tend to anthropomorphize natural phenomena a lot (e.g. evolution, fortune), often implicitly assigning agency to them without realizing or admitting it. Teleology can, of course, still be a useful model at times, even in physics and especially in programming.
It also ought to be obvious, but isn’t, that there is a big disconnect between “I decide” and “I am an algorithm”. You can often read, here and even in MIRI papers, that agents can act contrary to their programming (that is where counterfactuals show up). A quote from Abram Demski:
Suppose you know that you take the $10. How do you reason about what would happen if you took the $5 instead?
Your prediction, as far as I can tell, has been falsified. An agent magically steps away from its own programming by thinking about counterfactuals.
It’s been programmed to think about counterfactuals.
Either you are right that “you know that you take the $10”, or you are mistaken about this knowledge; it cannot be both, unless you subscribe to a model in which a programmed certainty can be changed.
1. Are you saying that the idea of a counterfactual inherently requires Transcending Programming, or that thinking about personal counterfactuals requires ignoring the fact that you are programmed?
2. Counterfactuals aren’t real. They do not correspond to logical possibilities; that is what the word means: they run “counter” to “fact”. But in the absence of perfect self-knowledge, even knowing that you are fully deterministic, you still cannot know what you are going to do. So you are not required to think about something that you know would require Transcending Programming, even if it is objectively the case that doing it in reality would require Transcending Programming. A toy sketch below illustrates the self-knowledge point.
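To illustrate point 2 (the $5/$10 framing is borrowed from the quote above; the functions and numbers are otherwise my own invention, not MIRI’s formalism):

```python
# A fully deterministic chooser that still cannot "know" its output
# any more cheaply than by doing the deliberation itself.

def utility(option: int) -> int:
    """Some nontrivial deterministic computation standing in for deliberation."""
    return sum((option * k) % 7 for k in range(1, 1000))


def decide() -> int:
    """Deterministic choice between taking the $5 and taking the $10."""
    return 10 if utility(10) >= utility(5) else 5


# Before the deliberation finishes, the program has no privileged shortcut
# to its own answer; in general, "knowing what I will do" can cost as much
# as doing the deciding.
print(decide())
```

The program is as deterministic as it gets, yet from the inside the only general way to learn its own choice is to run the very computation that constitutes the choice.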
I posted about this before here: Logical Counterfactuals are low-res. I think you are saying the same thing. And yes, analyzing one’s own decision-making algorithms and adjusting them can be very useful. However, Abram’s statement, as I understand it, carries no explicit qualifier of incomplete self-knowledge. Quite the opposite: it says “Suppose you know that you take the $10”, not “You start with a first approximation that you take the $10 and then explore further”.
You’re right: I hadn’t noticed my confusion before, but Demski’s view doesn’t actually make much sense to me. The agent knows for certain that it will take $X? How can it know that without simulating its own decision process? But if “simulate my decision process, then use the result as the basis for counterfactuals” is itself part of the decision process, you get an infinite regress. (Possible connection to fixed points?)
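Here is that regress in its most naive form, as a deliberately broken toy (entirely my own construction, not Demski’s setup):

```python
import sys


def simulate_self() -> int:
    """Find out what my decision process will output, by running it."""
    return decide()


def decide() -> int:
    # Step 1: know for certain what I will take, by simulating myself...
    predicted = simulate_self()  # <- the regress starts here
    # Step 2: ...only then reason about counterfactuals relative to that knowledge.
    return 10 if predicted == 10 else 5


try:
    sys.setrecursionlimit(100)  # keep the inevitable failure quick
    decide()
except RecursionError:
    print("Self-simulation inside the decision process never bottoms out.")
```

Whatever replaces the full self-simulation has to make this bottom out, which is presumably where a fixed-point argument would come in.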
I don’t think Demski is saying that the agent would magically jump from taking $X to taking $Y. I think he’s saying that agents which fully understand their own behavior would be trapped by this knowledge because they can no longer form “reasonable” counterfactuals. I don’t think he’d claim that Agenthood can override fundamental physics, and I don’t see how you’re arguing that his beliefs, unbeknownst to him, are based on the assumption that Agenthood can override fundamental physics.
I cannot read his mind; odds are I have misinterpreted what he meant. But if MIRI does not think that counterfactuals as they appear to be (“I could have made a different decision but didn’t, by choice”) are fundamental, then I would expect a careful analysis of that issue somewhere. Maybe I missed it. I posted on a related topic some five months ago and got some interesting feedback from jessicata (Jessica Taylor of MIRI) in the comments.
I would like to see at least some ideas as to how agency can arise without top-down causation.