Like causality, intention & agency seem to me intensely tied up with an incomplete and coarse-grained model of the world.
This seems right to me; there’s probably a deep connection between multi-level world models and causality / choices / counterfactuals.
We cannot ask a rock to consider hypothetical scenarios. Neither can we ask an ant to do so.
This seems unclear to me. If I reduce intelligence to circuitry, it looks like the rock is the null circuit that does no information processing, the ant is a simple circuit that does some simple processing, and a human is a very complex circuit that does very complex processing. The rock has no sensors to vary, but the ant does, and thus we could investigate a meaningful counterfactual universe in which the ant would behave differently were it presented with different stimuli.
Is the important thing here that the circuitry instantiates some consideration of counterfactual universes within the factual universe? I don’t know enough about ant biology to say whether ants can ‘imagine’ things in the right sense, but consider the thermostat, the simplest circuit that I view as having some measure of ‘intelligence’ or ‘optimization power’ or whatever: it’s clear that the thermostat isn’t doing this sort of counterfactual reasoning (it simply detects whether it’s in state A or state B and activates an actuator accordingly).
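To make that contrast concrete, here is a minimal sketch, purely my own illustration (the `thermostat` function and its `setpoint` parameter are invented for this example): the thermostat is just a stimulus-response mapping, yet an outside observer can still pose counterfactuals by varying its sensor input, even though nothing inside the circuit represents those other worlds.

```python
def thermostat(temperature, setpoint=20.0):
    """Detect whether we're in state A (too cold) or state B (warm enough)
    and activate the actuator accordingly; no other worlds are represented."""
    return "heater_on" if temperature < setpoint else "heater_off"

# From the outside we can still pose counterfactuals by varying the sensor
# input, even though the circuit itself never considers them:
for hypothetical_temp in (15.0, 25.0):
    print(hypothetical_temp, "->", thermostat(hypothetical_temp))
```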
If so, this looks like trying to ground out ‘what are counterfactuals?’ in terms of the psychology of reasoning: it feels to me like I could have chosen to get something to drink or to keep typing, and the interesting thing is where that feeling comes from (and what role it serves, and so on). Maybe another way to think of this is something like “what are hypotheticals?”: when I consider a theorem, it seems like the theorem could be true or false, and the process of building out those internal worlds until one collapses is potentially quite different from the standard presentation of a world of Bayesian updating. Similarly, when I consider my behavior, it seems like I could take many actions, and then eventually some action happens. Even if I never take action A (and never would have, for various low-level deterministic reasons), it’s still part of my hypothetical space, as considered in the real universe. Here, ‘actions I could take’ has a real instantiation, as ‘hypotheticals I’m considering implicitly or explicitly’, complete with my confusions about those actions (“oh, turns out that action was ‘choke on water’ instead of ‘drink water’. Oops.”), as opposed to some Platonic set of possible actions; the thermostat that isn’t considering hypotheticals is rightly viewed as having ‘no actions’, even though it’s more reactive than a rock.
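A toy sketch of that framing, again my own and not anything from the original discussion (the `Hypothetical` and `HypotheticalReasoner` names are invented): the agent’s ‘action space’ just is the set of hypotheticals it explicitly constructs, possibly-confused labels and all, while the thermostat’s corresponding set is empty.

```python
from dataclasses import dataclass

@dataclass
class Hypothetical:
    label: str              # the agent's possibly-confused description of the action
    predicted_outcome: str
    value: float

class HypotheticalReasoner:
    def __init__(self):
        # The 'action space' just is whatever gets imagined here.
        self.considered = []

    def imagine(self, label, predicted_outcome, value):
        self.considered.append(Hypothetical(label, predicted_outcome, value))

    def choose(self):
        # Pick the best hypothetical actually considered; unimagined actions
        # simply aren't part of the space, even if physically available.
        return max(self.considered, key=lambda h: h.value)

agent = HypotheticalReasoner()
agent.imagine("drink water", "quench thirst", 1.0)   # might really be 'choke on water'
agent.imagine("keep typing", "finish the comment", 0.8)
print(agent.choose().label)

# The thermostat, by contrast, has an empty `considered` list: it has
# 'no actions' in this sense, even though it's more reactive than a rock.
```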
This seems promising, but it collides with one of the major obstacles I have in thinking about embedded agency: the descriptive question of “how am I doing hypothetical reasoning?” seems somewhat detached from the prescriptive question of “how should I be doing hypothetical reasoning?” and from the idealized question of “what are counterfactuals?”. It’s not obvious that we have an idealized ‘set of possible actions’ to approximate, and if we build up from my present reasoning processes, it seems likely that there will be some sort of ontological shift, corresponding to an upgrade, that might break lots of important guarantees. That said, this may be the best we have to work with.