So then “omega” (in quotes because I am assuming a slightly less omniscient being than usually implied by the name) would be computing the implications of one algorithm, while your output would effectively be the output of a different algorithm. So that weakens the correlation that justifies PD cooperation, Newcomb one-boxing, etc.
Yes, but like I said the first time around, this would be a rare event, rare enough to be discounted if all that Omega cares about is maximizing the chance of guessing correctly. If Omega has some other preferences over the outcomes (a “safe side” it wants to err on), and if the chance is large enough, it may have to change its choice based on this possibility.
So, here’s what I have your preferred representation as:
“Platonic space of algorithms” and “innards” both point to “selector” (the actual space of algorithms influences the selector, I assume); “innards” and “Platonic space” also together point to “Omega’s prediction”, but selector does not, because your omega can’t see the things that can cause it to err. Then, “Omega’s prediction” points to box content and selector points to your choice. Then, of course, box content and your choice point to payout.
Further, you say the choice the agent makes is at the innards node.
Is that about right?
Even if rare, the decision theory used should at least be able to THINK ABOUT THE IDEA of a hardware error or such. Even if it dismisses it as not worth considering, it should at least have some means of describing the situation. ie, I am capable of at least considering the possibility of me having brain damage or whatever. Our decision theory should be capable of no less.
Sorry if I’m unclear here, but my focus isn’t so much on omega as trying to get a version of TDT that can at least represent that sort of situation.
You seem to more or less have it right. Except I’d place the choice more at the selector or at the “node that represents the specific abstract algorithm that actually gets used”.
As per TDT, choose as if you get to decide what the output for the abstract algorithm should be. The catch is that here there’s a bit of uncertainty as to which abstract algorithm is being computed. So if, due to a cosmic ray striking and causing a bitflip at a certain point in the computation, you end up actually computing algorithm 1B while omega models you as being algorithm 1A, then that’d potentially be a weakening of the dependence. (Again, just using the Newcomb problem as a way of talking about this.)
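To put rough numbers on that weakening, here is a minimal sketch. It assumes the standard Newcomb payoffs ($1,000,000 and $1,000, which are not stated in this thread), an omega that perfectly predicts the algorithm you intended to run (1A), and a hypothetical probability `eps` that a bitflip makes you actually output the opposite choice (1B). All names are made up for illustration.

```python
BIG, SMALL = 1_000_000, 1_000  # assumed standard Newcomb payoffs


def payoff(prediction, action):
    """Payout given omega's prediction and the action actually taken."""
    box_b = BIG if prediction == "one-box" else 0
    return box_b if action == "one-box" else box_b + SMALL


def flip(action):
    """The opposite action, i.e. what a bitflip would produce."""
    return "two-box" if action == "one-box" else "one-box"


def expected_payoff(intended, eps):
    """Omega models the intended algorithm; with probability eps the
    hardware actually computes the opposite output."""
    prediction = intended  # omega sees algorithm 1A, not the bitflip
    return ((1 - eps) * payoff(prediction, intended)
            + eps * payoff(prediction, flip(intended)))


for eps in (0.0, 0.01, 0.5):
    print(eps, expected_payoff("one-box", eps), expected_payoff("two-box", eps))
```

In this toy model the intended one-boxer still comes out far ahead; what gets weaker as `eps` grows is the link between the prediction and the action that actually occurs.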
Okay, so there’d be another node between “algorithm selector” and “your choice of box”; that would still be an uninterrupted path (chain) and so doesn’t affect the result.
The problem, then, is that if you take the agent’s choice as being at “algorithm selector”, or any descendant through “your choice of box”, you’ve d-separated “your choice of box” from “Omega’s prediction”, meaning that Omega’s prediction is conditionally independent of “your choice of box”, given the agent’s choice. (Be careful to distinguish “your choice of box” from where we’re saying the agent is making a choice.)
But then, we know that’s not true, and it would reduce your model to the “innards CSA” that AnnaSalamon gave above. (The parent of “your choice of box” has no parents.)
So I don’t think that’s an accurate representation of the situation, or consistent with TDT. So the agent’s choice must be occurring at the “innards node” in your graph.
(Note: this marks the first time I’ve drawn a causal Bayesian network and used the concept of d-separation to approach a new problem. w00t! And yes, this would be easier if I uploaded pictures as I went.)
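For anyone who wants to check the d-separation claims mechanically, here is a minimal sketch. It encodes the graph as described in words above (the node names are my shorthand, and the diagram itself is not reproduced here) and uses the standard ancestral moral graph test, with surgery modelled as cutting all edges into the chosen node.

```python
from itertools import combinations


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen


def d_separated(parents, x, y, given=()):
    """True if x and y are d-separated given `given` (moral-graph test)."""
    keep = ancestors(parents, {x, y, *given})
    # Moralize the ancestral subgraph: undirected parent-child edges,
    # plus an edge between every pair of co-parents.
    nbrs = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for a, b in combinations(ps, 2):
            nbrs[a].add(b)
            nbrs[b].add(a)
    # Drop the conditioning set, then search for any path from x to y.
    blocked, seen, stack = set(given), set(), [x]
    while stack:
        n = stack.pop()
        if n == y:
            return False
        if n in seen or n in blocked:
            continue
        seen.add(n)
        stack.extend(nbrs[n])
    return True


def surgery(parents, node):
    """Cut every edge into `node`, as when fixing a 'would' node."""
    cut = dict(parents)
    cut[node] = []
    return cut


# child -> list of parents, following the verbal description of the graph
graph = {
    "selector": ["algorithm space", "innards"],
    "prediction": ["algorithm space", "innards"],
    "box content": ["prediction"],
    "choice": ["selector"],
    "payout": ["box content", "choice"],
}

print(d_separated(graph, "choice", "prediction"))
print(d_separated(surgery(graph, "selector"), "choice", "prediction"))
print(d_separated(surgery(graph, "innards"), "choice", "prediction"))
```

With nothing conditioned on, the first and third checks report d-connection, while surgery on “selector” leaves “choice” cut off from “prediction”, which is the problem described above.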
Not sure where you’re getting that extra node from. The agent’s choice is the output of the abstract algorithm they actually end up computing as a result of all the physical processes that occur.
Abstract algorithm space feeds into both your algorithm selector node and the algorithm selector node in “omega”’s model of you. That’s where the dependence comes from.
So given logical uncertainty about the output of the algorithm, wouldn’t they be d-connected? They’d be d-separated if the choice was already known… but if it was, there’d be nothing left to choose, right? No uncertainties to be dependent on each other in the first place.
Actually, maybe I ought to draw a diagram of what I have in mind and upload to imgur or whatever.
Alright, after thinking about your points some more, and refining the graph, here’s my best attempt to generate one that includes your concerns: Link.
Per AnnaSalamon’s convention, the agent’s would-node-surgery is in a square box, with the rest elliptical and the payoff octagonal. Some nodes included for clarity that would normally be left out. Dotted lines indicate edges that are cut for surgery when fixing “would” node. One link I wasn’t sure about has a “?”, but it’s not that important.
Important points: The cutting of parents for the agent’s decision preserves d-connection between box choice and box content. Omega observes innards and attempted selection of algorithm but retains uncertainty as to how the actual algorithm plays out. Innards contribute to the hardware’s failures to accurately implement the algorithm (as do [unshown] exogenous factors).
And I do hope you follow up, given my efforts to help you spell out your point.
Just placing this here now as sort of a promise to follow up. Just that I’m running on insufficient sleep, so can only do “easy stuff” at the moment. :) I certainly plan on following up on our conversation in more detail, once I get a good night’s sleep.
Understood. Looking forward to hearing your thoughts when you’re ready :-)
Having looked at your diagram now, that’s not quite what I have in mind. For instance, “what I attempt to implement” is kinda an “innards” issue rather than deserving a separate box in this context.
Actually, I realized that what I want to do is kind of weird, sort of amounting to doing surgery on a node while being uncertain as to what node you’re doing the surgery on. (Or, alternately, being uncertain about certain details of the causal structure). I’m going to have to come up with some other notation to represent this.
Before we continue… do you have any objection to me making a top level posting for this (drawing out an attempt to diagram what I have in mind and so on)? Frankly, even if my solution is complete nonsense, I really do think that this problem is an issue that needs to be dealt with as a larger issue.
Begun working on the diagram, still thinking through the exact way to draw it. I’ll probably have to use a crude hack of simply showing lots of surgery points and basically saying “do surgery at each of these one at a time, weighting the outcome by the probability that that’s the one you’re actually effectively operating on”. (This will (hopefully) make more sense in the larger post.)
Grr! That was my first suggestion!
Not that weird, actually. I think you can do that by building a probabilistic twin network. See the good Pearl summary, slide 26. Instead of using it for a counterfactual, surgically set a different node in each subnetwork, and also set the probabilities coming from the common parent (U in slide 26) to represent the probability of each subnetwork being the right one. Then use all terminal nodes across both subnetworks as the outcome set for calculating probability.
Though I guess that amounts to what you were planning anyway. Another way might be to use multiple dependent exogenous variables that capture the effect of cutting one edge when you thought you were cutting another.
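The simpler “do surgery at each point and weight by probability” version can be sketched as a plain probability-weighted average. This is not Pearl’s twin-network construction, only the weighting scheme; the candidate node names, the weights, and the per-surgery utilities below are made-up placeholders (standard Newcomb payoffs assumed), not values from this discussion.

```python
def mixed_surgery_value(weights, value_under_surgery, action):
    """Expected utility of `action` when unsure which node the surgery hits.

    weights: {node: probability that this is the node actually operated on}
    value_under_surgery: function (node, action) -> expected utility when
        `action` is surgically set at `node`
    """
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("surgery-point probabilities must sum to 1")
    return sum(p * value_under_surgery(node, action)
               for node, p in weights.items())


# Hypothetical: with probability 0.99 you are really setting the output of
# algorithm 1A (which omega models), with probability 0.01 that of 1B.
weights = {"algorithm 1A": 0.99, "algorithm 1B": 0.01}

# Placeholder utilities. For 1B, omega's prediction is assumed to be a
# 50/50 guess that does not depend on your action.
table = {
    ("algorithm 1A", "one-box"): 1_000_000,
    ("algorithm 1A", "two-box"): 1_000,
    ("algorithm 1B", "one-box"): 500_000,
    ("algorithm 1B", "two-box"): 501_000,
}


def lookup(node, action):
    """Utility of `action` under surgery at `node`, read from the table."""
    return table[(node, action)]


best = max(("one-box", "two-box"),
           key=lambda a: mixed_surgery_value(weights, lookup, a))
print(best)
```

The choice of weights does all the work here: as more probability shifts to surgery points that omega does not model, the advantage of one-boxing shrinks.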
No problem, just make sure to link this discussion.
*clicks first link*
And I said that was more or less right, didn’t I? ie, “what I attempt to implement” ~= “innards”, which points to “selector”/“output”, which selects what actually gets used.
Looking through the second link (ie, the slides) now.
Okay, I think there are some terminological issues to sort out here, resulting from our divergence from AnnaSalamon’s original terminology.
The discussion I thought we were having corresponds to the CSA’s calculation of “woulds”. And when you calculate a would, you surgically set the output of the node, which means cutting the links to its parents.
Is this where we are? Are you saying the “would” should be calculated from surgery on the “algorithm selector” node (which points to “choice of box”)? Because in that case, the links to “algorithm selector” from “algorithm space” and “innards” are cut, which d-separates them. (ETA: to clarify: d-separates “box choice” from Omega and its descendants.)
OTOH, even if you follow my suggestion and do surgery on “innards”, the connection between “box choice” and “omega’s prediction” is only a weak link—algorithm space is huge.
Perhaps you also want an arrow from “algorithm selector” to “omega’s prediction” (you don’t need a separate node for “Omega’s model of your selector” because it chains). Then, the possible difference between the box choice and omega’s prediction emerges from the independent error term pointing to box choice (which accounts for cosmic rays, hardware errors, etc.). There is a separate (implicit) “error parent” for the “Omega’s prediction” node, which accounts for shortcomings of Omega’s model.
This preserves d-connection (between box choice and box content) after a surgery on “algorithm selector”. Is that what you’re aiming for?
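As a rough numeric check on that picture, the sketch below fixes “algorithm selector” by surgery and gives “box choice” and “Omega’s prediction” each an independent error parent that flips the value. The error rates are hypothetical parameters, and treating the values as binary (one-box vs. two-box) is an assumption.

```python
from itertools import product


def joint_after_surgery(selector, hw_err, omega_err):
    """Exact joint distribution of (box choice, prediction) under do(selector).

    selector: 0 or 1, the surgically fixed output of the selector
    hw_err: probability the hardware error term flips the box choice
    omega_err: probability Omega's model error flips the prediction
    """
    dist = {}
    for flip_choice, flip_pred in product((0, 1), repeat=2):
        p_choice = hw_err if flip_choice else 1 - hw_err
        p_pred = omega_err if flip_pred else 1 - omega_err
        key = (selector ^ flip_choice, selector ^ flip_pred)
        dist[key] = dist.get(key, 0.0) + p_choice * p_pred
    return dist


def agreement(hw_err, omega_err, selector=1):
    """Probability that the prediction matches the actual box choice."""
    dist = joint_after_surgery(selector, hw_err, omega_err)
    return sum(p for (choice, pred), p in dist.items() if choice == pred)


print(agreement(0.0, 0.0))
print(agreement(0.01, 0.02))
```

Because the prediction follows whatever value the surgery sets (up to `omega_err`), box choice and box content stay dependent across the two possible settings; the error terms only loosen the match.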
(Causal Bayes nets are kinda fun!)