First, we know which part of the causal model corresponds to the human, which is not the case in a NN.
This doesn’t follow only from [we know X is an LCDT agent that’s modeling a human] though, right? We could imagine some predicate/constraint/invariant that detects/enforces/maintains LCDTness without necessarily being transparent to humans. I’ll grant you it seems likely so long as we have the right kind of LCDT agent—but it’s not clear to me that LCDTness itself is contributing much here.
The human will be modeled only by the variables in this part of the causal graph, whereas in a NN the model of the human could be completely distributed.
At first sight this seems at least mostly right—but I do need to think about it more. E.g. it seems plausible that most of the work of modeling a particular human H fairly accurately is in modeling [humans-in-general] and then feeding H’s properties into that. The [humans-in-general] part may still be distributed. I agree that this is helpful. However, I do think it’s important not to assume things are so nicely spatially organised as they would be once you got down to a molecular level model.
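To make the "which nodes model the human" point concrete, here's a minimal sketch (all node names and graph structure are hypothetical, not taken from the post): a causal model stored as an explicit DAG with the human-modeling nodes tagged, plus the LCDT-style graph surgery that cuts the decision's edges into those nodes at planning time.

```python
from typing import Dict, Set

# Hypothetical toy causal model: parents[node] = set of parent nodes.
parents: Dict[str, Set[str]] = {
    "agent_action":      set(),
    "human_observation": {"agent_action"},
    "human_belief":      {"human_observation"},
    "human_response":    {"human_belief"},
    "world_outcome":     {"agent_action", "human_response"},
}

# The interpretability claim: the human is modeled only by these tagged nodes.
human_nodes: Set[str] = {"human_observation", "human_belief", "human_response"}

def lcdt_planning_graph(parents: Dict[str, Set[str]],
                        decision: str,
                        agent_nodes: Set[str]) -> Dict[str, Set[str]]:
    """Graph the LCDT agent uses when evaluating its decision: every edge
    from the decision node into an agent-modeling node is cut."""
    return {
        node: {p for p in ps if not (p == decision and node in agent_nodes)}
        for node, ps in parents.items()
    }

planning = lcdt_planning_graph(parents, "agent_action", human_nodes)
assert "agent_action" not in planning["human_observation"]  # influence on the human is severed
assert "agent_action" in planning["world_outcome"]          # direct influence on the world remains
```

The point of the sketch is just that "these nodes model the human" and "the LCDT cut holds" are both properties you can check mechanically on an explicit graph; neither has an obvious analogue you can read off NN weights, though as noted above the human-modeling part could still be less localised than this toy example suggests.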
A causal model seems to give way more information than a NN, because it encodes causal relationships explicitly, whereas a NN could compute those relationships in a weird and counterintuitive way.
My intuitions are in the same direction as yours (I’m playing devil’s advocate a bit here—shockingly :)). I just don’t have principled reasons to think it actually ends up more informative.
I imagine learned causal models can be counter-intuitive too, and I think I’d expect this by default. I agree that it seems much cleaner so long as it’s using a nice ontology with nice abstractions… but is that likely? Would you guess it’s easier to get the causal model to do things in a ‘nice’, ‘natural’ way than it would be for an NN? Quite possibly it would be.
Me too!
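On the "encodes the causal relationship" point, here's a toy illustration of the extra information an explicit structural causal model gives you (the structure and coefficients below are invented for illustration): interventional queries can be answered directly by overriding a mechanism, whereas a predictor fit to observational data only answers the corresponding conditional query and hides the causal direction somewhere in its weights.

```python
import random

def scm_sample(do=None):
    """Tiny hypothetical linear SCM: action -> human_response -> outcome.
    Entries in `do` override a variable, severing its usual mechanism."""
    do = do or {}
    v = {}
    v["action"] = do.get("action", random.gauss(0, 1))
    v["human_response"] = do.get(
        "human_response", 0.8 * v["action"] + random.gauss(0, 0.1))
    v["outcome"] = do.get(
        "outcome", v["human_response"] - 0.2 * v["action"] + random.gauss(0, 0.1))
    return v

# Interventional query do(human_response = 1.0): its effect on outcome is
# read off the explicit graph, regardless of how action and human_response
# happen to correlate observationally.
samples = [scm_sample(do={"human_response": 1.0})["outcome"] for _ in range(10_000)]
print(sum(samples) / len(samples))  # ≈ 1.0, since E[action] = 0
```

A NN fit to observational samples of (action, human_response, outcome) can match the conditional distribution, but nothing forces it to represent the edge directions or the intervention behaviour anywhere readable, which is the worry about "weird and counterintuitive" internal computation above.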