Really nice post. One thing I’m curious about is this line:
> This provides some intuitions about what sort of predictor you’d need to get a non-delusional agent—for instance, it should be possible if you simulate the agent’s entire boundary.
I don’t see the connection here? Haven’t read the paper though.
Thanks! Yeah, this isn’t in the paper; it’s just something I’m fairly sure of that probably deserves a more thorough treatment elsewhere. In the meantime, some rough intuitions:
- Delusions are a result of causal confounders, which must be hidden upstream variables.
- If you actually simulate, and therefore fully specify, an entire Markov blanket, it screens off all other upstream variables, including every possible confounder.
- This is ludicrously difficult for agents with a long history (like a human), but if the STF story is correct, it’s sufficient. Crucially, you don’t even need to know the full causal structure of reality, just a complete Markov blanket.
- Any holes in the Markov blanket/boundary are ways for unintended causal pathways to leak through; these separate the predictor’s predictions about the effect of an action from the actual causal effect of that action, making the agent appear ‘delusional’. (The toy sketch below illustrates the complete-blanket case.)
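To make the screening-off point concrete, here’s a quick numpy sketch (not from the paper, just a standard back-door-adjustment toy model): a hidden confounder `u` drives both the agent’s observation `o` (standing in for a complete boundary) and the outcome `y`. Conditioning on the action alone gives a ‘delusional’ estimate of the action’s effect; adjusting over the full blanket recovers the interventional effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hidden confounder: an upstream variable the predictor never sees directly.
u = rng.integers(0, 2, n)

# The agent's boundary/blanket: in this toy model the agent's only input
# is an observation o, and o = u, so the blanket is complete.
o = u.copy()

# Policy: the agent mostly copies its observation, with some exploration,
# so the action ends up correlated with the confounder.
explore = rng.random(n) < 0.2
a = np.where(explore, rng.integers(0, 2, n), o)

# Outcome depends on both the confounder and the action.
y = u + a + rng.normal(0, 0.1, n)

# 1) Naive prediction: condition only on the action.
#    This mixes in the confounder's effect -- the 'delusional' estimate.
naive = y[a == 1].mean() - y[a == 0].mean()

# 2) Ground truth: intervene on the action directly, do(A=a).
y_do1 = (u + 1 + rng.normal(0, 0.1, n)).mean()
y_do0 = (u + 0 + rng.normal(0, 0.1, n)).mean()
interventional = y_do1 - y_do0

# 3) Condition on the full blanket o as well as the action, then average
#    over o: the blanket screens off u, so this matches the intervention.
adjusted = 0.0
for val in (0, 1):
    mask = o == val
    p_o = mask.mean()
    effect = y[mask & (a == 1)].mean() - y[mask & (a == 0)].mean()
    adjusted += p_o * effect

print(f"naive (conditioning on A only): {naive:.3f}")        # ~1.8
print(f"interventional do(A):           {interventional:.3f}")  # ~1.0
print(f"adjusted over full blanket O:   {adjusted:.3f}")      # ~1.0
```

If you instead give the predictor only a leaky version of `o` (a hole in the blanket), the adjusted estimate drifts back toward the naive one, which is the failure mode described in the last bullet.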
I hope we’ll have a proper writeup soon; in the meantime, let me know if this doesn’t make sense.