You are talking about a “verifier for explanations”. I don’t know how an explanation could be verified under constructivist epistemology and pragmatist meta-epistemology.
I’ve recently thought about the relationship between GFlowNets and constructivism. Here are some excerpts, unedited, but hopefully helpful in some way to someone.
It’s interesting that GFlowNets suggest constructing a trajectory, i.e., an explanation, rather than sampling it with Monte Carlo methods, such as Markov Chain Monte Carlo (MCMC) or the Monte Carlo Tree Search used in current ActInf-based AI architectures (Fountas et al. 2020). This can be either an explanation for oneself, justifying the most proximate action to take (as in Active Inference: an agent creates a plan, takes the first step (action) from it, and then creates a new plan), or an explanation for others, explaining actions that have already been taken (P_B(\tau|x) in GFlowNets).
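To make the forward/backward distinction concrete, here is a minimal toy sketch (my own illustration, not the actual GFlowNet training algorithm; all names and the string-building domain are made up) of constructing a trajectory forward with P_F, taking only its first step in the active-inference style, and walking backwards with P_B to “explain” an already-built object:

```python
# Toy sketch: "objects" x are strings built one token at a time.
import random

TOKENS = ["a", "b", "c"]
MAX_LEN = 4

def forward_policy(state):
    """P_F(action | state): uniform here; a trained GFlowNet would
    parameterise this with a neural network."""
    return {t: 1.0 / len(TOKENS) for t in TOKENS}

def backward_policy(state):
    """P_B(parent | state): for strings the only parent is the string with
    the last token removed, so P_B is deterministic in this toy domain."""
    return {state[:-1]: 1.0}

def construct_trajectory():
    """'Explanation for oneself': build a trajectory forward from the empty state."""
    state, trajectory = "", [""]
    while len(state) < MAX_LEN:
        probs = forward_policy(state)
        action = random.choices(list(probs), weights=list(probs.values()))[0]
        state += action
        trajectory.append(state)
    return trajectory

def plan_and_act():
    """Active-inference-style loop: plan a full trajectory, commit only to its
    first step, then re-plan from the new state."""
    plan = construct_trajectory()
    return plan[1]  # state after the most proximate action

def explain(x):
    """'Explanation for others': given a terminal object x, walk backwards with
    P_B(tau | x) to recover one constructive history of x."""
    state, trajectory = x, [x]
    while state:
        parents = backward_policy(state)
        state = random.choices(list(parents), weights=list(parents.values()))[0]
        trajectory.append(state)
    return list(reversed(trajectory))

print(plan_and_act())   # e.g. "b"
print(explain("abca"))  # ["", "a", "ab", "abc", "abca"]
```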
So, it seems that GFlowNets fit Deutsch’s account of creative explanations rather well. Deutsch (together with Pearl) contrasts constructed, subjective, individualised explanations with empiricism, Bayesianism (in the sense of associational reasoning, “rung one of the causality ladder”, per Pearl), and “counting averages and population stats”, which seems to correspond to MCMC methods. Bengio also explains why GFlowNets are statistically superior to variational (Bayesian) inference methods (though I don’t understand what he means by “mode-following”, “mean-following”, and “high variance gradients”):
The typical variational inference objective (the ELBO or reverse-KL) leads to mode-following (focussing on one mode) and the forward KL leads to mean-following (overly conservative, sampling too broadly) and annoying variance when implemented with importance sampling. Instead the off-policy GFlowNet objectives (e.g., with a tempered version of P_F as training policy) seem to strike a different balance and tend to recover more of the modes without the down-side of the forward-KL variational inference variants (mean-following and high variance gradients).
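For what it’s worth, here is my rough gloss on those terms (this is my reading of the variational-inference folklore and of the trajectory-balance objective from Malkin et al. 2022, not Bengio’s own exposition):

```latex
% Reverse KL (maximising the ELBO is equivalent to minimising it): zero-forcing,
% so q collapses onto one mode of p -- "mode-following".
D_{\mathrm{KL}}(q \,\|\, p) = \mathbb{E}_{x \sim q}\!\left[\log \tfrac{q(x)}{p(x)}\right]

% Forward KL: mass-covering, so q spreads over everything p touches --
% "mean-following". Since we cannot sample from p, it is estimated by importance
% sampling from q, which is where the high-variance gradients come from.
D_{\mathrm{KL}}(p \,\|\, q) = \mathbb{E}_{x \sim p}\!\left[\log \tfrac{p(x)}{q(x)}\right]
                            = \mathbb{E}_{x \sim q}\!\left[\tfrac{p(x)}{q(x)} \log \tfrac{p(x)}{q(x)}\right]

% Trajectory-balance objective (Malkin et al. 2022): it remains a valid training
% signal for trajectories \tau drawn from any full-support training policy,
% e.g. a tempered version of P_F -- hence "off-policy".
\mathcal{L}_{\mathrm{TB}}(\tau) =
  \left( \log \frac{Z_\theta \prod_t P_F(s_{t+1} \mid s_t;\, \theta)}
                   {R(x) \prod_t P_B(s_t \mid s_{t+1};\, \theta)} \right)^{\!2}
```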
With regard to this idea of “construction of explanations”, it’s tempting to try to find some parallels with Core Constructive Ontology, Baez and Stay’s ideas about system construction, and Deutsch and Marletto’s Constructor Theory, and, together with GFlowNets, to call these trends in epistemology, ontology, and philosophy of language “the constructive turn”, by analogy with the pragmatic turn in philosophy.
GFlowNets don’t assume a static causal graph, but stochastically construct one from the Bayesian posterior (given past evidence) over the space of all possible causal graphs.
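In that spirit (and in the spirit of work like Deleu et al.’s DAG-GFlowNet, which I’m only gesturing at here), a toy sketch of what “stochastically constructing a causal graph” could look like; the policy is left untrained and the posterior score is a placeholder, so this only shows the shape of the state/action space, not a working sampler:

```python
# States are partial DAGs over a fixed variable set; actions add one edge
# (keeping the graph acyclic) or stop; the terminal reward is an unnormalised
# posterior p(G | D) ∝ p(D | G) p(G).
import itertools, random

VARIABLES = ["X", "Y", "Z"]

def creates_cycle(edges, new_edge):
    """Would adding new_edge = (u, v) create a directed cycle?"""
    u, v = new_edge
    frontier, seen = [v], set()   # cycle iff u is already reachable from v
    while frontier:
        node = frontier.pop()
        if node == u:
            return True
        seen.add(node)
        frontier += [b for (a, b) in edges if a == node and b not in seen]
    return False

def legal_actions(edges):
    candidates = [(u, v) for u, v in itertools.permutations(VARIABLES, 2)
                  if (u, v) not in edges and not creates_cycle(edges, (u, v))]
    return candidates + ["stop"]

def unnormalised_posterior(edges):
    """Placeholder for p(D | G) p(G); a real implementation would score the DAG
    against past evidence (e.g. with a BGe/BDeu marginal likelihood)."""
    return 1.0 / (1 + len(edges))  # toy prior favouring sparse graphs

def sample_graph():
    """Stochastically construct one causal graph; a trained GFlowNet would pick
    actions so that terminal graphs are sampled ∝ unnormalised_posterior."""
    edges = set()
    while True:
        action = random.choice(legal_actions(edges))
        if action == "stop":
            return edges, unnormalised_posterior(edges)
        edges.add(action)

print(sample_graph())
```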
GFlowNets seem more related to language generation and the contents of consciousness: humans do seem to generate these “randomly”. Humans come up with “stochastic” justifications for past events and actions when these are not stabilised in reference frames in their heads.
Another thing that Bengio suggests, “hypergraph sampling”, doesn’t feel neurobiologically plausible (or does it?), but it’s not clear whether Bengio proposes it as a neurobiological explanation or as an architecture for intelligence (in which case the fact that humans’ causal graphs are simple graphs, rather than hypergraphs, is just our inductive prior).
Extending Bengio’s idea of sampling from the Bayesian posterior over the space of causal graphs to linguistic explanations: constructing explanatory theories in service of constructing an action trajectory to achieve a certain goal (the pragmatic stance), sampled from the Bayesian posterior over the space of all possible theories (not only causal graphs, but also other sorts of theories, from formal theories written as closed-form equations attached to executable diagrams, to such unreliable “theories” as induction rules, heuristics, and intuitions), corresponds to epistemological pluralism, which, John Krakauer thinks, is also how the brain actually works: “There is pluralism in how the nervous system sees the control problem and the representations it uses, and that is the ontological truth of pluralism, and there is a mapping onto epistemological pluralism. This may well be the reason why we have psychology and neuroscience, and we have psychiatry and neurology.”
The selection of the explanatory stance (the perspective, the level of emergence) would be one of the core steps in constructing an explanatory theory for a pragmatic purpose. For instance, we can call either a psychiatrist or a neurologist when the goal is to diagnose and then cure a patient’s illness. So there cannot be one “right” perspective on any object. Instead, an intelligent agent always chooses the perspective most suitable (that is, minimising the expected free energy) for reaching a particular goal. There is also a normative imperative to improve the quality of these choices, which can amount to training in selecting from a set of coherent theories that already exist, as well as attempts to create and criticise new coherent explanatory theories.
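For concreteness, the quantity I have in mind is the usual expected free energy from the active-inference literature, in its risk-plus-ambiguity decomposition (my paraphrase; the “goal” enters through the preferred-outcome distribution p(o_\tau)):

```latex
% Expected free energy of policy \pi at a future time \tau:
% risk (expected divergence of predicted outcomes from preferred/goal outcomes)
% plus ambiguity (expected uncertainty of outcomes given states).
G(\pi, \tau) =
  \underbrace{D_{\mathrm{KL}}\!\left[\, q(o_\tau \mid \pi) \;\|\; p(o_\tau) \,\right]}_{\text{risk}}
  \;+\;
  \underbrace{\mathbb{E}_{q(s_\tau \mid \pi)}\!\left[\, \mathrm{H}\!\left[ p(o_\tau \mid s_\tau) \right] \,\right]}_{\text{ambiguity}}
```

On this reading, “choosing the perspective” would mean choosing the generative model under which the goal-directed policy scores the lowest G.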
To clarify: by a “verifier for explanations” I mostly mean something like a heuristic estimator as introduced in Formalizing the Presumption of Independence (or else something even further from formality that would fill a similar role).