This remains the best overview of the learning-theoretic agenda to date. As a complementary pedagogical resource, there is now also a series of video lectures.
Since the article was written, several new publications have appeared:
My paper on linear infra-Bayesian bandits.
An article on infra-Bayesian haggling by my MATS scholar Hanna Gabor. This approach to multi-agent systems did not exist when the overview was written, and currently seems like the most promising direction.
An article on time complexity in string machines by my MATS scholar Ali Cataltepe. This is a rather elegant method to account for time complexity in the formalism.
In addition, some new developments have been briefly summarized in shortform posts:
A proposed solution for the monotonicity problem in infra-Bayesian physicalism. This is potentially very important since the monotonicity problem was by far the biggest issue with the framework (and as a consequence, with PSI).
Multiple developments concerning metacognitive agents (see also recorded talk). This framework seems increasingly important, but an in-depth analysis is still pending.
A conjecture about a possible axiomatic characterization of the maximin decision rule in infra-Bayesianism. If true, it would go a long way toward allaying concerns about whether maximin is the “correct” choice.
Ambidistributions: a cute new mathematical gadget for formalizing the notion of “control” in infra-Bayesianism.
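For readers unfamiliar with the maximin rule mentioned above: an infra-Bayesian agent entertains a *set* of possible environments (a credal set) rather than a single prior, and maximin picks the policy with the best worst-case expected utility over that set. A minimal illustrative sketch, with a hypothetical payoff table (the environment and policy names are made up for illustration):

```python
# Hypothetical payoff table: expected_utility[environment][policy].
# Rows are environments in the credal set, columns are policies.
expected_utility = {
    "env_a": {"cautious": 3.0, "bold": 5.0},
    "env_b": {"cautious": 2.5, "bold": -1.0},
}

def maximin_policy(table):
    """Return the policy maximizing the minimum expected utility over environments."""
    policies = next(iter(table.values())).keys()
    return max(policies, key=lambda pi: min(row[pi] for row in table.values()))

# "cautious" guarantees at least 2.5, while "bold" risks -1.0,
# so maximin selects "cautious" despite "bold" having the higher best case.
print(maximin_policy(expected_utility))
```

The conjecture concerns axioms singling out this rule among alternatives (e.g. other decision rules over credal sets), not the computation itself, which is trivial in the finite case.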
Meanwhile, active research proceeds along several parallel directions:
I’m working towards the realization of the “frugal compositional languages” dream. So far, the problem is still very much open, but I obtained some interesting preliminary results which will appear in an upcoming paper (codename: “ambiguous online learning”). I also realized this direction might have tight connections with categorical systems theory (the latter being a mathematical language for compositionality). An unpublished draft was written by my MATS scholars on the subject of compositional polytope MDPs, hopefully to be completed sometime during ’25.
Diffractor achieved substantial progress in the theory of infra-Bayesian regret bounds, producing an infra-Bayesian generalization of decision-estimation coefficients (the latter being a nearly universal theory of regret bounds in episodic RL). This generalization has important connections to Garrabrant induction (of the flavor studied here), finally sketching a unified picture of these two approaches to “computational uncertainty” (Garrabrant induction and infra-Bayesianism). Results will appear in an upcoming paper.
Gergely Szucs is studying the theory of hidden rewards, starting from the realization in this short-form (discovering some interesting combinatorial objects beyond what was described there).
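For context on the decision-estimation coefficient mentioned above: in its classical (Bayesian) form due to Foster et al., the DEC of a model class $\mathcal{M}$ relative to a reference model $\bar{M}$ is (notation assumed here, not taken from the post):

```latex
\mathrm{dec}_\gamma(\mathcal{M}, \bar{M}) \;=\;
\inf_{p \in \Delta(\Pi)} \; \sup_{M \in \mathcal{M}} \;
\mathbb{E}_{\pi \sim p}\!\left[ f^M(\pi_M) - f^M(\pi)
\;-\; \gamma \, D_{\mathrm{H}}^2\!\big(M(\pi),\, \bar{M}(\pi)\big) \right]
```

where $f^M(\pi)$ is the expected value of decision $\pi$ under model $M$, $\pi_M$ is an optimal decision for $M$, and $D_{\mathrm{H}}^2$ is the squared Hellinger distance. Roughly, it trades off exploration cost (regret) against information gained (estimation error), and bounds on it translate into regret bounds. The infra-Bayesian generalization referred to above replaces the Bayesian ingredients with their infra-Bayesian counterparts; its precise form will appear in the upcoming paper.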
It remains true that there are more shovel-ready open problems than researchers, and hence the number of (competent) researchers is still the bottleneck.