Normalization-free Bayes: I was musing on Twitter about what the simplest possible still-correct
computable demonstration of Bayesian inference would be, one that even a
middle-schooler could implement & understand. My best candidate so far
is ABC Bayesian inference*: simulation + rejection, along with the
‘possible worlds’ interpretation.
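Concretely, that recipe can be sketched in a few lines of Python. The coin-bias setup, the discrete grid of hypotheses, and the coinflip() primitive are my own toy illustration of the simulation + rejection idea, not anything prescribed above:

```python
import random

def coinflip(p):
    """Simulate one flip of a coin with heads-probability p."""
    return random.random() < p

# Observed data: 7 heads in 10 flips.
observed_heads, n_flips = 7, 10

# 'Possible worlds': one candidate coin bias per world, uniform prior.
hypotheses = [i / 10 for i in range(11)]

def simulate(p):
    """Simulate a world with bias p and count its heads."""
    return sum(coinflip(p) for _ in range(n_flips))

# ABC rejection sampling: keep only the worlds whose simulated data
# exactly matches the observation; reject the rest.
accepted = []
for _ in range(100_000):
    p = random.choice(hypotheses)
    if simulate(p) == observed_heads:
        accepted.append(p)

# The surviving worlds approximate the posterior -- no Bayes formula,
# no likelihood arithmetic, just simulation and rejection.
posterior_mean = sum(accepted) / len(accepted)
```

With 7/10 heads and a uniform prior over the 11 biases, the surviving worlds cluster around 0.7, and their average approximates the exact posterior mean (≈0.67) without any probability calculation ever being written down.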
Someone noted that rejection sampling is simple but needs
normalization steps, which adds complexity back. I recalled that
somewhere on LW many years ago someone had a comment about a Bayesian
interpretation where you don’t need to renormalize after every
likelihood computation, and every hypothesis just decreases at
different rates; as strange as it sounds, it’s apparently formally equivalent to standard Bayesian updating. I thought the comment was by Wei Dai, but I can’t seem to refind
it: queries like ‘Wei Dai Bayesian decrease’ obviously pull up
way too many hits; it’s probably buried in an Open Thread
somewhere; asking on my Twitter didn’t help; and Wei Dai himself didn’t recall it when I asked. Does anyone remember this?
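To illustrate what such a normalization-free scheme could look like (this is my own reconstruction of the idea as described, not the lost comment itself): each hypothesis carries an unnormalized weight that is multiplied by the likelihood of each observation and never renormalized, so every weight only ever shrinks, at a rate set by how badly that hypothesis predicts the data. Ratios of weights recover the posterior whenever you actually need it:

```python
# Two hypothetical worlds: a fair coin and a 0.8-biased coin,
# each starting with unnormalized weight 1.0.
weights = {0.5: 1.0, 0.8: 1.0}

data = [1, 1, 0, 1, 1]  # observed flips: 1 = heads, 0 = tails

# Multiply each weight by the likelihood of each observation,
# never renormalizing: every weight monotonically decreases,
# just at different rates.
for flip in data:
    for p in weights:
        weights[p] *= p if flip else (1 - p)

# The posterior is recovered from ratios only when needed:
total = sum(weights.values())
posterior = {p: w / total for p, w in weights.items()}
```

Since normalization is a single division deferred to the end (or skipped entirely if you only care about which hypothesis is ahead), this is formally the same as renormalizing after every observation.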
* I’ve made a point of using ABC in some analyses simply because it amuses me that something so simple still works, even when I’m sure I could’ve found a much faster MCMC or VI solution with some more work.
Incidentally, I’m wondering if the ABC simplification can be taken further to cover subjective Bayesian decision theory as well: if you have sets of possible
worlds/hypotheses, let’s say discrete for convenience, and you do only
penalty updates as rejection sampling of worlds that don’t match the
current observation (like AIXI), can you then implement decision
theory directly by defining a loss function and minimizing it over actions? In
which case you can get Bayesian decision theory without probabilities,
calculus, MCMC, VI, conjugacy formulas falling from heaven, or anything more complicated than a list of
numbers and a few computational primitives like coinflip() (and then a ton of computing power to brute-force the exact ABC/rejection sampling).
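A sketch of how that might work, again using my own toy coin setup (the actions, loss function, and specifics are hypothetical illustrations, not anything fixed by the argument): rejection-filter the worlds against the observation, then score each candidate action by summing its loss across the surviving worlds. Because worlds survive in proportion to their posterior mass, the raw sum already weighs each world correctly, so no probabilities are ever computed:

```python
import random

def coinflip(p):
    """Simulate one flip of a coin with heads-probability p."""
    return random.random() < p

# Observed data: 7 heads in 10 flips.
observed_heads, n_flips = 7, 10

# Discrete set of possible worlds: candidate coin biases.
hypotheses = [i / 10 for i in range(11)]

# Step 1: penalty updates as rejection sampling -- throw away every
# sampled world whose simulation doesn't match the observation.
accepted = []
for _ in range(50_000):
    p = random.choice(hypotheses)
    if sum(coinflip(p) for _ in range(n_flips)) == observed_heads:
        accepted.append(p)

# Step 2: decision theory by brute force. Actions: bet 'heads' or
# 'tails' on the next flip; loss is 1 for a wrong bet, 0 otherwise.
def total_loss(action):
    loss = 0
    for p in accepted:
        next_flip = coinflip(p)  # simulate the next flip in that world
        loss += (action == 'heads') != next_flip
    return loss

best_action = min(['heads', 'tails'], key=total_loss)
```

Nothing here is more complicated than lists of numbers, coinflip(), counting, and a minimum, and the chosen action converges on the Bayes-optimal one as the compute budget grows.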
Doing another search, it seems I made at least one comment that is somewhat relevant, although it might not be what you’re thinking of: https://www.greaterwrong.com/posts/5bd75cc58225bf06703751b2/in-memoryless-cartesian-environments-every-udt-policy-is-a-cdt-sia-policy/comment/kuY5LagQKgnuPTPYZ
Funny that you have your own great LessWrong whale as I do, and that you likewise recall it as possibly being by Wei Dai (while he himself doesn’t recall it):
https://www.lesswrong.com/posts/X4nYiTLGxAkR2KLAP/?commentId=nS9vvTiDLZYow2KSK