Normalization-free Bayes: I was musing on Twitter about what the simplest possible still-correct
computable demonstration of Bayesian inference would be, one that even a
middle-schooler could implement & understand. My best candidate so far
is ABC Bayesian inference*: simulation + rejection, along with the
‘possible worlds’ interpretation.
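Concretely, that recipe can be sketched in a few lines of Python. The coin-bias setup, the discrete grid of hypotheses, and the coinflip() primitive are my own toy illustration of the simulation + rejection idea, not anything prescribed above:

```python
import random

def coinflip(p):
    """Simulate one flip of a coin with heads-probability p."""
    return random.random() < p

# Observed data: 7 heads in 10 flips.
observed_heads, n_flips = 7, 10

# 'Possible worlds': one candidate coin bias per world, uniform prior.
hypotheses = [i / 10 for i in range(11)]

def simulate(p):
    """Simulate a world with bias p and count its heads."""
    return sum(coinflip(p) for _ in range(n_flips))

# ABC rejection sampling: keep only the worlds whose simulated data
# exactly matches the observation; reject the rest.
accepted = []
for _ in range(100_000):
    p = random.choice(hypotheses)
    if simulate(p) == observed_heads:
        accepted.append(p)

# The surviving worlds approximate the posterior -- no Bayes formula,
# no likelihood arithmetic, just simulation and rejection.
posterior_mean = sum(accepted) / len(accepted)
```

With 7/10 heads and a uniform prior over the 11 biases, the surviving worlds cluster around 0.7, and their average approximates the exact posterior mean (≈0.67) without any probability calculation ever being written down.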
Someone noted that rejection sampling is simple but needs
normalization steps, which adds complexity back. I recalled that
somewhere on LW many years ago someone had a comment about a Bayesian
interpretation where you don’t need to renormalize after every
likelihood computation, and every hypothesis just decreases at
different rates; as strange as it sounds, it’s apparently formally equivalent to standard Bayesian updating. I thought the comment was by Wei Dai, but I can’t seem to refind
it: queries like ‘Wei Dai Bayesian decrease’ obviously pull up
way too many hits; it’s probably buried in an Open Thread
somewhere; asking on my Twitter didn’t help; and Wei Dai himself didn’t recall it when I asked. Does anyone remember this?
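To illustrate what such a normalization-free scheme could look like (this is my own reconstruction of the idea as described, not the lost comment itself): each hypothesis carries an unnormalized weight that is multiplied by the likelihood of each observation and never renormalized, so every weight only ever shrinks, at a rate set by how badly that hypothesis predicts the data. Ratios of weights recover the posterior whenever you actually need it:

```python
# Two hypothetical worlds: a fair coin and a 0.8-biased coin,
# each starting with unnormalized weight 1.0.
weights = {0.5: 1.0, 0.8: 1.0}

data = [1, 1, 0, 1, 1]  # observed flips: 1 = heads, 0 = tails

# Multiply each weight by the likelihood of each observation,
# never renormalizing: every weight monotonically decreases,
# just at different rates.
for flip in data:
    for p in weights:
        weights[p] *= p if flip else (1 - p)

# The posterior is recovered from ratios only when needed:
total = sum(weights.values())
posterior = {p: w / total for p, w in weights.items()}
```

Since normalization is a single division deferred to the end (or skipped entirely if you only care about which hypothesis is ahead), this is formally the same as renormalizing after every observation.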
* I’ve made a point of using ABC in some analyses simply because it amuses me that something so simple still works, even when I’m sure I could’ve found a much faster MCMC or VI solution with some more work.
Incidentally, I’m wondering if the ABC simplification can be taken further to cover subjective Bayesian decision theory as well: if you have sets of possible
worlds/hypotheses, let’s say discrete for convenience, and you do only
penalty updates as rejection sampling of worlds that don’t match the
current observation (like AIXI), can you then implement decision
theory directly by defining a loss function and minimizing it over actions? In
which case you can get Bayesian decision theory without probabilities,
calculus, MCMC, VI, conjugacy formulas falling from heaven, or anything more complicated than a list of
numbers and a few computational primitives like coinflip() (and then a ton of computing power to brute-force the exact ABC/rejection sampling).
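A sketch of how that might work, again using my own toy coin setup (the actions, loss function, and specifics are hypothetical illustrations, not anything fixed by the argument): rejection-filter the worlds against the observation, then score each candidate action by summing its loss across the surviving worlds. Because worlds survive in proportion to their posterior mass, the raw sum already weighs each world correctly, so no probabilities are ever computed:

```python
import random

def coinflip(p):
    """Simulate one flip of a coin with heads-probability p."""
    return random.random() < p

# Observed data: 7 heads in 10 flips.
observed_heads, n_flips = 7, 10

# Discrete set of possible worlds: candidate coin biases.
hypotheses = [i / 10 for i in range(11)]

# Step 1: penalty updates as rejection sampling -- throw away every
# sampled world whose simulation doesn't match the observation.
accepted = []
for _ in range(50_000):
    p = random.choice(hypotheses)
    if sum(coinflip(p) for _ in range(n_flips)) == observed_heads:
        accepted.append(p)

# Step 2: decision theory by brute force. Actions: bet 'heads' or
# 'tails' on the next flip; loss is 1 for a wrong bet, 0 otherwise.
def total_loss(action):
    loss = 0
    for p in accepted:
        next_flip = coinflip(p)  # simulate the next flip in that world
        loss += (action == 'heads') != next_flip
    return loss

best_action = min(['heads', 'tails'], key=total_loss)
```

Nothing here is more complicated than lists of numbers, coinflip(), counting, and a minimum, and the chosen action converges on the Bayes-optimal one as the compute budget grows.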
Doing another search, it seems I made at least one comment that is somewhat relevant, although it might not be what you’re thinking of: https://www.greaterwrong.com/posts/5bd75cc58225bf06703751b2/in-memoryless-cartesian-environments-every-udt-policy-is-a-cdt-sia-policy/comment/kuY5LagQKgnuPTPYZ
Funny that you have your own great LessWrong whale as I do, and that you likewise recall it as possibly being by Wei Dai (while he himself doesn’t recall it):
https://www.lesswrong.com/posts/X4nYiTLGxAkR2KLAP/?commentId=nS9vvTiDLZYow2KSK