Thomas Kwa comments on How We Picture Bayesian Agents

Thomas Kwa 9 Apr 2024 17:53 UTC
5 points
I don’t need to calculate all that, in order to make an expected-utility-maximizing lunch order. I just need to calculate the difference between the utility which I expect if I order lamb Karahi vs a sisig burrito.
… and since my expectations for most of the world are the same under those two options, I should be able to calculate the difference lazily, without having to query most of my world model. Much like the message-passing update, I expect deltas to quickly fall off to zero as things propagate through the model.
This is an exciting observation. I wonder if you could empirically demonstrate that this works in a model-based RL setup, on a video game or something?
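To make the quoted claim concrete, here is a toy sketch in Python. It is not an RL experiment and not the post's own method. Everything in it is an assumption made for illustration: the world model is a chain of binary variables with a single shared transition matrix `T`, the action only sets the distribution of the first variable, utility is a sum of per-node terms, and the names `step`, `full_expected_utility`, and `lazy_utility_difference` are hypothetical. Because the update is linear, the difference between the two options' marginals can be propagated directly, and propagation stops once that difference falls below a tolerance.

```python
def step(p, T):
    """Push a (possibly signed) 2-vector one node down the chain.

    T[i][j] is the assumed probability of next-state j given current-state i.
    The map is linear, so it applies to differences of distributions too.
    """
    return (p[0] * T[0][0] + p[1] * T[1][0],
            p[0] * T[0][1] + p[1] * T[1][1])


def full_expected_utility(p0, T, utils):
    """Expected total utility, querying every node in the model."""
    p, total = p0, 0.0
    for u in utils:
        total += p[0] * u[0] + p[1] * u[1]
        p = step(p, T)
    return total


def lazy_utility_difference(p0_a, p0_b, T, utils, tol=1e-9):
    """Difference in expected utility between options a and b.

    Propagates only the delta between the two marginals and stops as soon
    as it is below tol. Returns (difference, number of nodes visited).
    """
    delta = (p0_a[0] - p0_b[0], p0_a[1] - p0_b[1])
    diff, visited = 0.0, 0
    for u in utils:
        if max(abs(delta[0]), abs(delta[1])) < tol:
            break  # downstream expectations are (nearly) identical
        diff += delta[0] * u[0] + delta[1] * u[1]
        delta = step(delta, T)
        visited += 1
    return diff, visited


# Hypothetical numbers: a 1000-node chain with arbitrary bounded utilities.
T = [[0.7, 0.3], [0.4, 0.6]]
utils = [((i % 7) - 3.0, (i % 5) * 0.5) for i in range(1000)]
lamb, burrito = (0.9, 0.1), (0.1, 0.9)

exact = (full_expected_utility(lamb, T, utils)
         - full_expected_utility(burrito, T, utils))
lazy, visited = lazy_utility_difference(lamb, burrito, T, utils)
print(f"exact={exact:.6f} lazy={lazy:.6f} nodes visited={visited}")
```

In this toy the delta shrinks geometrically with each step, so the lazy version visits only a small prefix of the chain. Whether deltas fall off this quickly in a learned world model is exactly the empirical question.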