Author of meaningness.com, vividness.live, and other things.
MIT AI PhD, successful biotech entrepreneur, and other things.
I’m sure you know more about this than I do! Based on a quick Wiki check, I suspect that formally the A_p are one type of hyperprior, but not all hyperpriors are A_p (a/k/a metaprobabilities).
Hyperparameters are used in Bayesian sensitivity analysis, a/k/a “Robust Bayesian Analysis”, which I recently accidentally reinvented here. I might write more about that later in this sequence.
It may be helpful to read some related posts (linked by lukeprog in a comment on this post): Estimate stability, and Model Stability in Intervention Assessment, which comments on Why We Can’t Take Expected Value Estimates Literally (Even When They’re Unbiased). The first of those motivates the A_p (meta-probability) approach, the second uses it, and the third explains intuitively why it’s important in practice.
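To make the A_p (meta-probability) idea concrete, here is a minimal sketch in my own toy framing (not Jaynes' notation, and not the exact numbers from the post): treat A_p as a weighted set of hypotheses about the "true" payout probability p. Two agents can then share the same marginal probability of payout while holding very different meta-distributions, and they update very differently on a single observation.

```python
def marginal(hypotheses):
    """Marginal P(payout) = sum over hypotheses of weight * p."""
    return sum(w * p for p, w in hypotheses)

def update(hypotheses, paid):
    """Bayesian update of the meta-distribution after one observed trial."""
    posterior = [(p, w * (p if paid else 1 - p)) for p, w in hypotheses]
    total = sum(w for _, w in posterior)
    return [(p, w / total) for p, w in posterior]

# Agent A is certain that p = 0.5.
A = [(0.5, 1.0)]
# Agent B thinks p is either 0 or 1, with equal prior weight.
B = [(0.0, 0.5), (1.0, 0.5)]

print(marginal(A), marginal(B))  # same marginal probability: 0.5 and 0.5

# One observed payout leaves A unchanged but collapses B onto p = 1.
print(marginal(update(A, paid=True)))  # still 0.5
print(marginal(update(B, paid=True)))  # now 1.0
```

The point of the sketch: the marginal probabilities coincide, but the meta-distributions carry extra information about how future evidence will move you.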
Jeremy, I think the apparent disagreement here is due to unclarity about what the point of my argument was. The point was not that this situation can’t be analyzed with decision theory; it certainly can, and I did so. The point is that different decisions have to be made in two situations where the probabilities are the same.
Your discussion seems to equate “probability” with “utility”, and the whole point of the example is that, in this case, they are not the same.
Thanks, Jonathan, yes, that’s how I understand it.
Jaynes’ discussion motivates A_p as an efficiency hack that allows you to save memory by forgetting some details. That’s cool, although not the point I’m trying to make here.
Luke, thank you for these pointers! I’ve read some of them, and have the rest open in tabs to read soon.
Jeremy, thank you for this. To be clear, I wasn’t suggesting that meta-probability is the solution. It’s a solution. I chose it because I plan to use this framework in later articles, where it will (I hope) be particularly illuminating.
I would take issue with the first section of this article in which you establish single probability (expected utility) calculations as insufficient for the problem.
I don’t think it’s correct to equate probability with expected utility, as you seem to do here. The probability of a payout is the same in the two situations. The point of this example is that the probability of a particular event does not determine the optimal strategy. Because utility is dependent on your strategy, that also differs.
This problem easily succumbs to standard expected value calculations if all actions are considered.
Yes, absolutely! I chose a particularly simple problem, in which the correct decision-theoretic analysis is obvious, in order to show that probability does not always determine optimal strategy. In this case, the optimal strategies are clear (except for the exact stopping condition), and clearly different, even though the probabilities are the same.
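The arithmetic behind "same probability, different optimal strategy" can be sketched with toy numbers of my own choosing (not the original post's exact machines): each play costs $1 and pays $2 on success, with up to N plays allowed.

```python
# Machine A: every play succeeds independently with known p = 0.5.
# Machine B: the machine is either a sure payer (p = 1) or a dud (p = 0),
# each with prior 0.5 -- so P(first payout) is 0.5 on both machines.
N = 10

# Strategy "always play N times": expected profit per play = 2*P(pay) - 1 = 0.
ev_A_always = N * (2 * 0.5 - 1)                              # 0 on machine A
ev_B_always = 0.5 * N * (2 * 1 - 1) + 0.5 * N * (2 * 0 - 1)  # also 0 on B

# Strategy "play once, continue only if it paid", on machine B:
# with prob 0.5 it's a payer: win $1 net, then N-1 more profitable plays;
# with prob 0.5 it's a dud: lose $1 and stop.
ev_B_test = 0.5 * (1 + (N - 1) * 1) + 0.5 * (-1)  # = 0.5*N - 0.5

print(ev_A_always, ev_B_always, ev_B_test)
```

On machine A the test-then-stop strategy gains nothing, because the first outcome carries no information about later plays; on machine B it turns an expected profit of zero into a positive one. Same payout probability, different optimal strategy.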
I’m using this as an introductory wedge example. I’ve opened a Pandora’s Box: probability by itself is not a fully adequate account of rationality. Many odd things will leap and creep out of that box so long as we leave it open.
Hi!
I’ve been interested in how to think well since early childhood. When I was about ten, I read a book about cybernetics. (This was in the Oligocene, when “cybernetics” had only recently gone extinct.) It gave simple introductions to probability theory, game theory, information theory, boolean switching logic, control theory, and neural networks. This was definitely the coolest stuff ever.
I went on to MIT, and got an undergraduate degree in math, specializing in mathematical logic and the theory of computation—fields that grew out of philosophical investigations of rationality.
Then I did a PhD at the MIT AI Lab, continuing my interest in what thinking is. My work there seems to have been turned into a surrealistic novel by Ken Wilber, a woo-ish pop philosopher. Along the way, I studied a variety of other fields that give diverse insights into thinking, ranging from developmental psychology to ethnomethodology to existential phenomenology.
I became aware of LW gradually over the past few years, mainly through mentions by people I follow on Twitter. As a lurker, there’s a lot about the LW community I’ve loved. On the other hand, I think some fundamental, generally-accepted ideas here are limited and misleading. I began considering writing about that recently, and posted some musings about whether and how it might be useful to address these misconceptions. (This was perhaps ruder than it ought to have been.) It prompted a reply post from Yvain, and much discussion on both his site and mine.
I followed that up with a more constructive post on aspects of how to think well that LW generally overlooks. In comments on that post, several frequent LW contributors encouraged me to re-post that material here. I may yet do that!
For now, though, I’ve started a sequence of LW articles on the difference between uncertainty and probability. Missing this distinction seems to underlie many of the ways I find LW thinking limited. Currently my outline for the sequence has seven articles, covering technical explanations of this difference, with various illustrations; the consequences of overlooking the distinction; and ways of dealing with uncertainty when probability theory is unhelpful.
(Kaj Sotala has suggested that I ask for upvotes on this self-introduction, so I can accumulate enough karma to move the articles from Discussion to Main. I wouldn’t have thought to ask that myself, but he seems to know what he’s doing here! :-)
O&BTW, I also write about contemporary trends in Buddhism, on several web sites, including a serial, philosophical, tantric Buddhist vampire romance novel.
Can you recommend an explanation of the complete class theorem(s)? Preferably online. I’ve been googling pretty hard and I’ve turned up almost nothing. I’d like to understand what conditions they start from (suspecting that maybe the result is not quite as strong as “Bayes Rules!”). I’ve found only one paper, which basically said “what Wald proved is extremely difficult to understand, and probably not what you wanted.”
Thank you very much!
A collection of collections of advice for graduate students! http://vlsicad.ucsd.edu/Research/Advice/
A collection of advice for graduate students I put together some time ago: http://www.cs.indiana.edu/mit.research.how.to.html
It was meant specifically for people at the MIT AI Lab, but much of it is applicable to other STEM fields.
Regarding the development of agreeableness/empathy: there are meditation techniques specifically intended to do this. (They are variously called “Metta”, “Lojong”, “Tonglen”, or (yuck) “loving kindness meditation”, all of which are pretty similar.) These originate in Mahayana Buddhism, but don’t have any specifically religious content. They are often taught in conjunction with mindfulness meditation.
I don’t know whether there have been any serious studies on these methods, but anecdotally they are highly effective. They seem to develop not only empathy but also personal happiness (although that is not a stated goal). Generally, the serious studies that have been done on different meditation techniques have found that they work as advertised...
Yes, I see your point (although I don’t altogether agree). But, again, what I’m doing here is setting up analytical apparatus that will be helpful for more difficult cases later.
In the meantime, the LW posts I pointed to here may more strongly motivate the claim that probability alone is an insufficient guide to action.