I’ve started cleaning up and posting some old drafts on my blog (“Exposition and guidance by analogy,” “Unfair outcomes from fair tests,” “Thinking on the page”). I’ve drifted away, but some of them may be of interest to people still here. Most directly up this alley so far would be this post recommending people read Trial By Mathematics.
I like this post. I lean towards skepticism about the usefulness of calibration or even accuracy, but I’m glad to find myself mostly in agreement here.
For lots of (to me) practical situations, a little bit of uncertainty goes a long way in how I actually decide what to do. It doesn’t really matter how much uncertainty there is, or how well I can estimate it; it’s better for me to just be generally humble and make contingency plans. It’s also easy to imagine that being well-calibrated (or knowing that you are) could, if you’re not careful, demolish biases that actually protect you against bad outcomes. If you are careful, sure, there are possible benefits, but they seem modest.
But making and testing predictions seems more than modestly useful, whether or not you get better (or better calibrated) over time. I find I learn better (testing effect!) and I’m more likely to notice surprising things. And it’s an easy way to lampshade certain thoughts/decisions so that I put more effort into them. Basically, this:
Or in other words: the biggest problem with your predictions right now is that they don’t exist.
To be more concrete, a while back I ran a self-experiment on quantitative calibration for time-tracking/planning (your point #1). The idea was to get a baseline by making and resolving predictions without any feedback for a few weeks (i.e. I didn’t know how well I was doing, and since I made predictions in batches I usually couldn’t remember them well enough to game my prediction “deadlines”). Then I’d start looking at calibration curves and so on to see whether feedback improved my predictions, in general or in particular domains. It turned out after the first stage that I was already well-calibrated enough that I couldn’t have measured any interesting changes without an impractical number of predictions. But while it lasted, I got a moderate boost in productivity just from knowing I had a clock ticking, plus more effective planning from the way predictions forced me to think about contingencies. (I stopped the experiment because it was tedious, but I upped the frequency of predictions I make habitually.)
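For anyone tempted to try this, here’s a minimal sketch of the bookkeeping involved: binning resolved predictions into a rough calibration curve. The data and the 10% bin width are invented for illustration; this isn’t the exact setup from my experiment.

```python
# Minimal calibration-curve sketch. Each entry is (stated probability, outcome).
# The data here is made up for illustration.
from collections import defaultdict

predictions = [
    (0.9, True), (0.9, True), (0.8, False), (0.7, True),
    (0.6, True), (0.6, False), (0.5, False), (0.3, False),
]

bins = defaultdict(lambda: [0, 0])  # bucket -> [number resolved, number true]

for p, came_true in predictions:
    bucket = round(p, 1)  # group into 10% buckets
    bins[bucket][0] += 1
    bins[bucket][1] += came_true

for bucket in sorted(bins):
    total, hits = bins[bucket]
    print(f"stated ~{bucket:.0%}: {hits}/{total} came true ({hits / total:.0%} observed)")
```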
If I can introduce a problem domain that doesn’t get a lot of play in these communities but (I think) should:
End-of-life healthcare in the US seems like a huge problem (in terms of cost, honored preferences, and quality of life for many people) that’s relatively tractable for its size. The balance probably falls in favor of making things happen rather than researching technical questions, but I’m hoping it still belongs here.
There’s a recent IOM report that covers the presently bleak state of affairs and potential ways forward pretty thoroughly. One major problem is that doctors don’t know their patients’ care preferences, resulting in a bias towards acute care over palliative care, which in turn leads to unpleasant (and expensive) final years. There are a lot of different levers in actual care practices, advance care planning, professional education/development, insurance policies, and public education. I might start with the key findings and recommendations (PDF) and think about where to go from there. There’s also Atul Gawande’s recent book Being Mortal, which I’ve yet to read but people seem excited about. Maybe look at what organizations like MyDirectives and Goals of Care are doing.
This domain probably has a relative advantage in belief- or value-alignment for people who think widely available anti-aging is far in the future or undesirable, although I’m tempted to argue that in a world with normalized life extension, the norms surrounding end-of-life care become even more important. The problem might also be unusually salient from some utilitarian perspectives. And while I’ve never been sure what civilizational inadequacy means, people interested in it might be easier to sell on fixing end-of-life care.
You can predict how long tasks/projects will take you (stopwatch and/or calendar time). Even if calibration doesn’t generalize, it’s potentially useful on its own there. And while you can’t quite mass-produce questions/predictions, it’s not such a hassle to rack up a lot if you do them in batches. Malcolm Ocean wrote about doing this with a spreadsheet, and I threw together an Android todo-with-predictions app for a similar self-experiment.
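For illustration, a batch log of task-duration predictions can be as simple as the sketch below. The tasks and numbers are invented; this isn’t Malcolm Ocean’s spreadsheet or my app, just a minimal stand-in.

```python
# Minimal sketch of a task-duration prediction log (data invented for illustration).
# Each entry: (task, predicted minutes, actual minutes).
log = [
    ("draft blog post", 60, 95),
    ("reply to email backlog", 20, 15),
    ("fix website analytics bug", 45, 120),
]

for task, predicted, actual in log:
    print(f"{task}: predicted {predicted} min, took {actual} min "
          f"(x{actual / predicted:.1f})")

# One crude summary: how far off are the estimates on average?
ratios = [actual / predicted for _, predicted, actual in log]
print(f"mean actual/predicted ratio: {sum(ratios) / len(ratios):.2f}")
```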
I measured science and technology output per scientist using four different lists of significant advances, and found that significant advances per scientist declined by 3 to 4 orders of magnitude from 1800 to 2000. During that time, the number of scientific journals has increased by 3 to 4 orders of magnitude, and a reasonable guess is that so did the number of scientists.
I’d be really interested in reading more about this.
Yeah, that happened when I edited a different part from my phone. Thanks, fixed.
See this tumblr post for an example of Ozy expressing dissatisfaction with Scott’s lack of charity in his analysis of SJ (specifically in the “Words, Words, Words” post). My impression is that this is a fairly regular occurrence.
You might be right about him not having updated. If anything it seems that his updates on the earlier superweapons discussion have been reverted. I’m not sure I’ve seen anything comparably charitable from him on the subject since. I don’t follow his thoughts on feminism particularly closely, so I could easily be wrong (and would be glad to find I’m wrong here).
I wrote down a handful as I was doing this, but not all of them. There were a couple about navigation (where rather than say “well, I don’t know where I am, I’ll just trust the group” I figured out how I was confused about different positions of landmarks). I avoided overbaking my cookies when the recipe had the wrong time written down. Analytics for a site I run pointed to a recent change causing problems for some people, and I saw the (slight) pattern right away but ignored it until it got caught on my confusion hook. It’s also a nice hook for asking questions in casual conversations. People are happy to explain why they like author X but not the superficially similar author Y I’ve heard them complain about before, for example.
Thanks, I’m glad you liked it!
Did someone link this recently? It seems to have gotten a new burst of votes.
There are concept inventories in a lot of fields, but these vary in quality and usefulness. The best known is the Force Concept Inventory for first-semester mechanics, which basically aims to test how Aristotelian/Newtonian a student’s thinking is. Any physicist can point out a dozen problems with it, but it seems to very roughly measure what it claims to measure.
Russ Roberts (host of the podcast EconTalk) likes to talk about the “economic way of thinking” and has written and gathered links about ten key ideas like incentives, markets, externalities, etc. But he’s relatively libertarian, so the ideas he chose and his exposition will probably not provide a very complete picture. Anyway, EconTalk has started asking discussion questions after each podcast, some of which aim to test basic understanding along these lines.
If anyone has already written similar posts, I would really appreciate any links.
Off the top of my head, Swimmer963 wrote about her experiences trying meditation, and I wrote about trying to notice confusion better. Gwern has run more serious self-experiments, and he talks about a bunch of them in the context of value of information here.
I find this idea (or a close relative) a useful guide for resolving a heuristic explanation or judgment into a detailed, causal explanation or consequentialist judgment. If someone draws me an engine cycle that creates infinite work out of finite heat (Question 5), I can say it violates the laws of thermodynamics. Of course their engine really is impossible. But there’s still confusion: our explanations remain in tension because something’s left unexplained. To fully resolve this confusion, I have to look in detail at their engine cycle and find the error that allows the violation.
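To spell out the bound being appealed to (standard first-law bookkeeping, not anything specific to the engine in question): over one full cycle the working fluid returns to its initial state, so the work out is limited by the heat in.

```latex
% Over one complete cycle the internal energy returns to its starting value,
% so \Delta U = 0. With Q_in the heat absorbed, Q_out >= 0 the heat rejected,
% and W the work done by the engine, the first law gives
\Delta U = Q_{\mathrm{in}} - Q_{\mathrm{out}} - W = 0
\quad\Longrightarrow\quad
W = Q_{\mathrm{in}} - Q_{\mathrm{out}} \le Q_{\mathrm{in}},
% so a finite heat input can never yield unbounded work, whatever the cycle's details.
```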
Principled explanations, especially about human behavior or society, tend to come into tension in a similar way. That tension can similarly point the way to detailed, causal explanations that will dissolve the question. For example, you say that an idea meeting a counter-idea may well fail to generate facts, which is contrary to your understanding of dialectics. It’s not very useful to merely state these ideas in opposition to each other, but there’s something to be learned by looking at where they conflict and why.
So in this case, where you doubt that this process generates facts, consider how it might or might not reliably do so. One way it could do so is if there were a recipe for turning the conflict into an opportunity for learning, like “look for detailed causal mechanisms where the two big ideas directly conflict.” One way it might fail is if the people holding each of the two ideas entrenched themselves in opposition to the other, and everyone simply continued to talk past one another without trying to understand. Now you’ve refined your heuristic so you can better judge how well this will work in individual cases, and you can iterate.
I think of the moral version of this as a generalization of the argument from marginal cases against giving moral standing to humans alone (i.e. that there’s no value-relevant principle that selects all and only humans). The generalization is to come at this from both sides of a debate, and say that you can expect any principled judgment to fail on marginal cases. The content of your principle is in large part how it treats those marginal cases. From this perspective, you study the marginal cases to improve your understanding of your values, rather than try to use heuristics to decide the marginal cases. (Sometimes this perspective is useful, and sometimes it’s not. Hmm, why is that?)
Yes, that’s a good example, thanks.
I’ve collected some quotes from Beyond Discovery, a series of articles commissioned by the National Academy of Sciences from 1997 to 2003 on paths from basic research to useful technology. My comments there:
The articles (each around 8 pages) are roughly popular-magazine-level accounts of variable quality, but I learned quite a bit from all of them, particularly from the biology and medicine articles. They’re very well written, generally with input from the relevant scientists still living (many of them Nobel laureates). In particular I like the broad view of history, the acknowledged scope of the many branches leading to any particular technology, the variety of topics outside the usual suspects, the focus on fairly recent technology, and the emphasis (bordering on propagandistic) on the importance and unpredictability of basic research. It seems to me that they filled an important gap in popular science writing in this way.
I’m interested in histories of science that are nonstandard in those and other ways (for example, those with an unusual focus on failures or dead ends), and I’m slowly collecting some additional notes and links at the bottom of that page. Do you have any recommendations? Or other comments?
Nice. I’d note that the role of “feminists” in this myth-making is somewhat analogous (though obviously not perfectly so) to the role of “the medical establishment” in promulgating the idea of bicycle face in the first place.
Theory also influences what data you consider in the first place. (Are you looking at your own local weather, global surface temperatures, stratospheric temperatures, ocean temperatures, extreme weather events, Martian climate, polar ice, or the beliefs and behavior of climatologists, and over what time scales and eras?) See also philosophy of science since at least Kuhn on theory-laden observation: http://plato.stanford.edu/entries/science-theory-observation/
I appreciate this perspective! My first instinct is to zoom out from stock phrases to entire ideas or arguments while drafting (when everything is working well, sentences or paragraphs get translated atomically like this), then use ‘close reading’ as an editing tactic. But you’re right that zooming in to find the exact word when stuck on the page can also be very focusing (as it were). And there’s a lot of room for interplay between the two approaches, insofar as there’s even a clean separation between self-expression and self-editing in the first place.