There seems to be a lot of focus here on MIRI giving good signals to outsiders. The “publish or perish” treadmill of academia is exactly why privately funded organizations like MIRI are needed.
The things that su3su2u1 wants MIRI to be already exist in academia. The whole point of MIRI is to create an organization of a type that doesn’t currently exist, focused on much longer term goals. If you measure organizations on the basis of how many publications they make, you’re going to get a lot of low-quality publications. Citations are only slightly better, especially if you’re focused on ignored areas of research.
If you have outside-view criticisms of an organization and you’re suddenly put in charge of it, the first thing you have to do is check the new inside-view information available and see what’s really going on.
Ever since I started hanging out on LW and working on UDT-ish math, I’ve been telling SIAI/MIRI folks that they should focus on public research output above all else. (Eliezer’s attitude back then was the complete opposite.) Eventually Luke came around to that point of view, and things started to change. But that took, like, five years of persuasion from me and other folks.
After reading su3su2u1′s post, I feel that growing closer to academia is another obviously good step. It’ll happen eventually, if MIRI is to have an impact. Why wait another five years to start? Why not start now?
The whole point of MIRI is to create an organization of a type that doesn’t currently exist, focused on much longer term goals. If you measure organizations on the basis of how many publications they make, you’re going to get a lot of low-quality publications. Citations are only slightly better, especially if you’re focused on ignored areas of research.
Just because MIRI researchers’ incentives aren’t distorted by “publish or perish” culture doesn’t mean they aren’t distorted by other things, especially things associated with a lack of feedback and accountability.
If MIRI doesn’t publish reasonably frequently (via peer review), how do you know they aren’t wasting donor money? Donors can’t evaluate their stuff themselves, and MIRI doesn’t seem to submit a lot of stuff to peer review.
How do you know they aren’t just living it up in a very expensive part of the country, doing the equivalent of freshman philosophizing in front of the whiteboard? The way you usually know is via peer review—e.g. other people previously declared to have produced good things declare that MIRI produces good things.
If MIRI doesn’t publish reasonably frequently (via peer review), how do you know they aren’t wasting donor money?
How did science get done for the centuries before peer review? Why do you place such weight on such a recently invented construct as peer review (you may remember Einstein being so enraged by the first and only time he tried out this new thing called ‘peer review’ that he vowed to never again submit anything to a ‘peer reviewed’ journal), a construct which routinely fails anytime it’s evaluated and has been shown to be so unreliable that the same paper can be accepted or rejected based on chance? If peer review is so good, why do so many terrible papers get published and great Nobel-prize-winning work get rejected repeatedly? If peer review is such an effective method of divining quality, why do many communities seem to get along fine with desultory use of peer review, where it’s barely used or left as the final step long after the results have been disseminated and evaluated, and people don’t even bother to read the final peer-reviewed version? (Particularly in economics, I get the impression that everyone reads the preprints & working papers and the final publication comes as a non-event; this has caused me serious trouble in the past in trying to figure out what to cite and whether one cite is the same as another. And of course, I’m not always clear on where various statistics or machine learning papers get published, or whether they are published in any sense beyond being posted to arXiv.) And why do all the real criticism and debate and refutations seem to take place on blogs & Twitter, if peer review is such an acid test of whether papers are gold or dross, leading to the growing need for altmetrics and other ways of dealing with the ‘post-publication peer review’ problem as journals increasingly fail to reflect where scientific debates actually are?
I’ve said it before and I’ll say it again: ‘peer review’ is not a core element of science. It’s barely even peripheral, and it’s unclear whether it adds anything on net. For the most part, calls for ‘peer review’ are cargo culting. What makes science work is replication and putting your work out there for community evaluation. Those are the real review by peers.
If you are a donor who wants to evaluate MIRI, whether some arbitrary reviewers pass or fail its papers is not very important. There are better measures of impact: is anyone building on their work? have MIRI-specific claims begun filtering out? are non-affiliated academics starting to move into the AI risk field? Heck, even citation counts would probably be better here.
How did science get done for the centuries before peer review? Why do you place such weight on such a recently invented construct as peer review
Is this an “arguments as soldiers” thing? Compare an isomorphic argument: “how did medicine get done for the centuries before antibiotics.”
(you may remember Einstein being so enraged by the first and only time he tried out this new thing called ‘peer review’ that he vowed to never again submit anything to a ‘peer reviewed’ journal),
Leaving aside that this is an argument from authority, there is also selection bias here: peer review may well not be crucial—if you happen to be of Einstein’s caliber. But: “they also laughed at Bozo the Clown.” I am sure plenty of Bozos are enraged at peer review too, for ‘unjustly’ rejecting their crap.
a construct which routinely fails anytime it’s evaluated and has been shown to be so unreliable that the same paper can be accepted or rejected based on chance?
There is a stochastic element to peer review, but in my experience it works remarkably well, given what it is. Good papers are very likely to get a fair shake and get published. I routinely get very penetrating comments that greatly improve the quality of the final paper. I almost always get help with scholarship from reviewers (e.g., “this is probably a good paper to cite”). A bigger issue I have seen is not chance but ideology from reviewers. I very occasionally get bad reviews (<5% of the time), and associate editors (the people who handle the paper and assign reviewers) are almost always helpful in such cases.
I asked you this before, gwern, how much experience with actual peer review (let’s say in applied stats journals, as that is closest to what you do) do you have?
If peer review is so good, why do so many terrible papers get published and great Nobel-prize-winning work get rejected repeatedly?
Absolute numbers are kind of useless here. Do you have some work in mind on false positive and false negative rates for peer review?
why do many communities seem to get along fine with desultory use of peer review, where it’s barely used or left as the final step long after the results have been disseminated and evaluated, and people don’t even bother to read the final peer-reviewed version
I don’t think we disagree here, I think this is a form of peer review. I routinely do this with my papers, and am asked to look over preprints by others. I think this is fine for certain types of papers (generally very specialized or very large/weighty ones).
The worry is that MIRI’s conception of what a “peer” is basically ignores the wider academic community (which has a lot of intellectual firepower), so they end up in a bubble. The other worry is that people worried about getting tenure are incentivized to be productive (albeit imperfectly), while MIRI is not incentivized to be productive except in some vague “saving the world” sense. And indeed, MIRI appears to be remarkably unproductive by academic standards. The guy who really calls the shots at MIRI, EY, has not internalized academic norms and appears to be fairly hostile to them.
I’ve said it before and I’ll say it again: ‘peer review’ is not a core element of science.
Honestly, you sound a bit angry about peer review.
Is this an “arguments as soldiers” thing? Compare an isomorphic argument: “how did medicine get done for the centuries before antibiotics.”
That’s not isomorphic. To put it bluntly, medicine didn’t. It only started becoming net beneficial extremely recently (and even now tons of medicine is harmful or a pure waste), based on copying a tremendous amount of basic science like biology and bacteriology, benefitting from others’ discoveries, and importing methodology like randomized trials (which it still chafes at), and not by importing peer review. Up until the very late 1800s or so, you would often have been better off ignoring doctors if you were, say, an expecting mother wondering whether to give birth in a hospital pre-Semmelweis. You can’t expect too much help from a field which published its first RCT in 1948 (on, incidentally, an antibiotic).
Leaving aside that this is an argument from authority,
I include it as a piquant anecdote since you seem to have no interest in looking up any of the statistical evidence on the unreliability and biases (in the statistical senses) of peer review, or the absence of any especial evidence that it works.
But: “they also laughed at Bozo the Clown.”
That is not what I am saying. I am saying, ‘if you think MIRI is Bozo the Clown, get a photograph of its leader and see if he has a red nose! See if his face is suspiciously white and the entire MIRI staff saves a remarkable amount on gas purchases because they can all fit into one small car to run their errands! Don’t deliberately look away and simply listen for the sound of laughter! That’s a terrible way of deciding!’
Good papers are very likely to get a fair shake and get published.
No, they’re not; or at the very least, you need to modify this to ‘after being forced to repeatedly try, solely thanks to the peer review process, a good paper may still finally be published’. For example, in the NIPS experiment, most accepted papers would not have been accepted given a different committee. Unsurprisingly, given the low inter-rater reliabilities for tons of things in psychology far less complicated, and the enormous variability when n = 1 or 3.
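To make the variability point concrete, here is a minimal simulation of two independent committees. It assumes a toy model (latent paper quality observed through independent reviewer noise, with the top-scoring fraction accepted); the parameter values are illustrative assumptions, not data from the NIPS experiment:

```python
import random

random.seed(0)
N_PAPERS = 10_000
ACCEPT_RATE = 0.225  # roughly the NIPS 2014 acceptance rate
NOISE = 1.0          # assumed reviewer noise, same scale as the quality spread

# Latent paper quality, standard normal.
quality = [random.gauss(0, 1) for _ in range(N_PAPERS)]

def committee_decisions(noise, accept_rate):
    """One committee: score each paper with fresh noise, accept the top fraction."""
    scores = [q + random.gauss(0, noise) for q in quality]
    cutoff = sorted(scores, reverse=True)[int(len(scores) * accept_rate)]
    return [s >= cutoff for s in scores]

a = committee_decisions(NOISE, ACCEPT_RATE)
b = committee_decisions(NOISE, ACCEPT_RATE)

accepted_by_a = sum(a)
accepted_by_both = sum(x and y for x, y in zip(a, b))
print(f"Committee B would have rejected "
      f"{100 * (1 - accepted_by_both / accepted_by_a):.0f}% of committee A's accepts.")
```

With reviewer noise merely comparable in size to the spread in true quality, roughly half of one committee’s accepted papers get rejected by the other committee, which is the same ballpark as the reported NIPS result.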
Absolute numbers are kind of useless here. Do you have some work in mind on false positive and false negative rates for peer review?
Yes, any of it. It all says that peer review is not a little stochastic but highly stochastic. This isn’t a new field by any means.
I asked you this before, gwern, how much experience with actual peer review (let’s say in applied stats journals, as that is closest to what you do) do you have?
I have little first-hand experience; my vitriol comes mostly from having read over the literature showing peer review to be highly unreliable and biased, from the unthinking respect and overestimation of it that most people give it, from being shocked at how awful many published studies are despite being ‘peer reviewed’, and from talking to researchers and learning how pervasive bias is in the process and how reviewers enforce particular cliques & theories (some politically motivated) and try to snuff opposition in the cradle.
The first represents a huge waste of time; the second hinders scientific progress directly and contributes to one of the banes of my existence as a meta-analyst, publication bias (why do we have a ‘grey literature’ in the first place?); the third is seriously annoying when trying to get most people to wake up and think a little about the research they read about (‘but it’s peer-reviewed!’); and the fourth is simply enraging, as the issue moves from an abstract, general science-wide problem to something I can directly perceive specifically harming me and my attempts to get accurate beliefs.
(Well, actually, I think my analysis of Silk Road 2 listings is supposed to be peer-reviewed, but the lead author is handling the bureaucracy, so I can’t say anything directly about how good or bad the reviewers for that journal are, aside from noting that this was a case of problem #4: the paper we were responding to is so egregiously, obviously wrong that the journal’s reviewers must have been either morons or totally ignorant of the paper topic they were supposed to be reviewing. I’m still shocked & baffled about this: how does an apparently respectable journal wind up publishing a paper claiming, essentially, that Silk Road 2 did not sell drugs? This would have been caught in a heartbeat by any kind of remotely public process—even one person who had actually used Silk Road 1 or 2 peeking in on the paper could have laughed it out of the room—but because the journal is ‘peer reviewed’… Pace the Gell-Mann Amnesia effect, it makes me wonder about all the papers published on topics I am not as knowledgeable about as I am on Silk Road 2, and whether I am still not cynical enough.)
I don’t think we disagree here, I think this is a form of peer review. I routinely do this with my papers, and am asked to look over preprints by others. I think this is fine for certain types of papers (generally very specialized or very large/weighty ones).
Yes, I have no objection to ‘peer review’ if what you mean by it is all the things I singled out as opposed to, prior to, and after the institution of peer review: having colleagues critique your work, having many other people with different perspectives & knowledge check it over and replicate it and build on it and post essays rebutting it—all this is great stuff, we both agree. I would say replication is the most important of those elements, but all have their place.
What I am attacking is the very specific formal institutional practice of journals outsourcing editorial judgment to a few selected researchers and effectively giving them veto power, a process which hardly seems calculated to yield very good results, and which does not seem to have been institutionalized because it had been rigorously demonstrated to work far better than the pre-existing alternatives. (Of course it wasn’t, any more than medical proposals at that time were routinely put through RCTs first, even though we know how many good-sounding proposals in psychology & sociology & economics & medicine go down in flames when they are rigorously tested.) Rather—to go off on a more speculative tangent here—its chief purpose was simply to make the bureaucracy of science scale to the post-WWII expansion of science as part of the Cold War/Vannevar Bush academic-military-government complex.
The worry is that MIRI’s conception of what a “peer” is basically ignores the wider academic community (which has a lot of intellectual firepower), so they end up in a bubble.
If this is the problem with MIRI, I think there are far more informative ways to criticize them. For example, I don’t think you need to rely on any proxies or filters: you should be able to evaluate their work directly and form your own critique of whether it’s any good or if it seems like a good research avenue for their stated goals.
Honestly, you sound a bit angry about peer review.
Science is srs bsns. (I find it hard to see why other people can’t get worked up over things like publication bias or aging or p-hacking. They’re a lot more important than the latest outrage du jour. This stuff matters!)
That’s not isomorphic. To put it bluntly, medicine didn’t.
Medicine was often harmful in the past, with some occasional parts that helped, e.g. amputating gangrenous limbs was dangerous and people died, but probably was still a benefit on net. Admiral Nelson had multiple surgeries and was in serious danger of infection and death afterwards, but he would have been a goner for sure without surgery.
Science was pretty similar: it was mostly nonsense with occasional islands of sense. It didn’t really get underway until, what, Francis Bacon wrote about biases and empiricism? That is not very long ago. The early “gentlemen scholars” all did informal peer review by sending their stuff to each other (they also hid discoveries from each other due to competition and egos, but that happens today too).
you seem to have no interest...
Gwern, peer review is my life. My tenure case will be decided by peer review, ultimately. I do peer review myself as a service, constantly. I know all about peer review.
get a photograph of its leader and see if he has a red nose!
The burden of proof is on MIRI, not on me. MIRI is the one that wants funding and people to save the world. It’s up to MIRI to use all available financial and intellectual resources out there, which includes engaging with academia.
I have little first-hand experience; my vitriol comes mostly from having read over the literature showing peer review to be highly unreliable and biased, from the unthinking respect and overestimation of it that most people give it, from being shocked at how awful many published studies are despite being ‘peer reviewed’, and from talking to researchers and learning how pervasive bias is in the process and how reviewers enforce particular cliques & theories (some politically motivated) and try to snuff opposition in the cradle.
I really think you should moderate your criticism of peer review. Peer review for data analysis papers is very different from peer review for mathematics or theoretical physics. Fields are different and have vastly different cultural norms. Even in the same field, different conferences/journals may have different norms.
I find it hard to see why other people can’t get worked up over things like publication bias or aging or p-hacking.
I do a lot of theory. When I do data analysis, my collabs and I try to lead by example. What is the point of being angry? Angry outsiders just make people circle the wagons.
Admiral Nelson had multiple surgeries and was in serious danger of infection and death afterwards, but he would have been a goner for sure without surgery.
This argument seems exactly identical to the argument for trepanning, even including the survivorship bias. (One of the suspected uses of trepanning was to revive people otherwise thought dead.)
While we’re looking at anecdotes, this bit of Nelson’s experience with surgery seems relevant:
Although surgeons had been unable to remove the central ligature in his amputated arm, which had caused considerable inflammation and poisoning, in early December it came out of its own accord and Nelson rapidly began to recover.
I’m not sure I’d count that as a win for surgery, or evidence that he couldn’t have survived without it!
Gwern, peer review is my life. My tenure case will be decided by peer review, ultimately. I do peer review myself as a service, constantly. I know all about peer review.
But this means that, unless you’re particularly good at distancing yourself from your work, you should expect to be worse at judging it than a disinterested observer. The classic anecdote about “which half?” comes to mind, or the reaction of other obstetricians to Semmelweis’s concerns.
Regardless, we would expect that, if studies are better than anecdotes, studies on peer review will outperform anecdotes on peer review, right?
This argument seems exactly identical to the argument for trepanning, even including the survivorship bias.
It’s not identical, because we know, with the benefit of hindsight, that amputating potentially gangrenous limbs is a good idea. The folks in the past had solid empirical basis for amputations, even if they did not fully understand gangrene. Medicine was mostly, but not always, nonsense in the past. A lot of the stuff was not based on the scientific method, because they had no scientific method. But there were isolated communities that came up with sensible things for sensible reasons. This is one case when standard practices were sensible (there are other isolated examples, e.g. honey to disinfect wounds).
But this means that, unless you’re particularly good at distancing yourself from your work, you should expect to be worse at judging it than a disinterested observer.
studies on peer review will outperform anecdotes on peer review, right?
Ok, but isn’t this “incentive tennis”? Gwern’s incentives are clearer than mine here—he’s not a mainstream academic, so he loses out on status. So a “low motive” interpretation of the argument is: “your status castle is built on sand, tear it down!” Gwern is also pretty angry. Are we going to stockpile argument ammunition of the form “you are more biased when evaluating peer review because of [X]”?
For me, peer review is a double-edged sword—I get papers rejected sometimes, and at other times I get silly reviewer comments, or editors that make me spend years revising. I have a lot of data both ways. The point with peer review is I sleep better at night due to extra sanity checking. Who sanity-checks MIRI’s whiteboard stuff?
A “low motive” argument for me would be “keep peer review, but have it softball all my papers, they are obviously so amazing why can’t you people see that!”
A “low motive” argument for MIRI would be “look buddy, we are trying to save the world here, we don’t have time for your flawed human institutions. Don’t you worry about our whiteboard content, you probably don’t know enough math to understand it anyway.” MIRI is doing pretty theoretical decision theory. Is that a good idea? Are they producing enough substantive work? In standard academia, peer review would help with the former question, and answering to the grant agency and tenure pressure would help with the latter. These are not perfect incentives, but they are there. Right now there are absolutely no guardrails in place preventing MIRI from going off the deep end.
Your argument basically says not to trust domain experts; that’s the opposite of what should be done.
Gwern also completely ignores effect modification (e.g. the practice of evaluating conditional effects after conditioning on things like paper topic). Peer review cultures for empirical social science papers and for theoretical physics papers basically have nothing to do with each other.
The folks in the past had solid empirical basis for amputations, even if they did not fully understand gangrene.
I would put the start of solid empirical basis for gangrene treatment at Middleton Goldsmith during the American Civil War (dropping mortality from 45% to 3%), about sixty years after Nelson.
This is one case when standard practices were sensible (there are other isolated examples, e.g. honey to disinfect wounds).
I think this is putting too much weight on superficial resemblance. Yes, gangrene treatment from Goldsmith to today involves amputation. But that does not mean amputation pre-Goldsmith actually decreased mortality over no treatment! My priors are pretty strong that it would increase it, but going into details on my priors is perhaps a digression. (The short version is that I take a very Hansonian view of medicine and its efficacy.) I’m not aware of (but would greatly appreciate) any evidence on that question.
(To see where I’m coming from, consider that there is a reference class that contains both “trepanning” and “brain surgery” that seems about as natural as the reference class that includes amputation before and after Goldsmith.)
The point with peer review is I sleep better at night due to extra sanity checking.
But this only makes sense if peer review actually improves the quality of studies. Do you believe that’s the case, and if so, why?
Your argument basically says not to trust domain experts; that’s the opposite of what should be done.
I think my argument is domain expert tennis. That is, I think that in order to evaluate whether or not peer review is effective, we shouldn’t ask scientists who use peer review, we should ask scientists who study peer review. Similarly, in order to determine whether a treatment is effective, we shouldn’t ask the users of the treatment, but statisticians. If you go down to the church/synagogue/mosque, they’ll say that prayer is effective, and they’re obviously the domain experts on prayer. I’m just applying the same principles and same level of skepticism.
Gwern also completely ignores effect modification (e.g. the practice of evaluating conditional effects after conditioning on things like paper topic). Peer review cultures for empirical social science papers and for theoretical physics papers basically have nothing to do with each other.
I am not sure what the relevance of either of these is. If anything, the latter suggests that we need to make the case for peer review field by field, and so proponents have an even harder time than they do without that claim!
I would put the start of solid empirical basis for gangrene treatment at Middleton Goldsmith during the American Civil War (dropping mortality from 45% to 3%), about sixty years after Nelson.
I think treating gangrene by amputation was well known in the ancient world. Depending on how you deal w/ hemorrhage/complications you would have a pretty high post-surgery mortality rate, but the point is, it is still an improvement on gangrene killing you for sure.
Actually, while I didn’t look into this, I expect Jewish and Greek surgeons would have been pretty good compared to medieval European ones.
But that does not mean amputation pre-Goldsmith actually decreased mortality over no treatment!
I don’t have data from the ancient world :). But mortality from gangrene, if you leave the dead tissue in place, is what, >95%? Amputation didn’t have to be perfect or even very good; it merely had to do better than an almost certain death sentence.
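The structure of that argument is just a base-rate comparison; here is the back-of-the-envelope version, with both mortality figures being loudly hypothetical assumptions (as noted above, there is no ancient-world data for either arm):

```python
# Both figures are hypothetical assumptions, for illustration only.
untreated_mortality = 0.95  # assumed: gangrene left in place is near-certain death
surgical_mortality = 0.70   # assumed: crude amputation with heavy complication losses

survival_ratio = (1 - surgical_mortality) / (1 - untreated_mortality)
print(f"Survival probability improves {survival_ratio:.0f}x "
      f"even with a 70%-fatal surgery.")  # prints 6x
```

The point being that the treatment doesn’t need to be good in absolute terms; it only needs to beat the baseline it replaces.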
Do you believe that’s the case, and if so, why?
Well, because peer review would do things like say “your proof has a bug,” “you didn’t cite this important paper,” “this is a very minor modification of [approach].” Peer review in my case is a social institution where smart, knowledgeable people read my stuff.
You can say that’s heavily confounded by your field, the types of papers you write (or review), etc., and I agree! But that is of little relevance to gwern; he thinks the whole thing needs to be burned to the ground.
If anything, the latter suggests that we need to make the case for peer review field by field, and so proponents have an even harder time than they do without that claim!
Not following. The claim “peer review sucks for all X” is stronger than the claim “peer review sucks for some X.” The person making the stronger claim will have a harder time demonstrating it than the person making the weaker claim. So as a status quo defender, I have an easier time attacking the stronger claim.
I think treating gangrene by amputation was well known in the ancient world. Depending on how you deal w/ hemorrhage/complications you would have a pretty high post-surgery mortality rate, but the point is, it is still an improvement on gangrene killing you for sure.
I think you missed the meat of my claim; yes, al-Zahrawi said to amputate as a response to gangrene, but that is not a solid empirical basis, and as a result it is not obvious that it actually extended lifespans on net. We don’t have the data to verify, and we don’t have reason to trust their methodology.
Now, maybe gangrene is a case where we can move away from priors on whether archaic surgery was net positive or net negative based on inside view reasoning. I’m not a doctor or a medical historian, and the one place I can think of to look for data (homeopathic treatment of gangrene) doesn’t seem to have any sort of aggregated data, just case reports of survival. Perhaps an actual medical historian could determine it one way or the other, or come up with a better estimate of the survival rate. But my guess is that 95% is a very high estimate.
You can say that’s heavily confounded by your field, the types of papers you write (or review), etc., and I agree!
I could, but why? I’ll simply point out that that is not science, and that it’s not even trying to be science. It’s raw good intentions.
But that is of little relevance to gwern; he thinks the whole thing needs to be burned to the ground.
Suppose that the person on the street thinks that price caps on food are a good idea, because it would be morally wrong to gouge on necessities and the poor deserve to be able to afford to eat. Then someone comes along and points out that the frequent queues, or food shortages, or starvation, are a consequence of this policy, regardless of the policy’s intentions.
The person on the street is confused—but food being cheap is a good thing, why is this person so angry about price caps? They’re angry because of the difference between perception of policies and their actual consequences.
So as a status quo defender,
The claim I saw you as making is that peer review’s efficacy in field x is unrelated to its efficacy in field y. If true, that makes it harder for either of us to convince the other in either direction. I, with the null hypothesis that peer review does not add scientific value, would need to be convinced of peer review’s efficacy in every field separately. The situation is symmetric for you: your null hypothesis that peer review adds scientific value would need to be defeated in every field separately.
Now, whether or not our null hypothesis should be efficacy or lack of efficacy is a key component of this whole debate. How would you go about arguing that, say, to someone who believed that prayer caused rain?
I think you missed the meat of my claim; yes, al-Zahrawi said to amputate as a response to gangrene, but that is not a solid empirical basis
Why do you suppose he said this? People didn’t have Bacon’s method, but people had eyes, and accumulated experience. Neolithic people managed, over time, to figure out how all the useful plants in their biome are useful; how did they do it without science? “Science” isn’t this thing that came on a beam of light once Bacon finished his writings. Humans had bits and pieces of science right for a long time (heck, my favorite citation is a two-arm nutrition trial in the Book of Daniel in the Old Testament).
Now, maybe gangrene is a case
We can ask a doc, but I am pretty sure post-wound gangrene is basically fatal if untreated.
I’ll simply point out that that is not science
What is not science? My direct experience with peer review? “Science” is a method you use to tease things out from a disinterested Nature that hides the mechanism, but spits data at you. If you had direct causal access to a system, you would examine it directly. If I have a computer program on my laptop, I am not going to “do science” to it, I am going to look at it and see what it does.
Note that I am only talking about peer review I am familiar with. I am not making claims about social psychology peer review, because I don’t live in that world. It might be really bad—that’s for social psychologists to worry about. In fact, they are doing a lot of loud soul searching right now: system working as intended. The misdeeds of social psychology don’t really reflect on me or my field, we have our own norms. My only intersection with social psychology is me supplying them with useful mediation methodology sometimes.
I expect gwern’s policy of being really angry on the internet is going to have either a zero effect or a mildly negative effect on the problem.
consequence of this policy
The consequence of peer review for me, on the receiving end, is that people generally improve my paper (and are sometimes picky for silly reasons). The consequence for me, on the giving side, is that I reject shitty papers and make good and marginal papers better. I don’t need to “do science” to know this; I can just look at my pre-peer-review and post-peer-review drafts, for instance. Or I can show you that the paper I rejected had an invalid theorem in it.
The claim I saw you as making is that peer review’s efficacy in field x is unrelated to its efficacy in field y.
I am making the claim that people who want to burn the whole system to the ground need to realize that academia is very large, and has very different social norms in different corners. A unified criticism isn’t really possible. Egregious cases of peer review are not hard to find, but that’s neither here nor there.
Why do you suppose he said this? People didn’t have Bacon’s method, but people had eyes, and accumulated experience.
Sure. I think al-Zharawi got observational evidence, but I think that there are systematic defects in how humans collect data from observation, which makes observational judgments naturally suspect. That is, I’m happy to take “al-Zharawi says X” as a good reason to promote X as a hypothesis worthy of testing, but I am more confident in reality’s entanglement with test results than proposed hypotheses.
“Science” isn’t this thing that came on a beam of light once Bacon finished his writings. Humans had bits and pieces of science right for a long time (heck, my favorite citation is a two-arm nutrition trial in the Book of Daniel in the Old Testament).
I very much agree that science is some combination of methodology and principles which was gradually discovered by humans, and categorically unlike revealed knowledge, whose core goal is the creation of maps that describe the territory as closely and correctly as possible. (To be clear, science in this view is not ‘having that goal,’ but actions and principles that actually lead to achieving that goal.)
We can ask a doc, but I am pretty sure post-wound gangrene is basically fatal if untreated.
I asked history.stackexchange; we’ll see if that produces anything useful. Asking doctors is also a good idea, but I don’t have as easy an in for that.
What is not science? My direct experience with peer review?
Not quite—what I had in mind as “not science” was treating your direct experience with peer review, and your evaluation of its intentions, as a scientific case for peer review.
Note that I am only talking about peer review I am familiar with.
Right now, sure, but we got onto this point because you thought not publishing with peer review means we can’t be sure MIRI isn’t wasting donor money, which makes sense primarily if we’re confident in peer review in MIRI’s field.
I expect gwern’s policy of being really angry on the internet is going to have either a zero effect or a mildly negative effect on the problem.
Eh. While I agree that being angry on the internet is unsightly, it’s not obvious to me that it’s ineffective at accomplishing useful goals.
I am making the claim that people who want to burn the whole system to the ground need to realize that academia is very large, and has very different social norms in different corners.
“Whole system” seems unclear. It’s pretty obvious to me that gwern wants to kill a specific element for solid reasons, as evidenced by the following quotes:
What makes science work is replication and putting your work out there for community evaluation. Those are the real review by peers. …
Yes, I have no objection to ‘peer review’ if what you mean by it is all the things I singled out as opposed to, prior to, and after the institution of peer review: having colleagues critique your work, having many other people with different perspectives & knowledge check it over and replicate it and build on it and post essays rebutting it—all this is great stuff, we both agree. I would say replication is the most important of those elements, but all have their place.
What I am attacking is the very specific formal institutional practice of journals outsourcing editorial judgment to a few selected researchers and effectively giving them veto power, a process which hardly seems calculated to yield very good results, and which does not seem to have been institutionalized because it had been rigorously demonstrated to work far better than the pre-existing alternatives
I am making the claim that people who want to burn the whole system to the ground need to realize that academia is very large, and has very different social norms in different corners. A unified criticism isn’t really possible.
Would you agree that some parts of the system should be burned to the ground?
Peer review seems like a form of costly signalling. If you pass peer review, it only demonstrates that you have the ability to pass peer review. On the other hand, if you don’t pass peer review, it signals that you don’t have even this ability. (If so much crap passes peer review, why doesn’t your research? Is it even worse than the usual crap?)
This is why I recommend treating “peer review” simply as a hoop you have to jump through; otherwise, people will bother you about it endlessly. Jumping through it removes the suspicion that your research is even worse than the stuff that already gets published.
The way you usually know is via peer review—e.g. other people previously declared to have produced good things declare that MIRI produces good things.
I think this isn’t really cutting to the heart of things—which seems to be ‘reputation among intellectuals,’ which is related to ‘reputation among academia,’ which is related to ‘journal articles survive the peer review process.’ It seems to me that the peer review process as it exists now is a pretty terrible way of capturing reputation among intellectuals, and that we could do something considerably better with the technology we have now.
I imagine a system where new Sciencecoins could be mined by doing valid scientific research, but then they could be used as a usual cryptocurrency. That would also solve the problem of funding research. :D
the first thing you have to do is check the new inside-view information available and see what’s really going on.
Isn’t it “cultish” to assume that an organization could do anything better than the high-status Academia? :P
Because many people seem to worry about publishing, I would probably treat it as another form of PR. PR is something that is not your main reason to exist, but you do it anyway, to survive socially. Maximizing academic article production seems to fit here: it is not MIRI’s goal, but it would help to get MIRI accepted (or maybe not), and it would be good for advertising.
Therefore, AcademiaPR should be a separate department of MIRI, but it definitely should exist. It could probably be done by one person. The job of the person would be to maximize MIRI-related academic articles, without making it too costly for the organization.
One possible method that didn’t require even five minutes of thinking: find smart university students who are interested in MIRI’s work but want to stay in academia. Invite them to MIRI’s workshops and make them familiar with the work MIRI is doing but doesn’t care to publish. Then offer to make them co-authors: they take the ideas, polish them, and get them published in academic journals. MIRI gets publications, the students get a new partially explored topic to write about; win/win. Also known as “division of labor”.
But PR also plays a role here, and this is how to fix it relatively cheaply. And it would also provide feedback about what people outside of MIRI think about MIRI’s research.
I think the primary purpose of peer review isn’t PR, but sanity checking. Peer reviewed publications shouldn’t be a concession to outsiders, but the primary means of getting work done.
One dictionary definition of academia is “the environment or community concerned with the pursuit of research, education, and scholarship.” By this definition MIRI is already part of academia. It’s just a separate academic island with tenuous links to the broader academic mainland.
MIRI is a research organization. If you maintain that it is outside of academia then you have to explain what exactly makes it different, and why it should be immune to the pressures of publishing.
If you measure organizations on the basis of how many publications they make, you’re going to get a lot of low-quality publications
Low-quality publications don’t get accepted and published. I know of no universities that would rather have a lot of third-rate publications than a small number of Nature publications. I’ll agree with you that things like impact factor aren’t good metrics, but that’s somewhat missing the point here.
It seems like a lot of focus on MIRI giving good signals to outsiders. The “publish or perish” treadmill of academia is exactly why privately funded organizations like MIRI are needed.
The things that su3su2u1 wants MIRI to be already exist in academia. The whole point of MIRI is to create an organization of a type that doesn’t currently exist, focused on much longer term goals. If you measure organizations on the basis of how many publications they make, you’re going to get a lot of low-quality publications. Citations are only slightly better, especially if you’re focused on ignored areas of research.
If you have outside-view criticisms of an organization and you’re suddenly put in charge of them, the first thing you have to do is check the new inside-view information available and see what’s really going on.
Ever since I started hanging out on LW and working on UDT-ish math, I’ve been telling SIAI/MIRI folks that they should focus on public research output above all else. (Eliezer’s attitude back then was the complete opposite.) Eventually Luke came around to that point of view, and things started to change. But that took, like, five years of persuasion from me and other folks.
After reading su3su2u1′s post, I feel that growing closer to academia is another obviously good step. It’ll happen eventually, if MIRI is to have an impact. Why wait another five years to start? Why not start now?
+1
Just because MIRI researchers’ incentives aren’t distorted by “publish or perish” culture, it doesn’t mean they aren’t distorted by other things, especially those that are associated with lack of feedback and accountability.
If MIRI doesn’t publish reasonably frequently (via peer review), how do you know they aren’t wasting donor money? Donors can’t evaluate their stuff themselves, and MIRI doesn’t seem to submit a lot of stuff to peer review.
How do you know they aren’t just living it up in a very expensive part of the country doing the equivalent of freshman philosophizing in front of the white board. The way you usually know is via peer review—e.g. other people previously declared to have produced good things declare that MIRI produces good things.
How did science get done for the centuries before peer review? Why do you place such weight on such a recently invented construct like peer review (you may remember Einstein being so enraged by the first and only time he tried out this new thing called ‘peer review’ that he vowed to never again submit anything to a ‘peer reviewed’ journal), a construct which routinely fails anytime it’s evaluated and has been shown to be extremely unreliable where the same paper can be accepted and rejected based on chance? If peer-review is so good, why do so many terrible papers get published and great Nobel-prize-winning work rejected repeatedly? If peer review is such an effective method of divining quality, why do many communities seem to get along fine with desultory use of peer review where it’s barely used or left as the final step long after the results have been disseminated and evaluated and people don’t even bother to read the final peer-reviewed version (particularly in economics, I get the impression that everyone reads the preprints & working papers and the final publication comes as a non-event; which has caused me serious trouble in the past in trying to figure out what to cite and whether one cite is the same as another; and of course, I’m not always clear on where various statistics or machine learning papers get published, or if they are published in any sense beyond posting to ArXiv)? And why does all the real criticism and debate and refutations seem to take place on blogs & Twitter if peer-review is such an acid test of whether papers are gold or dross, leading to the growing need for altmetrics and other ways of dealing with the ‘post-publication peer review’ problem as journals increasingly fail to reflect where scientific debates actually are?
I’ve said it before and I’ll said it again: ‘peer review’ is not a core element of science. It’s barely even peripheral and unclear it adds anything on net. For the most part, calls for ‘peer review’ are cargo culting. What makes science work is replication and putting your work out there for community evaluation. Those are the real review by peers.
If you are a donor who wants to evaluate MIRI, whether some arbitrary reviewers pass or fail its papers is not very important. There are better measures of impact: is anyone building on their work? have MIRI-specific claims begun filtering out? are non-affiliated academics starting to move into the AI risk field? Heck, even citation counts would probably be better here.
Is this an “arguments as soldiers” thing? Compare an isomorphic argument: “how did medicine get done for the centuries before antibiotics.”
Leaving aside that this an argument from authority, there is also selection bias here: peer review may well not be crucial—if you happen to be of Einstein’s caliber. But: “they also laughed at Bozo the Clown.” I am sure plenty of Bozos are enraged at peer review too, unjustly rejecting their crap.
There is a stochastic element to peer review, but in my experience it works remarkably well, given what it is. Good papers are very likely to get a fair shake and get published. I routinely get very penetrating comments that greatly improve the quality of the final paper. I almost always get help with scholarship from reviewers (e.g. this is probably a good paper to cite.) A bigger issue I saw was not chance, but ideology from reviewers. I very occasionally get bad reviews (<5% chance) and associate editors (people who handle the paper and assign reviewers) are almost always helpful in such cases.
I asked you this before, gwern, how much experience with actual peer review (let’s say in applied stats journals, as that is closest to what you do) do you have?
Absolute numbers are kind of useless here. Do you have some work in mind on false positive and false negative rates for peer review?
I don’t think we disagree here, I think this is a form of peer review. I routinely do this with my papers, and am asked to look over preprints by others. I think this is fine for certain types of papers (generally very specialized or very large/weighty ones).
The worry is MIRI’s conception of what a “peer” is basically ignores the wider academic community (which has a lot of intellectual firepower), so they end up in a bubble. The other worry is people who worry about getting tenured are incentivized to be productive (albeit imperfectly). MIRI is not incentivized to be productive except in some vague “saving the world” sense. And indeed, MIRI appears to be remarkably unproductive by academic standards. The guy who really calls the shots at MIRI, EY, has not internalized academic norms and appears to be fairly hostile to them.
Honestly, you sound a bit angry about peer review.
That’s not isomorphic. To put it bluntly, medicine didn’t. It only started becoming net beneficial extremely recently (and even now tons of medicine is harmful or a pure waste), based on copying a tremendous amount of basic science like biology and bacteriology and benefitting from others’ discoveries, and importing methodology like randomized trials (which it still chafes at) and not by importing peer review. Up until the very late 1800s or so, you would have been better off often ignoring doctors if you were, say, an expecting mother wondering whether to give birth in a hospital pre-Semmelweiss. You can’t expect too much too much help from a field which published its first RCT in 1948 (on, incidentally, an antibiotic).
I include it as a piquant anecdote since you seem to have no interest in looking up any of the statistical evidence on the unreliability and biases (in the statistical senses) of peer review, or the absence of any especial evidence that it works.
That is not what I am saying. I am saying, ‘if you think MIRI is Bozo the Clown, get a photograph of its leader and see if he has a red nose! See if his face is suspiciously white and the entire MIRI staff saves a remarkable amount on gas purchases because they can all fit into one small car to run their errands! Don’t deliberately look away and simply listen for the sound of laughter! That’s a terrible way of deciding!’
No, they’re not, or at the very least, you need to modify this to, ‘after being forced to repeatedly try solely thanks to the peer review process, a good paper may still finally be published’. For example, in the NIPS experiment, most accepted papers would not have been accepted given a different committee. Unsurprisingly! given low inter-rater reliabilities for tons of things in psychology far less complicated, and enormous variability when n=1 or 3.
Yes, any of it. They all say that peer review is not a little but highly stochastic. This isn’t a new field by any means.
I have little first-hand experience; my vitriol comes mostly from having read over the literature showing peer-review to be highly unreliable, and biased, from the unthinking respect and overestimation of it that most people give it, being shocked at how awful many published studies are despite being ‘peer reviewed’, and from talking to researchers and learning how pervasive bias is in the process and how reviewers enforce particular cliques & theories (some politically-motivated) and try to snuff opposition in the cradle.
The first represents a huge waste of time; the second hinders scientific progress directly and contributes to one of the banes of my existence as a meta-analyst, publication bias (why do we have a ‘grey literature’ in the first place?); the third is seriously annoying in trying to get most people to wake up and think a little about the research they read about (‘but it’s peer-reviwed!’); and the fourth is simply enraging as the issue moves from an abstract, general science-wide problem to something I can directly perceive specifically harming me and my attempts to get accurate beliefs.
(Well, actually I think my analysis of Silk Road 2 listings is supposed to be peer-reviewed, but the lead author is handling the bureaucracy so I can’t say anything directly about how good or bad the reviewers for that journal are, aside from noting that this was a case of problem #4: the paper we were responding too is so egregiously, obviously wrong that the journal’s reviewers must have either been morons or totally ignorant of the paper topic they were supposed to be reviewing. I’m still shocked & baffled about this: how does an apparently respectable journal wind up publishing a paper claiming, essentially, that Silk Road 2 did not sell drugs? This would have been caught in a heartbeat by any kind of remotely public process—even one person who had actually used Silk Road 1 or 2 peeking in on the paper could have laughed it out of the room—but because the journal is ‘peer reviewed’… Pace the Gell-Man Effect, it makes me wonder about all the papers published about topics I am not so knowledgeable about as I am on Silk Road 2 and wonder if I am still not cynical enough.)
Yes, I have no objection to ‘peer review’ if by what you mean is all the things I singled out as opposed to, and prior to, and afterwards, the institution of peer review: having colleagues critique your work, having many other people with different perspectives & knowledge check it over and replicate it and build on it and post essays rebutting it—all this is great stuff, we both agree. I would say replication is the most important of those elements, but all have their place.
What I am attacking is the very specific formal institutional practice of journals outsourcing editorial judgment to a few selected researchers and effectively giving them veto power, a process which hardly seems calculated to yield very good results and which does not seem to have been institutionalized because it has been rigorously demonstrated to work far better than the pre-existing alternatives (which of course it wasn’t, any more than medical proposals at that time were routinely put through RCTs first, even though we know how many good-sounding proposals in psychology & sociology & economics & medicine go down in flames when they are rigorously tested), but—to go off on a more speculative tangent here—whose chief purpose was to simply make the bureaucracy of science scale to the post-WWII expansion of science as part of the Cold War/Vannevar Bush academic-military-government complex.
If this is the problem with MIRI, I think there are far more informative ways to criticize them. For example, I don’t think you need to rely on any proxies or filters: you should be able to evaluate their work directly and form your own critique of whether it’s any good or if it seems like a good research avenue for their stated goals.
Science is srs bsns. (I find it hard to see why other people can’t get worked up over things like publication bias or aging or p-hacking. They’re a lot more important than the latest outrage du jour. This stuff matters!)
Medicine was often harmful in the past, with some occasional parts that helped, e.g. amputating gangrenous limbs was dangerous and people died, but probably was still a benefit on net. Admiral Nelson had multiple surgeries and was in serious danger of infection and death afterwards, but he would have been a goner for sure without surgery.
Science was pretty similar, it was mostly nonsense with occasional islands of sense. It didn’t really get underway until, what, Francis Bacon wrote about biases and empiricism? That is not very long ago. The early “gentlemen scholars” all did informal peer review by sending their stuff to each other (they also hid discoveries from each other due to competition and egos, but this stuff happens today too).
Gwern, peer review is my life. My tenure case will be decided by peer review, ultimately. I do peer review myself as a service, constantly. I know all about peer review.
The burden of proof is on MIRI, not on me. MIRI is the one that wants funding and people to save the world. It’s up to MIRI to use all available financial and intellectual resources out there, which includes engaging with academia.
I really think you should moderate your criticism of peer review. Peer review for data analysis papers is very different from peer review for mathematics or theoretical physics. Fields are different and have vastly different cultural norms. Even in the same field, different conferences/journals may have different norms.
I do a lot of theory. When I do data analysis, my collabs and I try to lead by example. What is the point of being angry? Angry outsiders just make people circle the wagons.
This argument seems exactly identical to the argument for trepanning, even including the survivorship bias. (One of the suspected uses of trepanning was to revive people otherwise thought dead.)
While we’re looking at anecdotes, this bit of Nelson’s experience with surgery seems relevant:
I’m not sure I’d count that as a win for surgery, or evidence that he couldn’t have survived without it!
But this means that, unless you’re particularly good at distancing yourself from your work, you should expect to be worse at judging it than a disinterested observer. The classic anecdote about “which half?” comes to mind, or the reaction of other obstetricians to Semmelweis’s concerns.
Regardless, we would expect that, if studies are better than anecdotes, studies on peer review will outperform anecdotes on peer review, right?
It’s not identical because we know, with benefit of hindsight, that amputating potentially gangrenous limbs is a good idea. The folks in the past had solid empirical basis for amputations, even if they did not fully understand gangrene. Medicine was mostly, but not always nonsense in the past. A lot of the stuff was not based on the scientific method, because they had no scientific method. But there were isolated communities that came up with sensible things for sensible reasons. This is one case when standard practices were sensible (there are other isolated examples, e.g. honey to disinfect wounds).
Ok, but isn’t this “incentive tennis?” Gwern’s incentives are clearer than mine here—he’s not a mainstream academic, so he loses out on status. So a “low motive” interpretation of the argument is: “your status castle is built on sand, tear it down!” Gwern is also pretty angry. Are we going to stockpile argument ammunition [X] of the form “you are more biased when evaluating peer review because of [X]”?
For me, peer review is a double-edged sword: sometimes I get papers rejected, and at other times I get silly reviewer comments, or editors who make me spend years revising. I have a lot of data both ways. The point of peer review is that I sleep better at night because of the extra sanity checking. Who sanity-checks MIRI’s whiteboard stuff?
A “low motive” argument for me would be “keep peer review, but have it softball all my papers, they are obviously so amazing why can’t you people see that!”
A “low motive” argument for MIRI would be “look buddy, we are trying to save the world here, we don’t have time for your flawed human institutions. Don’t you worry about our whiteboard content, you probably don’t know enough math to understand it anyways.” MIRI is doing pretty theoretical decision theory. Is that a good idea? Are they producing enough substantive work? In standard academia, peer review would help with the former question, and answering to the grant agency and tenure pressure would help with the latter. These are not perfect incentives, but they are there. Right now there are absolutely no guard rails preventing MIRI from going off the deep end.
Your argument basically says not to trust domain experts; that’s the opposite of what should be done.
Gwern also completely ignores effect modification (e.g. the practice of evaluating conditional effects after conditioning on things like paper topic). Peer review cultures for empirical social science papers and for theoretical physics papers basically have nothing to do with each other.
I would put the start of solid empirical basis for gangrene treatment at Middleton Goldsmith during the American Civil War (dropping mortality from 45% to 3%), about sixty years after Nelson.
I think this is putting too much weight on superficial resemblance. Yes, gangrene treatment from Goldsmith to today involves amputation. But that does not mean amputation pre-Goldsmith actually decreased mortality relative to no treatment! My priors are pretty strong that it would have increased it, but going into detail on my priors is perhaps a digression. (The short version is that I take a very Hansonian view of medicine and its efficacy.) I’m not aware of (but would greatly appreciate) any evidence on that question.
(To see where I’m coming from, consider that there is a reference class that contains both “trepanning” and “brain surgery” that seems about as natural as the reference class that includes amputation before and after Goldsmith.)
But this only makes sense if peer review actually improves the quality of studies. Do you believe that’s the case, and if so, why?
I think my argument is domain expert tennis. That is, I think that in order to evaluate whether or not peer review is effective, we shouldn’t ask scientists who use peer review; we should ask scientists who study peer review. Similarly, in order to determine whether a treatment is effective, we shouldn’t ask the users of the treatment, but statisticians. If you go down to the church/synagogue/mosque, they’ll say that prayer is effective, and they’re obviously the domain experts on prayer. I’m just applying the same principles and the same level of skepticism.
I am not sure what the relevance of either of these is. If anything, the latter suggests that we need to make the case for peer review field by field, and so proponents have an even harder time than they would without that claim!
I think treating gangrene by amputation was well known in the ancient world. Depending on how you deal with hemorrhage and complications you would have a pretty high post-surgery mortality rate, but the point is, it is still an improvement on gangrene killing you for sure.
Actually, while I didn’t look into this, I expect Jewish and Greek surgeons would have been pretty good compared to medieval European ones.
I don’t have data from the ancient world :). But mortality from gangrene if you leave the dead tissue in place is what, >95%? Amputation didn’t have to be perfect or even very good; it merely had to do better than an almost certain death sentence.
Well, because peer review would do things like say “your proof has a bug,” “you didn’t cite this important paper,” “this is just a very minor modification of [approach].” Peer review in my case is a social institution where smart, knowledgeable people read my stuff.
You can say that’s heavily confounded by your field, the types of papers you write (or review), etc., and I agree! But that is of little relevance to gwern, he thinks the whole thing needs to be burned to the ground.
Not following. The claim “peer review sucks for all X” is stronger than the claim “peer review sucks for some X.” The person making the stronger claim will have a harder time demonstrating it than the person making the weaker claim. So as a status quo defender, I have an easier time attacking the stronger claim.
I think you missed the meat of my claim; yes, al-Zahrawi said to amputate in response to gangrene, but that is not a solid empirical basis, and as a result it is not obvious that amputation actually extended lifespans on net. We don’t have the data to verify, and we don’t have reason to trust their methodology.
Now, maybe gangrene is a case where we can move away from priors on whether archaic surgery was net positive or net negative based on inside view reasoning. I’m not a doctor or a medical historian, and the one place I can think of to look for data (homeopathic treatment of gangrene) doesn’t seem to have any sort of aggregated data, just case reports of survival. Perhaps an actual medical historian could determine it one way or the other, or come up with a better estimate of the survival rate. But my guess is that 95% is a very high estimate.
I could, but why? I’ll simply point out that is not science, and that it’s not even trying to be science. It’s raw good intentions.
Suppose that the person on the street thinks that price caps on food are a good idea, because it would be morally wrong to gouge on necessities and the poor deserve to be able to afford to eat. Then someone comes along and points out that the frequent queues, or food shortages, or starvation, are a consequence of this policy, regardless of the policy’s intentions.
The person on the street is confused: food being cheap is a good thing, so why is this person so angry about price caps? The critic is angry because of the gap between the perception of policies and their actual consequences.
The claim I saw you as making is that peer review’s efficacy in field x is unrelated to its efficacy in field y. If true, that makes it harder for either of us to convince the other in either direction. I, with the null hypothesis that peer review does not add scientific value, would need to be convinced of peer review’s efficacy in every field separately. The situation is symmetric for you: your null hypothesis that peer review adds scientific value would need to be defeated in every field separately.
Now, whether or not our null hypothesis should be efficacy or lack of efficacy is a key component of this whole debate. How would you go about arguing that, say, to someone who believed that prayer caused rain?
Why do you suppose he said this? People didn’t have Bacon’s method, but people had eyes, and accumulated experience. Neolithic people managed, over time, to figure out how all the useful plants in their biome are useful; how did they do it without science? “Science” isn’t this thing that arrived on a beam of light once Bacon finished his writings. Humans had bits and pieces of science right for a long time (heck, my favorite citation is a two-arm nutrition trial in the Book of Daniel in the Old Testament).
We can ask a doc, but I am pretty sure post-wound gangrene is basically fatal if untreated.
What is not science? My direct experience with peer review? “Science” is a method you use to tease things out from a disinterested Nature that hides the mechanism but spits data at you. If you had direct causal access to a system, you would examine it directly. If I have a computer program on my laptop, I am not going to “do science” to it; I am going to look at it and see what it does.
Note that I am only talking about the peer review I am familiar with. I am not making claims about social psychology peer review, because I don’t live in that world. It might be really bad; that’s for social psychologists to worry about. In fact, they are doing a lot of loud soul-searching right now: the system working as intended. The misdeeds of social psychology don’t really reflect on me or my field; we have our own norms. My only intersection with social psychology is occasionally supplying them with useful mediation methodology.
I expect gwern’s policy of being really angry on the internet to have either zero effect or a mildly negative effect on the problem.
The consequence of peer review for me, on the receiving end, is that people generally improve my paper (and are sometimes picky for silly reasons). The consequence on the giving side is that I reject shitty papers, and make good and marginal papers better. I don’t need to “do science” to know this; I can just compare my pre-peer-review and post-peer-review drafts, for instance. Or I can show you that the paper I rejected had an invalid theorem in it.
I am making the claim that people who want to burn the whole system to the ground need to realize that academia is very large, and has very different social norms in different corners. A unified criticism isn’t really possible. Egregious cases of peer review are not hard to find, but that’s neither here nor there.
On the subject of medical advice, “Scott and Scurvy” reminded me of this conversation.
Sure. I think al-Zahrawi had observational evidence, but I think that there are systematic defects in how humans collect data from observation, which makes observational judgments naturally suspect. That is, I’m happy to take “al-Zahrawi says X” as a good reason to promote X as a hypothesis worthy of testing, but I am more confident in reality’s entanglement with test results than with proposed hypotheses.
I very much agree that science is some combination of methodology and principles (gradually discovered by humans, and categorically unlike revealed knowledge) whose core goal is the creation of maps that describe the territory as closely and correctly as possible. (To be clear, science in this view is not ‘having that goal,’ but the actions and principles that actually lead to achieving it.)
I asked history.stackexchange; we’ll see if that produces anything useful. Asking doctors is also a good idea, but I don’t have as easy an in for that.
Not quite. What I had in mind as “not science” was treating your direct experience with peer review, and your evaluation of its intentions, as a scientific case for peer review.
Right now, sure, but we got onto this point because you thought not publishing with peer review means we can’t be sure MIRI isn’t wasting donor money, which makes sense primarily if we’re confident in peer review in MIRI’s field.
Eh. While I agree that being angry on the internet is unsightly, it’s not obvious to me that it’s ineffective at accomplishing useful goals.
“Whole system” seems unclear. It’s pretty obvious to me that gwern wants to kill a specific element for solid reasons, as evidenced by the following quotes:
Would you agree that some parts of the system should be burned to the ground?
Peer review seems like a form of costly signalling. If you pass peer review, it only demonstrates that you have the ability to pass peer review. On the other hand, if you don’t pass peer review, it signals that you don’t have even this ability. (If so much crap passes peer review, why doesn’t your research? Is it even worse than the usual crap?)
This is why I recommend treating “peer review” simply as a hoop you have to jump through; otherwise people will bother you about it endlessly. It removes the suspicion that your research is even worse than the stuff that already gets published.
Mostly by well-off people satisfying their personal curiosity. Other than that, by finding a rich and/or powerful patron and keeping him amused :-D
I agree that the cult of peer review is overblown. But does MIRI produce any relevant and falsifiable output at all?
I would answer differently than you: “Very inefficiently and with lots of errors”.
As opposed to quick, reliable present-day peer-reviewed science? ;-)
Well, not that this has changed...
What leads you to that conclusion? When do you think peer review began and how do you judge efficiency before and after?
I think this isn’t really cutting to the heart of things—which seems to be ‘reputation among intellectuals,’ which is related to ‘reputation among academia,’ which is related to ‘journal articles survive the peer review process.’ It seems to me that the peer review process as it exists now is a pretty terrible way of capturing reputation among intellectuals, and that we could do something considerably better with the technology we have now.
Anyone suggested a system based on blockchain yet? X-)
I imagine a system where new Sciencecoins could be mined by doing valid scientific research, but then they could be used as a usual cryptocurrency. That would also solve the problem of funding research. :D
I think there’s definitely not enough thought given to this, especially when they say one of the main constraints is getting interested researchers.
Isn’t it “cultish” to assume that an organization could do anything better than the high-status Academia? :P
Because many people seem to worry about publishing, I would probably treat it as another form of PR. PR is something that is not your main reason to exist, but that you do anyway, to survive socially. Maximizing academic article production seems to fit here: it is not MIRI’s goal, but it would help get MIRI accepted (or maybe not), and it would be good for advertising.
Therefore, AcademiaPR could be a separate department of MIRI, but it definitely should exist. It could probably be done by one person, whose job would be to maximize MIRI-related academic articles without making it too costly for the organization.
One possible method that didn’t require even five minutes of thinking: find smart university students who are interested in MIRI’s work but want to stay in academia. Invite them to MIRI’s workshops and make them familiar with the work MIRI is doing but doesn’t care to publish. Then offer them co-authorship: they take the ideas, polish them, and get them published in academic journals. MIRI gets publications, the students get a partially explored new topic to write about; win/win. Also known as “division of labor”.
Really? You can’t think of another reason to publish than PR?
I can.
But PR also plays a role here, and this is how to fix it relatively cheaply. And it would also provide feedback about what people outside of MIRI think about MIRI’s research.
I think the primary purpose of peer review isn’t PR, but sanity checking. Peer reviewed publications shouldn’t be a concession to outsiders, but the primary means of getting work done.
It seems that writing publishable papers isn’t easy.
Yes, the GP’s attitude is extremely myopic and dangerous.
One dictionary definition of academia is “the environment or community concerned with the pursuit of research, education, and scholarship.” By this definition MIRI is already part of academia. It’s just a separate academic island with tenuous links to the broader academic mainland.
MIRI is a research organization. If you maintain that it is outside of academia then you have to explain what exactly makes it different, and why it should be immune to the pressures of publishing.
Low-quality publications don’t get accepted and published. I know of no universities that would rather have a lot of third-rate publications than a small number of Nature publications. I’ll agree with you that things like impact factor aren’t good metrics but that’s somewhat missing the point here.