Unhealthful News 34 – faith in peer review would be charming if it were not so harmful

I am stuck on a slow train without wifi.  (Fast train with wifi was delayed to the point it was cancelled, which seems to happen only slightly less than half the time.  I got the bot call telling me that, long after I had already figured that out and rebooked and a few minutes after I was already on the other train.  It is fashionable to say that Amtrak (the U.S. passenger train system) could learn something from the Europeans, but honestly I think it is such a mess it could learn something from the airlines.)  Anyway, the reason I mention that is that I am left blogging about whatever is in my head or that I happen to have already open on my computer, rather than finding the most interesting health news story of the day.  Fortunately there was a confluence of those and thus an excuse to declare what I had on my mind a response to something that appeared today.

I am on my way back from giving testimony at a local government meeting about the health effects of wind turbines.  I came away from that with a strong feeling that I gave a New York county legislator far too hard a time the other day, and that I should offer an apology that I assume she will never read (fortunately this is because I assume she will never read the original post either).  You might recall that I was critical of the details of the proposed policy to regulate energy drinks that have lots of largely un-researched stimulants.  Speaking as someone who I know loyal readers recognize as a defender of autonomous choices about our health, I really think those stimulants need more regulations.  So if the national authorities are dropping the ball on it, maybe local well-meaning officials should do the best they can.  (Unfortunately we still run into the problem I cited in that post, of very unwise local officials doing very harmful things like banning e-cigarettes, so my forgiveness for this boldness knows bounds.)

I have participated in local and state government processes before, but never one where local officials (i.e., the undoubtedly above-average but still fairly ordinary folks from the community who are willing to give their time to be officials) were trying to sort out a scientific issue on their own.  But local authority over wind turbine siting is just too much to imagine.  I just watched a board of volunteer part-time local officials trying to sort through the complicated science about wind turbines, something that I have spent most of a year learning about and I am still learning.  But what can we do?  If officials like this don’t take the lead, who will?

So, anyway, I offer my commendations to anyone who tries to work out a better way (than doing nothing) to regulate energy drinks and wind turbines, and my criticism of the system that makes them do it.

Meanwhile, the confluence with what was open on my computer was Snowdon writing today:

There are still some dear, naive souls who believe that the peer-review process weeds out blatant falsehoods like this. Bless ’em.

That was in the context of an obviously absurd “study” (actually just totally made up numbers), which claimed to show the absolutely absurd claim that promoting tobacco harm reduction would increase smoking prevalence.  This random bit of nonsense now shares the imprimatur of “peer reviewed journal article” with the discovery of the DNA double helix, the theory of relativity, the first studies that showed that smoking causes cancer, and the evidence that the MMR vaccine causes autism.  Oh wait, except that the first three of those were published in forms other than peer reviewed journal articles, only the last was peer reviewed – and many years later it was declared to be fraudulent.

Here is a brief tutorial on peer review for the uninitiated:

You write a paper as best you can and send it to a journal.  Two reviewers who are probably among the top 10,000 people who might comment on the paper (but are likely not among the top 20) are asked to comment on it. 

Often they send back inane negative comments (often indicating that the reviewers did not bother to try to understand the paper, though sometimes meaning that the conclusions contradict their work so they want to keep it censored as long as possible) and it is rejected.

About half the time you get a “revise and resubmit” which generally means that you have to tweak a few points that are not central to the paper (like the tone of the introduction), and maybe report a few extra numbers, and then they publish it.  Sometimes the suggestions actually make the paper a tiny bit better by correcting wording problems and the like.

Does peer review make the paper better?  Well frequently it shows where a paper needs to be clarified so that people who are reading it in five minutes will not get tripped up.  As for important corrections, only once, in all the papers I ever submitted, did I get a review that identified a fundamental error in the paper and helped me correct it.  (Does that mean I have only ever submitted one paper with a fundamental error?  You be the judge :-) 

Do the reviewers ever correct errors in the data or data collection?  They cannot – they never even see the data or learn what the data collection methods were.  Do they correct errors in calculation or choices of statistical analysis?  They cannot.  They never even know what calculations were done or what statistics were considered.  Think about what you read when you see the final published paper.  That is all the reviewers and editors ever see too.  (Note I have always tried to go the extra mile when submitting papers, to make this system work by posting the data somewhere and offering to show someone the details of any analytic method that is not fully explained.  This behavior is rare to the point that I cannot name anyone else, offhand, who does it.)

Does this mean that if you just make up the data, peer review will almost certainly fail to detect the subterfuge?  Correct.

Does this mean that if you cherrypick your statistical analyses to exaggerate your results, that peer review will not be able to detect it?  Correct.

So what does a typical peer review report do?  It quibbles about the introduction and conclusions (i.e., the sections that no sensible reader ever pays attention to) and often censors research because the reviewer does not like the conclusions, does not understand it, or just does not find it interesting.  But if the paper is rejected, it simply forces the authors to publish it in another journal, to keep trying until they have the good luck to draw the reviewers who accept it.  It pretty much always happens if you keep at it, though in many cases someone does not keep at it long enough and the data is lost from the scientific record even if it might have been useful for something.

So, does it matter that most of the evidence that wind turbines cause health effects is based on reports of adverse events made by individuals, and so the evidence is not peer reviewed?  Of course not.  What possible improvement (other than grammar corrections) could come from peer review?  And besides, if I am reporting the evidence, that means that it is peer reviewed – I reviewed it!  Exactly what magic do people think that an official journal reviewer offers beyond what someone like me does?

But wait! I am told, frustratingly often.  Anyone can just publish anything they want on the internet, so how are we supposed to believe it?  There is something to that.  If I wanted to know something about, say, the moral character of whatever young actress is in the headlines right now, I would do an internet search and end up with about any answer I wanted, and not be able to make sense of them because I have no expertise in the matter that would allow me to sort them out.  But when someone with focused expertise and a deep interest in the question at hand reviews a collection of information, it offers such sorting.  We can only wish that the institutionalized journal peer review provided as much vouching for a claim’s legitimacy.

12 responses to “Unhealthful News 34 – faith in peer review would be charming if it were not so harmful”

  1. Carl:
    Could you describe what you do when you are asked to review an article? Do you ask for the data? Do you ask for the code? Do authors meet your requests?

  2. Bernie, Good question. That would have been a good bit of info to include in the post.

    The answers to “what I do” and “what the typical reviewer does” are somewhat different. (Note: I am generalizing about reviews for medical and public health journals here. Journals that take a proper social science approach do better — I will post about that today or soon.)

    Reviewers (based on my experience as an author and journal editor, and other observations) pretty much never ask for data or programming code, or for any useful clarifications of it. They might ask some picky technical questions about it (you get the feeling that someone taught them “always do X” and that stuck in their head as something to bug everyone about throughout their career). But they seldom ask anything that gets to the heart of the matter. Most reviewers comment at some length about interpretations of the results without ever suggesting an alternative analysis that would actually inform the points they are making. Instead, when they suggest something else that might have been done, it is usually useless (“you should have collected data on…”).

    When I do reviews, I actually seldom ask for those either, because I usually start with other questions or requests that are never answered or met. Almost always when I am asked to review a research study report type paper now, I write comments about what more I need to know before I can evaluate the paper. Basically, I restrict my comments to the “methods” section, perhaps indicating that I have additional thoughts about other parts but will only write those up after there is enough reported that I know what really was done. (I.e., knowing enough so that someone could replicate the study, more or less, and could definitely replicate the statistical analysis if they had the data.) I suspect that readers who are not researchers read methods sections (if they even read them) and think that somehow in those arcane words is a complete description of what the researchers did, but the truth is far from that. I usually insist on a better explanation of what was done (including what was tried and not reported) before I try to assess anything — no one can make much sense of what is done without that.

    This includes asking such things as which of the covariates they controlled for (confounders) had how much effect, which is critical to understanding the results, and is always reported in some other sciences, but is almost always hidden in epidemiology reports. (I usually suggest that if the authors are worried about word count, they can get rid of the 500 words they use to say in prose what already appears in the tables, which are much easier to read.) But, no, I seldom ask for the data or model, unless it is something I consider *really* important and am very concerned about the analysis. Honestly, it would just be way too much work to deal with that.

    …continued…

  3. …continuing…

    If I am an outside reviewer, my requests typically go nowhere, because the editors do not understand the point of what I am asking for or are not willing to push the authors to fix fundamental problems of missing information. Sometimes they just use it as an excuse to reject the paper, if that is what they wanted to do anyway, and sometimes just ignore it. If I am the editor I usually have rather more leverage, so sometimes these corrections get done. But my experience is that if I or another editor insisted on adding these useful bits of information, the author just withdrew the submission and took it to another journal where they could get away without reporting sufficient information. This is almost always the reaction from people who publish a lot in the field, though younger or outsider researchers often really appreciate my efforts to help them make the report better rather than to just declare it accepted or rejected. And on your specific question about the authors sharing the data and model, they never have in any case where I asked for it (which, as I said, are cases where I was very interested in them getting it right and suspected that they were not doing so). Presumably anyone who knows their data and model will not stand up to scrutiny will just refuse, just as they refuse to show more analyses: they know they can just publish somewhere else without anyone demanding they come clean.

    Needless to say, my unorthodox approach of insisting on major shifts in what is presented saves me the hassle of getting invited to review by typical journals and most editors. Since my approach means that reviews take me serious time, unlike more typical cursory reviews, I am not inclined to do many of them except for journals I actively support. They just want to accept (with some simple changes) or reject. (There are exceptions, I should emphasize. But the negative stereotype seems to be the majority in anything public-health oriented.)

    A third category of reviewers is the minority who really know what they are doing (I think of them as the 1/100th of all people who review health submissions who are qualified to teach a third semester methods class) but do not take the approach I do. Typically they take guesses about the missing methods information and offer substantive takes on what is written. Actually, probably a good model of what they address/question/criticize is the majority of my posts in this series. For a few posts I take a lot of time and really try to understand the details of an analysis that I might not understand just treating it as if I were a regular reader, as I would for reviewing a paper. For most posts, though, I do not have that kind of time so I pick out an interesting feature or two and use it to remark on concerns that I am already familiar with and recognize an example of. Good reviewers who have not reached the point of refusing most reviewer requests have to do something like that.

  4. Excellent presentation on the process Carl! Did you cover the practice of authors recommending their friends to give peer reviews though? I believe I've seen a few papers on secondary smoke and such things where the reviewers were revealed and the list read like a roster of the “old boys' club.”

    And when Dave K. and I submitted our study on post-ban heart attacks five years ago (a study very similar in size, scope, and conclusions to the recent RAND/NBER/Stanford study that found no post-ban benefits) to the British Medical Journal it was reviewed by only one reviewer, not two. (See http://www.acsh.org/factsfears/newsID.990/news_detail.asp for the full story on that adventure. Our full study as submitted to the BMJ before reviewer suggested revisions for other journals can be seen at: http://www.scribd.com/doc/9679507/bmjmanuscript )

    Thanks again Carl. The review process has been shrouded in way too much mystery for far too long and is greatly overrated as a sound gatekeeper of quality research.

    Michael J. McFadden
    Author of “Dissecting Antismokers' Brains”

  5. Heh… one additional note that I think you might appreciate: The BMJ reviewer was open about his identity and gave a very helpful review. When we eventually submitted to Tobacco Control we got hit with three “attack reviewers” who were not helpful at all AND who also insisted on hiding their identities!

    – MJM

  6. Michael,
    Thanks for the example. It is pretty clear that the Tobacco Control “review” was a classic case of reviewers trying to censor something they do not like, the antithesis of scientific publishing. The BMJ experience sounds like part of the inevitable problem with paper journals, that they screen down to a particular quantity to fit between their covers, so reject things that should be published. Ideally you can then just publish elsewhere, but you seemed to run into a problem I have also experienced, which is that the highest quality journals (in terms of having high-quality editors and reviewers who know what is good and know how to write helpful reviews) decide something does not make their cut but the lower quality journals tend to have reviewers who are not competent to judge good work.

    Oh, and to clarify: It might sound like I am buying into the “published in a prestigious journal” myth that laypeople sometimes think is a judge of the quality of a paper. It is true that there is some correlation between quality and being published in a “better” journal, but it is pretty minimal. Spectacular results (which are more likely to be wrong than less dramatic results) are more likely to get into big name journals than are more solid studies. And sometimes the more “prestigious” journals (American Journal of Public Health, American Journal of Epidemiology, New England Journal of Medicine, PLoS) are reliable places to find the worst junk science. But it is true that you can submit a good paper that is over the heads of most editors and reviewers to, say, International Journal of Epidemiology, where the editors and reviewers understand it. And if it gets published you get lots of credit. But if they decide it does not make the cut to fit in their pages, even if it is useful and right, then you have a problem. If it is politically incorrect (like yours) it will get censored from topic-matter journals, but what is right about it may never be understood by most nonspecialist lower quality journals.

  7. I agree on the “journal quality” thing being over-emphasized. The critical thing is simply to be able to claim “published in a peer-reviewed journal.”

    The situation Dave and I were in was a bit unique in that we were trying to put a nasty genie (Helena) back in its bottle before it could spread to infect others with its lie. In order to do that we had to have a strong refutation of it printed in the very journal that published it: THAT would have been newsworthy.

    By the time we went through a greatly overextended bottling up of our own work by the BMJ's editor and such things as a “mini” hanging committee, we were already down the line toward seeing other studies popping up “confirming” Helena. By that point, particularly after our following failed attempts in Circulation and Tobacco Control, our study, an unfunded effort by two unlettered researchers unconnected to any prestigious, or even faintly academic, institution, wouldn't have had a chance at a ghost of a shadow of a ripple in the media and would have been totally dismissed out of hand as a quirky, contrary, “suspicious” study by anyone in the field. We published our work on the web and wrote the ACSH article so at least it was “out there” and can now feel that it has been corroborated and confirmed… thereby exposing the BMJ's abdication of its responsibilities after its publication of Helena.

    Were we correct in dropping our efforts? Maybe, maybe not, but you can Google the news on the far more prestigious RAND/NBER/Stanford study that reached the same conclusions we did and you'll find nary a hint of headlines outside of a few bloggers. The “enemy” seems to have secured control of the media to an extent that I'm sure surprised even those researchers.

    What stood out for us in the BMJ's dismissal was the absolute blatant transparency of what they did. They obviously felt their control of the process was so absolute that they needn't even bother trying to come up with a reasonable excuse for rejection, based perhaps on our qualifications, or our methodology, or minor flaws I'm sure they could have found somewhere in our analysis; but instead chose simply to say it was being rejected primarily because it “added nothing new to what is already known…” despite the fact that its conclusions were in diametrical and complete contradiction to “what was already known.”

    Heh… as for the overall value of peer-review, while it's a cheap shot, I've always appreciated the fact that after exhaustive peer review including a six month on-line period for final reworking, Stanton Glantz's heart attacks meta-analysis still led off with his concern about “pubic” smoking. See the first sentence of http://circ.ahajournals.org/cgi/content/short/120/14/1373

    :>
    MJM

  8. Michael,
    That is another good point. For many analyses, particularly those that are specifically aimed at debunking the claims in a published bad analysis, there is only one venue that makes a lot of sense. I too have submitted both papers and letters to journals pointing out that an article published in that journal was fundamentally flawed, or at least suspect, only to have them rejected. So then what? Why would anyone else publish a note like that? So then the comment pretty much dies.

    As in your case (though yours is a particularly impressive case of this, which anyone who is interested in what I write should read about if they have not already done so), I am talking about technical analyses that bring up very serious scientific questions, not some “I disagree” opinion letter that might appear in a newspaper. These responses are an absolutely critical part of scientific inquiry, but in health science they are almost always censored. You found an outlet for yours and I usually manage to work mine into something I write, but they are in the wrong place or delayed until the wrong time.

    I might go so far as to say that this is more the problem than is peer review not screening out bad analysis. It is a big problem that outsiders think (and dishonest politically-motivated insiders pretend) that peer review is a stamp of accuracy, but that could be overcome with education. It is not such a big problem that peer review is not a stamp of accuracy. What is a disaster is that the system, in health science, censors most criticism of what is peer reviewed and published out of the communication streams (journals) that do the original publishing. The current peer review system cannot do any better than it is doing — it does not have the capacity. Better exchanges of criticism, however, could.

    Of course technology has changed the problem, and may solve it. Now you and I, and everyone, can publish our criticisms of journal articles. And many readers, including all real scientists, have become sophisticated enough to realize that a lot (on some topics, the majority) of publishing takes place outside of journals. But without gatekeeping, it becomes difficult for the reader to figure out which of these (like your paper) are important and, frankly, better than what is in the journals, and which are no better than the average random letter to the Daily Shopper or comment at a town hall meeting.

    I will continue to pursue the topic in this series.

  9. Any chance you could provide citations for “except that the first three of those were published in forms other than peer reviewed journal articles, only the last was peer reviewed” ? It'd be a useful teaching tool.

  10. The first report of the double helix was in a letter to the editor — in a journal (Nature, 1953), and duly reviewed by an editor, but not the “peer review process”. Einstein first encountered the peer review we talk about today fairly late in his career and famously objected to it.

    I was a bit glib about the smoking reports, and Snowdon is really the authority on this, so I will see if he can jump in. I was referring to what were arguably the first population-based epidemiologic studies on the topic, released in Germany during the war years, and thus not part of the circle of peer reviewed journals. But offhand I do not recall what kind of review process they might have gone through within their narrow confines; I was making a bit of an assumption because it was a good thematic example. Can anyone provide something more concrete?

  11. The main German text identifying smoking as a cause of lung cancer was Fritz Lickint's Tobacco and the Organism, published in 1939. This was a book running to over 1,000 pages so I doubt it was peer-reviewed. During the war, several German studies found the same thing, notably:

    Muller FH. Tabakmissbrauch und Lungencarcinom. Z Krebsforsch 1939;49:57–85.

    Schairer E, Schoniger E. Lungenkrebs und Tabakverbrauch. Z Krebsforsch 1943;54:261–9.

    I assume this journal was peer-reviewed at the time, but I can't say for sure. The BMJ and JAMA published the Doll/Hill and Wynder studies in 1950.

  12. Thanks, Chris.
    Ok, good — good to see that I was technically right, at the very least. I will have to check on the specific point that I was trying to make, about the journals. In 1939, what we now call peer review happened in science journals but was not universal, so I will check with some people who know about the German tradition, and if I find anything, post an update.

    In any case, the German publications made during the war did not have the “part of the insider club” status that is a lot of what people mean by peer reviewed publication. This contrasts with Watson and Crick's letter, which is often referred to as an article because they and the journal were thoroughly part of the club.

    The point, of course, is that those groundbreaking bits of work (and, by extension, many not so groundbreaking contributions), were no less important because of the details of how they were published. The review that showed them to be correct came after they were published. Having a couple of your cronies sign off on something before it is published provides no such assurance.
