A recent study (available open access here – hooray for real publication!) reported that based on survey responses by American women of childbearing age from 1999 to 2006, 13% of those who were pregnant were identified as being smokers which, frankly, is pretty good news considering that 30% of the non-pregnant women smoked. The study also reported that a quarter of the pregnant smokers claimed that they did not smoke. Guess which part of that the U.S. government researchers and the media chose to emphasize?
Yes, you guessed right. The headlines did not leave much room for other interpretations of the story. Or sympathy.
“Many pregnant smokers lie about their habit, study finds”
“Pregnant women deny smoking habit”
“Smoking for Two and Lying About It”
The study was published in December and most of the news stories ran during the holidays, but it flared up this week with a New York Times story (the link above). It came to my attention when a health science education group that normally focuses on trying to debunk slipshod and politicized science reported on the claims without a hint of doubt or criticism. Everyone’s conclusion is that pregnant smokers were ashamed and so lied. The funny thing, though, is that the study results do not make a very good case for the claim.
We can start by observing that undoubtedly some pregnant smokers will claim to be nonsmokers when asked by family members, their health care providers, survey takers, or most anyone else who is likely to follow up with criticism. The same is true for non-pregnant smokers. There is no news there. So how good is the evidence that lying on a survey is common? Not so good.
The survey included a collection of biological measures, including checking blood for cotinine, a chemical that is produced when your body metabolizes nicotine and thus commonly used as a biological test for nicotine use. This is how they can accuse someone of lying: 92 pregnant women in the sample identified themselves as smokers, but another 33 had high values for cotinine. The study excluded anyone who reported using non-cigarette sources of nicotine, so the researchers concluded those 33 must have all been guilty of “nondisclosure” of their smoking, rephrased by the press as “lying”.
One easy alternative explanation is that some of them had failed to report use of other nicotine products, most likely pharmaceutical nicotine (gum, patches, etc.) since few American women use smokeless tobacco or smoke cigars, and this predated the popularizing of e-cigarettes. Someone who did not smoke but used nicotine gum would have cotinine in her blood, and if she did not report her use of the gum she would still have been included in the analysis. Would someone have failed to report use of socially acceptable pharma products? Perhaps – the messages to avoid using pharmaceutical nicotine while pregnant are almost as aggressive as those to avoid smoking (though smoking is undoubtedly much more harmful). Imagine a young woman trying to quit smoking upon learning she was pregnant, intending to use nicotine gum briefly to help and not willing to admit that she had not yet finished that process.
But there are less obvious but more important alternative explanations. Intentional misrepresentation – nondisclosure, lying, etc. – of smoking or some other nicotine source is not necessary. Errors happen all the time, in people’s survey answers and in how the data is recorded and handled. Questions about smoking are notoriously ambiguous. When asked “Do you now smoke cigarettes?” (the actual question used in this survey), many people who smoke some but not enough to consider themselves to be a smoker answer “no”, believing that to be the honest answer to the question.
But self-reported survey answers are not the only possible source of measurement error. The measurement and coding of cotinine status can go wrong in various ways. Indeed, more than 5% of those who self-identified as smokers had negative tests based on cotinine. As the researchers noted, non-smokers are extremely unlikely to intentionally claim to be smokers. But the researchers simply used this observation to justify calling someone a smoker if they self-reported such, even if their cotinine test was negative. They apparently did not understand the inconvenient implication of this observation: About 5% of their data about smoking status was wrong, one way or another.
[Interlude: This is an important thing for readers of health news to know in itself. Data is often wrong. 5% error is pretty bad, but probably not all that unusual. Often the errors sort of cancel out. For reasons I will probably explain later in this series, more often than not errors make results look less impressive, but sometimes they exaggerate them (and you can probably guess that the exaggerated results are the ones more likely to be published and reported).]
But having 5% bad data could not explain 25% of the women apparently lying, right? That depends on 5% of what. There were over 800 nonsmokers among the pregnant women. What if 825 nonsmokers correctly identified their nonsmoking status, but just 4% of their cotinine tests were incorrect? By the definitions used by the researchers, this would turn 33 nonsmokers into supposed smokers who were lying about it. Basically the researchers designed their analysis (perhaps intentionally, though more likely because they just did not know any better) so that errors in the data supported their claim: If an error in measurement creates a self-reported smoker with low cotinine, they just declare her to be a smoker and she disappears in the results; if an error in measurement creates a self-reported nonsmoker with high cotinine, she is declared to be a liar and becomes the result of the study.
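To make that asymmetry concrete, here is a minimal sketch in Python. The `classify` function is my paraphrase of the rule as described above, not the researchers' actual code, and the 825 nonsmokers and 4% false positive rate are the hypothetical figures from the paragraph, not numbers reported in the study.

```python
def classify(self_reported_smoker: bool, high_cotinine: bool) -> str:
    """Paraphrase of the classification rule described in the post (hypothetical)."""
    if self_reported_smoker:
        # A low cotinine reading is ignored: she counts as a smoker either way,
        # so this kind of measurement error vanishes from the results.
        return "smoker"
    if high_cotinine:
        # A high cotinine reading on a self-reported nonsmoker is treated as
        # nondisclosure, so this kind of error becomes the headline finding.
        return "nondisclosing smoker"
    return "nonsmoker"


# Hypothetical scenario: 825 honest nonsmokers, 4% of whose tests wrongly read high.
true_nonsmokers = 825
false_positive_rate = 0.04
false_positives = round(true_nonsmokers * false_positive_rate)

labels = (
    [classify(False, True)] * false_positives
    + [classify(False, False)] * (true_nonsmokers - false_positives)
)
accused = labels.count("nondisclosing smoker")
print(accused)  # 33
```

Nobody in this toy population misreported anything, yet 33 of them come out labeled as smokers who hid it.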
This becomes even more interesting when we look at the non-pregnant smokers. The researchers and reporters really emphasized how the pregnant women were much more likely to “lie”, presumably because they felt more guilty than the non-pregnant ones. Only about 10% of the non-pregnant smokers were accused of lying about their status: 782 declared themselves to be smokers and another 106 were declared to be smokers from their cotinine test. There were over 2300 nonsmokers in the non-pregnant group, so if we apply that same 4% error rate to the nonsmokers we get just under 100 who are falsely accused of lying. Thus, all of the “lies” can be explained by a very plausible rate of error in the data. Moreover, the entire “higher rate of lying” among pregnant women can be explained by the fact that the percentage of actual smokers was lower, so the same error rate among the nonsmokers would be a larger percentage of the pregnant smokers as compared to the non-pregnant smokers.
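For anyone who wants to check the arithmetic, here is a rough sketch. The self-reported smoker counts (92 and 782) are the ones quoted above; the nonsmoker counts of 825 and 2300 are round stand-ins for "over 800" and "over 2300", and the 4% false positive rate is my assumption, so treat the output as illustrative rather than a reanalysis of the study. The function name is made up for this example.

```python
def apparent_nondisclosure(self_reported_smokers, nonsmokers, false_positive_rate):
    """Return (falsely flagged nonsmokers, their share of all 'smokers').

    Assumes every respondent answered honestly and the only error is a
    cotinine test that wrongly reads high for some nonsmokers.
    """
    falsely_flagged = nonsmokers * false_positive_rate
    share = falsely_flagged / (self_reported_smokers + falsely_flagged)
    return falsely_flagged, share


RATE = 0.04  # assumed false positive rate, applied identically to both groups

groups = {
    "pregnant": (92, 825),       # (self-reported smokers, assumed nonsmokers)
    "non-pregnant": (782, 2300),
}

for name, (smokers, nonsmokers) in groups.items():
    flagged, share = apparent_nondisclosure(smokers, nonsmokers, RATE)
    print(f"{name}: {flagged:.0f} flagged, {share:.1%} of apparent smokers")
# pregnant: 33 flagged, 26.4% of apparent smokers
# non-pregnant: 92 flagged, 10.5% of apparent smokers
```

The same error rate gives roughly a quarter "lying" in one group and roughly a tenth in the other, purely because the ratio of nonsmokers to smokers differs.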
Cool, huh? I think so. This may be my new favorite teaching example.
This same phenomenon actually came up in yesterday’s post about autism diagnoses, though I did not mention it. I noted in passing that a theoretical increase in the tendency of certain parents (those with a short interval between their first two children) to worry that their second child is autistic could explain a much higher rate of diagnosed autism in those kids. How could a small tendency have such a large effect? Because well over 99% of children are not diagnosed autistic, so if some factor caused a tiny increase in the probability of diagnosing a child who would have otherwise not been diagnosed, it could be a large number compared to the baseline rate of diagnosis. (E.g., if 1% would normally be diagnosed, but something caused a mere 1% of the others – who would not have been diagnosed – to be diagnosed, it would double the apparent risk.)
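The parenthetical example works out like this, using only the illustrative 1% figures from the text (they are not real diagnosis rates, and the variable names are mine):

```python
baseline_rate = 0.01          # share who would be diagnosed anyway (illustrative)
extra_rate_among_rest = 0.01  # share of the remaining 99% who get diagnosed too

apparent_rate = baseline_rate + (1 - baseline_rate) * extra_rate_among_rest
risk_ratio = apparent_rate / baseline_rate

print(f"apparent rate: {apparent_rate:.4f}")  # apparent rate: 0.0199
print(f"risk ratio: {risk_ratio:.2f}")        # risk ratio: 1.99
```

A one-percentage-point nudge applied to the huge undiagnosed majority nearly doubles the measured rate, which is the same mechanism as the cotinine false positives.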
Of course just because a 4% false positive rate for the cotinine test could explain all the apparent misreporting and the entire headline-generating result does not mean that it actually does explain it all. Undoubtedly a few smokers really did say they were nonsmokers, and I would not be surprised if this were a bit higher for those who were pregnant. But there is no doubt that errors in the data explain some of the claimed result, and it seems quite plausible that those errors explain most of it. After all, let’s stop and think for a few seconds more than those reporting the story did: Why, exactly, would someone volunteer to do an anonymous survey and then choose to lie on it? It is not as if the data collectors are going to nag them to stop smoking. Too bad the researchers did not think about that and look for an alternative explanation for their results (yes, perhaps they did figure it out and just hid it, but experience tells me that they just did not understand what they were doing).
So, in conclusion, apparently some people find it to be smugly satisfying to label as liars young women who are feeling bad about not taking better care of their babies, but personally I find working out the math to be much more so.