I decided to follow up on yesterday’s comments about the new study that used sneaky statistics to try to blame on snus the weight gain typical of men’s lifecycle, after talking to Brad Rodu about some other interesting aspects of the study. We agreed that as Karolinska anti-snus broadsides go, it is not very important, but there are a few more interesting lessons about epidemiology and epidemiology-based propaganda to be found in it. (Notes: I am not going to repeat the background that I covered yesterday, so this post really requires reading that one. A couple of these points are Brad’s, but I am not going to attribute them specifically out of concern for accidentally misrepresenting him; he is invited to claim credit or clarify in the comments if so desired.)
Yesterday I commented on how the results will likely be misused should the anti-snus propagandists decide to adopt them: they will likely be interpreted as meaning that snus will cause about 1/3 of those who use it to gain weight. This is how the game will likely play out: The study estimated an odds ratio for snus users in the range of 1.3 for the probability of gaining 5% of body weight during the study period. This OR is typically sloppily converted to prose as “snus users have a 30% greater chance”. But that phrase (in quotes, because it is sloppy) is then often interpreted as meaning that 30% of the population who would not otherwise have gained weight will do so (which would properly be described as a 30 percentage point increase rather than a 30% increase, though almost no one gets that right). And presto-chango, thanks to the power of using sloppy language in a scientific context (like “addiction”), nonsense is created.
Just how wrong would that claim be? Well, setting aside the errors in the study and just pretending that the results are correct: of the nonusers, 790 of 3877 had the 5% weight gain, or about 20%. So if that is increased by about 1/3, we are talking about 6 or 7% of the population that gains weight because of snus. And, again, that is based on pretending that the analysis was correct (and once it is clear how small that number is, it also becomes clear how easy it would be for it all to be the result of study errors, intentional or unintentional).
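For anyone who wants to check the back-of-envelope arithmetic, here is a minimal sketch. The nonuser counts are the study’s; the “30% greater chance” is treated loosely as a relative increase in risk:

```python
# Back-of-envelope arithmetic: a ~30% *relative* increase on a ~20%
# baseline risk is only ~6 percentage points in *absolute* terms.
baseline_risk = 790 / 3877       # nonusers who gained 5%+ of body weight (~20.4%)
relative_increase = 0.30         # the "30% greater chance" reading of OR ~1.3

snus_risk = baseline_risk * (1 + relative_increase)   # ~26.5%
risk_difference = snus_risk - baseline_risk           # ~6 percentage points

print(f"baseline risk:   {baseline_risk:.1%}")
print(f"snus-user risk:  {snus_risk:.1%}")
print(f"risk difference: {risk_difference:.1%}")
```

That last number, not the ratio, is the share of the population whose weight gain the study attributes to snus.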
An interesting question about epidemiology in general is why they do not just report the 7% or equivalent numbers. Why did they report this obscure statistic called “odds ratio” that facilitates sloppy prose and misinterpretation? You might be tempted to conclude that it is to make small numbers seem big, and there are no doubt those who are happy about that – an OR of 3 sounds a lot more impressive than “your lifetime chance of getting this disease increases from 1-in-1000 to 3-in-1000”. But the answer is actually both less political and more pathetic: The OR is the easiest number to get out of the statistical calculation for a logistic regression, which you may not know is the never-thought-through assumed relationship between exposure and disease in almost every epidemiology study you ever see. Or, put another way, it is the number that comes out of the software used by most people doing epidemiology, and that software is a black box to them, such that they just copy down the results. People in other sciences are taught the math behind the calculations they do, such that they could write the software rather than just blindly using it, but that is rarely the case in epidemiology (particularly including psych and clinical research).
Not that most epidemiology researchers could not figure out the risk difference that I reported, of course. That was easy. (The risk for nonusers was 20%, the risk for snus users was about 27%, and so the risk difference was 7%.) They might need some help to calculate the confidence interval, but at least they could report a meaningful point estimate. No one cares about the odds ratio. One of the first lessons in my courses when I was teaching how to make decisions based on epidemiologic evidence was how to take the useless statistics that are typically reported and convert them into something people actually care about. I have made this point before in Unhealthful News, but it is worth repeating. No one cares about ratios, and most people do not really even understand what they mean. If you read a lot of health news, you may think you know what it means when you hear there was an odds ratio (sometimes called by the generic term “relative risk”) of 2.5, but do you really?
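All three measures fall out of the same 2x2 table with a few lines of arithmetic. In this sketch the nonuser counts are the study’s, while the snus-user counts are hypothetical, chosen only to reproduce the roughly 27% risk quoted above:

```python
# Computing the three common effect measures from a 2x2 table.
# Nonuser counts (790 of 3877) are from the study; the snus-user counts
# (135 of 500) are hypothetical, picked only to give the ~27% risk.
def risk_measures(cases_exp, n_exp, cases_unexp, n_unexp):
    """Return (risk difference, risk ratio, odds ratio)."""
    p1 = cases_exp / n_exp           # risk among the exposed
    p0 = cases_unexp / n_unexp       # risk among the unexposed
    rd = p1 - p0                     # what people actually care about
    rr = p1 / p0
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    return rd, rr, odds_ratio

rd, rr, odds_ratio = risk_measures(135, 500, 790, 3877)
print(f"risk difference: {rd:.1%}  risk ratio: {rr:.2f}  odds ratio: {odds_ratio:.2f}")
```

Note that with these numbers the odds ratio (about 1.45) already overstates the risk ratio (about 1.33), because the outcome here is not rare.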
As an aside, here is a good example on that point for those who care about snus and THR: Anti-THR activists like to try to claim that other sources of tobacco are lower risk than snus. There is no evidence to support that claim, but imagine it was true and some other smokeless products were only half as risky – say 0.5% as risky as smoking versus 1% as risky. These same activists like to insist that there is no difference in risk among varieties of cigarettes, which is obviously not true. It is a safe bet that cigarette risk varies by 5%. In ratio terms, the hypothetical factor of two reduction for smokeless seems a lot bigger than the mere reduction to .95 of what it would have been. But in absolute terms – what really matters in terms of whether someone gets a disease – this difference among cigarette varieties is ten times as great as any hypothetical difference among smokeless products. While the exact numbers are unknown, the magnitudes are probably about right, which means that there is no doubt that encouraging a smoker to switch to the lower risk cigarettes (whichever ones they might be) makes a lot more difference than encouraging a smokeless user to switch to the lower risk smokeless products (whichever ones they might be — again we do not actually know).
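The aside can be made concrete with a few lines, using the post’s illustrative numbers (all risks expressed as a fraction of the risk of smoking; none of them is a measured value):

```python
# The post's hypothetical, in absolute terms. Every risk is a fraction of
# the risk of smoking; the specific numbers are illustrative, not measured.
smoking = 1.00
smokeless_a = 0.01      # "1% as risky as smoking"
smokeless_b = 0.005     # hypothetically half that: an impressive-sounding 2x ratio
cig_low = 0.95          # lower-risk cigarettes at .95 of the usual risk

smokeless_gap = smokeless_a - smokeless_b   # 0.5 points of smoking-risk saved
cigarette_gap = smoking - cig_low           # 5 points of smoking-risk saved

print(f"cigarette gap is {cigarette_gap / smokeless_gap:.0f}x the smokeless gap")
```

The factor-of-two ratio looks dramatic, but the absolute risk avoided by switching cigarettes is ten times larger.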
One last point on ORs: Most people intuitively interpret this as a risk ratio, the ratio of the risk for one group (the probability that the event occurred) divided by the other, rather than the ratio of the odds for one group (the probability the event occurred divided by the probability it did not occur) divided by the other. Odds defy intuition for anyone who does not gamble (gambles are typically described in terms of odds rather than probability/risk), though there are some technical reasons why the odds ratio is often a legitimately better measure. Fortunately it does not matter much, because the two are about the same if the event in question is fairly rare. I mixed them together in my back-of-the-envelope calculations above.
Getting back to the study, there are a few other oddities about it. For one thing, there are a lot more people in the “Other” category than there were in any of the use categories that were cleanly defined and analyzed. This category includes anyone who had any history of using both snus and cigarettes; presumably it also includes anyone who ever used tobacco in any other form. While there is no evidence that simply throwing out this group really matters, it might. Because it is so much larger than the exclusive snus groups, the authors should have told us what happens when people in borderline groups are included, such as counting those who formerly smoked a bit but are now exclusive snus users as snus users. Chances are that this would not change much, but it would be interesting to know, as a reality check.
Definitions of exposure categories are far more arbitrary than most people who read only the abstracts and press releases realize. For example, the methods report, “Cessation less than six months prior to baseline was regarded as current use.” Again, probably not a big deal, but, um, why? Could it be that doing this made the results come out better? Rather more interesting: why is 5% weight gain the measure of interest? Why not a different percentage, or an absolute change? Better still, why did they not compare the weight gain of the study subjects to the known average weight gain of those in the given age range (see yesterday’s post)? There is no explanation for why this arbitrary measure was chosen among the infinite possible choices, which should always be a cause for suspicion. It would have been trivial for them to report how the results varied with a 4% or 8% cutoff, but they did not tell us. Again, there might be no major differences, but it might be that most any other measure would have produced results they liked less – it would not be the first time this was ever done in the anti-smokeless-tobacco literature.
Finally, and perhaps most important, what is with the list of covariates they included? Alcohol use, fruit and berry consumption, and how often someone eats breakfast, but no other dietary measures – huh? As usual, there is no explanation for this rather odd choice. Also, why were these, along with the physical exercise measure, included only at baseline and not as changes at follow-up? They looked at tobacco use at both baseline and follow-up, so why the arbitrary choice to ignore changes in exercise? Perhaps such a change explained the differences in the groups, but we will never know. It also could be that if there really is an effect of snus, it is because one of these is an intermediate step in the causal pathway and should not have been controlled for, but I suspect that is way over the heads of the authors. Also, it is worth noting that alcohol use was much higher among snus users, with “heavy” use exceeding 40% of the population compared to less than 20% for non-tobacco-users. That is probably just another effect of the age difference I talked about yesterday, but if it really does matter (and it might), then they did not effectively control for it by simply dividing people into “heavy” and “moderate” categories (it turns out there are no light drinkers among Swedish men). This is another example of when “we controlled for that” is misleading, since there is no way these two categories can capture all of the effects.
The bottom line for a lot of this is that the reader always must put a huge amount of trust in the honesty and skill of epidemiologic researchers. They inevitably make arbitrary choices which can affect their results. And chances are that no one will ever even check their math, let alone their data quality (you can be certain that the “peer reviewers” never did, despite the widespread misperception that peer review actually vouches for the accuracy of analyses). So when you are dealing with research groups that have already demonstrated their dishonesty, the sensible strategy is to assume that the results are wrong.
Of course, we can still always learn something. In this case, if the best they could cook up was ORs of 1.3 or so, we can parse the results of the study and our existing knowledge and conclude that this shows that snus use apparently does not cause weight gain, which is just as we would have predicted.