By Henry Miller and Stanley Young
Are you confused about conflicting “research” findings on certain foods’ effects on our health? It would hardly be surprising. First, butter is the enemy; then, it’s solid margarine. Is caffeine good or bad for your heart? For a time, beta-carotene supplements are thought to prevent cancer — until they are found to increase the risk of lung cancer in smokers. And finally, does a woman’s diet at conception determine the sex of her fetus?
When you do a deep dive into the methodology of studies that produced such conclusions, it’s not surprising that they’re inconsistent or implausible.
Let’s consider who performs such research.
A branch of pseudo-science has become a kind of throwback to a phenomenon that began centuries ago — a “guild,” a formal group with a common interest, or a collection of craftsmen or merchants that have mutual interests and standards. (Or, as discussed below, a lack of statistical standards.)
Entry into a guild originally was by apprenticeship where pay was low while a craft was being learned. There were goldsmiths’ guilds, and in Holland, even painters’ guilds. Sometimes, trade guilds ranged over large regions and often were aligned with or protected by governments. Medieval cities supported guild monopolies.
Today, some scientists (or, as they are more accurately described, “pseudoscientists”) have formed what amounts to a guild of researchers whose intent, it seems, is to undermine the science of food.
Here’s what they do: They perform what are called “cohort studies” that include the results of questionnaires surveying responders’ food consumption habits. A cohort study defines a group of people and follows them over time, repeatedly asking questions and testing for associations, or correlations: for example, whether certain patterns of food consumption predispose to heart disease or dementia or alter longevity.
England’s “The Life Project,” funded by the U.K. government, was one of the first cohort studies. Over many years, it followed all the children born in a window of time. Eventually, new cohorts were formed and followed. The data from questions asked and answered in the project’s 1958 cohort have given rise to some 2,500 papers, and the study is still ongoing.
The kind and quantity of food consumed arguably affect health, so it is logical that cohort researchers want to collect information about food that could be related to subsequent health outcomes. To facilitate this, nutritional researchers at Harvard University developed a semi-quantitative “food frequency questionnaire,” or FFQ, in the mid-1980s.
The FFQ is a self-administered dietary questionnaire that asks people how often they consume specific foods. The initial FFQs had 61 foods. The choice of that particular number was probably intentional, because there is roughly a 95% chance of at least one positive result arising by chance alone if you test 61 independent questions at the conventional 0.05 significance level. That level, equivalent to 95% confidence, is commonly treated as the threshold a study must meet to make its case statistically.
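The arithmetic behind that figure is easy to check. The short Python sketch below is our own illustration, not anything taken from the FFQ literature. It assumes every question is tested independently at a 0.05 threshold and that no real effect exists; the function name and the list of question counts other than 61 are chosen purely for illustration.

```python
def chance_of_any_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests
    comes out "significant" by chance alone when no real effect exists.

    Assumption: tests are independent and each uses the same threshold `alpha`.
    """
    # Each test stays (correctly) negative with probability 1 - alpha,
    # so all of them stay negative with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


# Illustrative question counts; 61 is the size of the initial FFQ.
for num_questions in (1, 10, 30, 61):
    probability = chance_of_any_false_positive(num_questions)
    print(f"{num_questions:>3} questions: {probability:.1%} chance of at least one chance finding")
```

For 61 questions the function returns about 0.956, slightly above the 95% mark. Real FFQ items are unlikely to be fully independent, so the exact value for an actual study would differ.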
Over the years, many cohort studies were started, and FFQs covering more and more foods became a standard element of many of them. In a research project of the National Association of Scholars that examined a set of 105 papers concerned with red meat’s effects on health, the number of foods included in the FFQs ranged from 14 to 280, with a median of 51.
The questionnaires might ask, for example, how much citrus, broccoli, avocado, etc. the subject consumed, and what health problems — hypertension, depression, angina, asthma, obesity, erectile dysfunction, glaucoma, and so on — he or she experienced. The number of entries in both categories is large and, therefore, the possible associations are vast.
That presents something of a statistical dilemma — really a quagmire — because nutritional epidemiologists rarely, if ever, adjust their statistical analysis to account for the number of questions examined and the likelihood that a certain fraction of supposed correlations will arise by chance. Putting it another way, conducting large numbers of statistical tests — that is, searching for correlations in a study — produces many false positives by chance alone.
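A small simulation makes the point concrete. The sketch below is a hypothetical illustration: it invents studies in which nothing is truly associated with anything, so that every p-value is just a random number between 0 and 1, and it counts how often a study still "finds" something. It then repeats the count with the Bonferroni correction, one standard adjustment for multiple testing, which we use here only as an example. The function name, the number of simulated studies, and the seed are all our own assumptions.

```python
import random


def simulate_null_studies(num_tests=61, num_studies=10_000, alpha=0.05, seed=0):
    """Simulate studies in which NO real association exists.

    Under that assumption each test's p-value is uniform on [0, 1).
    Returns the fraction of studies reporting at least one "significant"
    result, first without any adjustment and then with a Bonferroni
    adjustment (threshold alpha / num_tests).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    bonferroni_alpha = alpha / num_tests
    unadjusted_hits = 0
    adjusted_hits = 0
    for _ in range(num_studies):
        # The smallest p-value decides whether the study "finds" anything.
        smallest_p = min(rng.random() for _ in range(num_tests))
        if smallest_p < alpha:
            unadjusted_hits += 1
        if smallest_p < bonferroni_alpha:
            adjusted_hits += 1
    return unadjusted_hits / num_studies, adjusted_hits / num_studies


unadjusted, adjusted = simulate_null_studies()
print(f"Studies with a chance 'finding', no adjustment: {unadjusted:.1%}")
print(f"Studies with a chance 'finding', Bonferroni:    {adjusted:.1%}")
```

Without adjustment, the great majority of these empty studies report a finding; with the adjustment, the rate falls back to roughly the 5% that the threshold is supposed to guarantee.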
What’s concerning now?
A further complicating factor is that the supposed correlations can be sliced and diced according to gender, age, race, etc., and also at multiple time points. To shed light on this conundrum, the just-released landmark report from the National Association of Scholars (NAS) project, Shifting Sands: Unsound Science and Unsafe Regulation, examined how these kinds of chance correlations yield false, irreproducible conclusions that affect various areas of government policy and regulation at federal agencies. The report applies Multiple Testing and Multiple Modeling (MTMM) analysis to assess whether a given body of research has been affected by such flawed practices; if it has, the report deems the resulting claims unreliable.
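To see how quickly the slicing and dicing multiplies the number of possible tests, consider the rough count sketched below. This is not the report's own method, only a counting illustration, and every number in it other than the 61 foods is hypothetical: ten health outcomes, three subgroup variables analyzed one at a time alongside the whole cohort, and three time points.

```python
def count_possible_tests(foods, outcomes, subgroup_levels, time_points):
    """Count food-outcome tests when each is repeated per slice and time point.

    Assumption: each subgroup variable is analyzed separately (not crossed),
    so the slices are the whole cohort plus every level of every variable.
    """
    slices = 1 + sum(subgroup_levels.values())
    return foods * outcomes * slices * time_points


# Hypothetical study design; only the 61 foods comes from the article.
hypothetical_subgroups = {"sex": 2, "age band": 4, "race": 5}
total_tests = count_possible_tests(
    foods=61, outcomes=10, subgroup_levels=hypothetical_subgroups, time_points=3
)

# If no real effect exists anywhere, each test has a 5% chance of a false
# positive, so the expected number of chance findings is total_tests * 0.05.
expected_chance_findings = total_tests * 0.05

print(f"Possible tests: {total_tests:,}")
print(f"Expected chance findings at 0.05: {expected_chance_findings:,.0f}")
```

Under these made-up numbers, a single questionnaire supports tens of thousands of possible tests, and more than a thousand of them would be expected to cross the 0.05 threshold even if no food had any effect on any outcome.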
Non-statisticians might not appreciate the significance of this finding, but it presents a fundamental problem: Without appropriate adjustment for the number of claims examined, the research findings of nutritional researchers are often irreproducible. In other words, however persuasive or potentially important the findings seem, they are really meaningless.
What does this flawed approach of researchers who use nutritional food frequency questionnaires have to do with their constituting a guild? They keep their research data private; they don’t criticize one another over questionable statistical methods; they sell their monopoly “expertise” to the government in return for grants; and they police their trade themselves (i.e., they referee each other’s papers). They also protect their own: Several years ago, for example, fourteen of the “guild members” demanded that an article criticizing the research of one of the members be retracted.
Nutritional researchers have built a very comfortable, profitable, and convenient pseudo-science guild whose members know, or should know, that, in the interest of padding their article count, they’re sometimes publishing irreproducible findings. And, in the process, they are filling the scientific literature with tens of thousands of worthless food frequency questionnaire-based studies that confuse and frighten the public.
Consider, for example, the 2008 article, “You are what your mother eats: evidence for maternal preconception diet influencing foetal sex in humans.” It generated a lot of attention by making the genetically implausible claim that women who eat more breakfast cereal are more likely to have a boy child. That result is easily explained by chance. On what was the article based? On observational (cohort) studies.
That’s not a lone example. Claims from cohort studies (for example, those based on FFQs) about the effects on human health of various interventions, such as vitamin intake or low-fat diets, have not been supported by large randomized clinical trials.
What can we do about this situation? It’s difficult, because the status quo represents collusion among the dishonest researchers themselves, predatory journals that will publish anything for a hefty fee, and academic departments that overlook professional misconduct. They are all ‘benefiting’ in one way or another from easily generated but often specious research. Until one or more of them breaks the cycle, nothing will change.
Perhaps funding agencies, which have the power and the responsibility to police scientific integrity, will begin to do their job and crush the guild. We are not optimistic.
Henry I. Miller, a physician and molecular biologist, is a senior fellow at the Pacific Research Institute. He was the founding director of the FDA’s Office of Biotechnology and a Research Associate at the NIH. Find Henry on Twitter @henryimiller
Dr. S. Stanley Young is a Fellow of the American Statistical Association and the American Association for the Advancement of Science. He is an adjunct professor of statistics at North Carolina State University, the University of Waterloo, and the University of British Columbia, where he co-directs thesis work.