Abstract

Conventional observational epidemiology has an unenviable reputation for generating false-positive findings,1,2 or "scares," as others call them.3 In 1993, for example, the New York Times reported that "vitamin E greatly reduces the risk of heart disease"4 following simultaneous publication of 2 observational studies in the New England Journal of Medicine demonstrating that use of vitamin E supplements, even for just a few years, was associated with a substantially lower risk of coronary heart disease (CHD).5,6 Randomized controlled trials (RCTs)—testing precisely the same hypothesis—revealed no reduction in risk at all.7 As I write this, UK news media are reporting that "junk food makes your kids dumb," in response to a paper reporting that children given a poor diet at age 3 had lower IQ scores 5 years later.8 In the latter case, however, no RCTs are ever likely to be carried out, and the status of the finding will remain liminal.

The ratio of false-positive to false-negative (FP:FN) published findings in traditional epidemiology is very high, John Ioannidis and colleagues argue.9 But what exactly are "false-positive" findings in the context of conventional observational epidemiology, and how can their prevalence be quantified? My suspicion is that use of vitamin E supplements would—in most contexts—be associated with lower risk of CHD, because of substantial confounding by a myriad of measured and unmeasured factors related to such a socially and behaviorally patterned exposure.10 The same will apply to children fed on junk food and their later intelligence.11 The inability to substantially "control" statistically for confounding in many situations—due to measurement error in assessed confounders and omission of others—remains underappreciated.12,13

In the sense that these associations do exist, they should perhaps not be called false positives; they are false positives only if they are taken to be indicators of underlying causal effects. Noncausal but replicable observational associations could clearly be of considerable value for prediction of disease risk and for targeting of preventive measures to those who can benefit most. For the holy grail of epidemiology—identifying causes of disease—they are a disappointment, however. A wide range of approaches—from formally comparing associations in contexts where confounding structures differ,14 utilizing correlates of the exposure under study that are not plausible causes15 and natural experiments,16 through to formal instrumental variables methods17—offers greater hope for reliable causal inference than plowing on with traditional approaches and keeping the FP:FN ratio high.9

Ioannidis and colleagues reiterate the low positive predictive value of a nominally "significant" P, something I (along with a generation of epidemiologists, I imagine) first encountered in Michael Oakes' seminal "Statistical Inference,"18 although (as has been publicly confessed) it did not instantly prevent me from using the word "significant" and over-interpreting such tests.19 As Sander Greenland20 argued in this journal 20 years ago, "randomization provides the key link between inferential statistics and causal parameters." It is through this prism that we should consider Ioannidis' finding that, in genetic association studies, once appropriate thresholds were applied,21 false positives became a rarity.9
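The arithmetic behind that low positive predictive value can be made concrete. The sketch below uses the standard relationship PPV = (power x prior) / (power x prior + alpha x (1 - prior)); the prior of 1 true hypothesis in 100 and the 80% power are illustrative assumptions of mine, not figures from Ioannidis and colleagues.

```python
# Positive predictive value (PPV) of a nominally "significant" finding.
# Inputs: prior probability that the tested hypothesis is true, the
# significance threshold alpha, and statistical power. Illustrative only.

def ppv(prior: float, alpha: float, power: float = 0.8) -> float:
    """Probability that a 'significant' result reflects a true effect."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

# Diffuse observational hypothesis space: 1 in 100 tested hypotheses true.
print(f"P < 0.05 threshold: PPV = {ppv(0.01, 0.05):.2f}")    # ~0.14
# Genome-wide significance threshold used in genetic association studies.
print(f"P < 5e-8 threshold: PPV = {ppv(0.01, 5e-8):.6f}")    # ~0.999994
```

At the conventional threshold, roughly 6 of every 7 "significant" findings from such a hypothesis space would be false positives; at a genome-wide threshold, the same prior yields almost none, consistent with the rarity of false positives noted above.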
Greenland stated that his arguments were "largely derived from the writings of R.A. Fisher,"20 and it was Fisher who clarified that randomization is inherent in genetic analysis. When lecturing on "Statistical Methods in Genetics" in 1951, Fisher described the relationship between the 2 disciplines to which he contributed so much22:

"And here I may mention a connection between our two subjects which seem not to be altogether accidental, namely that the factorial method of experimentation, now of lively concern so far afield as the psychologists, or the industrial chemists, derives its structure and its name, from the simultaneous inheritance of Mendelian factors. Geneticists certainly need not feel that the intellectual debt is all on one side."22

Fisher goes on:

"Genetics is indeed in a peculiarly favoured condition in that Providence has shielded the geneticist from many of the difficulties of a reliably controlled comparison. The different genotypes possible from the same mating have been beautifully randomised by the meiotic process."22

This principle—that analysis of genetic data is analogous to that of a randomized experiment—has, in epidemiology, been termed "Mendelian randomization."23 It depends on the basic (but approximate) laws of Mendelian genetics. If the probability that a postmeiotic germ cell that has received any particular allele at segregation contributes to a viable conceptus is independent of environment (following from Mendel's first law), and if genetic variants sort independently (following from Mendel's second law), then these variants will not be associated with the confounding factors that generally distort conventional observational studies.24 Fisher was referring only to the implications of Mendel's second law when he stated that "A more perfect control of conditions is scarcely possible, than that of different genotypes appearing in the same litter,"22 which would imply that family-based analysis is required.24 If, however, basic precautions with respect to population stratification are applied, then, at a population level, genetic variants are indeed unrelated to nongenetic confounding factors, as has been empirically demonstrated.25

For Fisher, perhaps, Mendelian randomization provided the basis for formalizing randomization in experiments. Fisher appears to be saying this in his 1951 lecture, although his initial advocacy of randomization related to ensuring symmetry of the error distribution26,27 and "ensur[ing] the validity of normal-theory analysis."28 His daughter, Joan Fisher Box, writes that "the structure of the factorial experiment was borrowed, in all its efficiency and versatility, from genetics."29 She reminds us that the analysis of variance components was developed in Fisher's pioneering work on polygenic inheritance,30 and it was in the context of analysis of variance that he first hinted at randomization, when crop variation data were analyzed "as if all the plots are undifferentiated, as if the numbers had been mixed up and written down in random order."31 Fisher's students similarly consider that his pioneering genetic work, in particular with respect to polygenic inheritance,30 was reflected in his later work on the design of experiments.32,33 Iain Chalmers has pointed out that statistical theory did not underlie the development of controlled trials in medicine,34 and it is entertaining to speculate that, rather than statistical theory, it was analogy with the factorial randomization of Mendel's second law that provided the basis for the development of randomized experiments in general.
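The randomization argument can be illustrated with a toy instrumental-variable simulation; all of the numbers below are hypothetical, and the code is a minimal sketch of the general logic rather than the method of any particular study. Because the genotype is allocated independently of the confounder, scaling the genotype-outcome association by the genotype-exposure association (the Wald ratio) recovers the causal effect that naive regression overstates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

U = rng.normal(size=n)                    # unmeasured confounder
G = rng.binomial(2, 0.3, size=n)          # allele count, allocated at meiosis
X = 0.4 * G + U + rng.normal(size=n)      # exposure, confounded by U
Y = 0.5 * X + U + rng.normal(size=n)      # outcome; true causal effect of X is 0.5

# Naive observational estimate: the X-Y slope absorbs the confounding by U.
naive = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)

# Wald ratio: G-Y association scaled by the G-X association.
# U drops out of the ratio because G is independent of U.
wald = np.cov(G, Y)[0, 1] / np.cov(G, X)[0, 1]

print(f"naive slope: {naive:.2f}")        # ~0.98, well above the true 0.5
print(f"Wald ratio:  {wald:.2f}")         # ~0.50
```

With a single instrument and a continuous exposure, this ratio coincides with the two-stage least squares estimate used in formal instrumental variables analyses.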
In biomedical science the key value of Mendelian randomization is that genetic variants can proxy for modifiable risk factors, and their randomization allows considerably greater inferential power than is provided by conventional observational epidemiology.24 The principles have been reviewed24,35,36 and a now very substantial body of empirical studies has been reported in the leading medical journals. Indeed, for biomarkers, this approach is rapidly becoming de rigueur.37,38 Obstacles to reliable interpretation of Mendelian randomization studies have been discussed at considerable length.24,35,36 Important issues include the low statistical power of the instrumental variable analyses conducted within this framework and the possibility of confounding being reintroduced by pleiotropy—that is, the possible multiple functional consequences of genetic variants.

Here the right-hand term in Ioannidis' FP:FN ratio comes into play. The very low ratio with respect to properly conducted genetic studies reflects the fact that there are many false-negative findings and a very large number of common variants waiting to be identified through adequately powered studies.39 Large numbers of variants (eg, over 200 variants for height and 100 for circulating lipids) have already been identified in genome-wide association studies, and the harvest will continue.40 These allow the construction of allele scores41 that explain more of the variance in the proxied-for phenotype and thus increase the power beyond that of the single-variant approaches used in most Mendelian randomization studies to date. Indeed, such allele scores often explain more of the variance in the phenotype than any potentially randomizable intervention could.

More importantly, multiple variants (or independent combinations of variants) working through different pathways can be used as separate instruments. If these predict the same causal effect of the proxied-for environmentally modifiable risk factor, then it becomes much less plausible that reintroduced confounding (through pleiotropy, for example) explains the association, because the confounding would have to be acting in the same way for these 2 unlinked variants or combinations of variants. This can be likened to RCTs of blood pressure–lowering agents (eg, diuretics and ACE inhibitors), which work through different biologic mechanisms and have different potential side effects. If the various agents produce the reductions in cardiovascular disease risk predicted by the degree to which they lower blood pressure, then it is unlikely that they act through agent-specific (pleiotropic) effects of the drugs; rather, such a finding points to blood pressure lowering as being key. The latter is indeed what is generally observed.42 Multiple-variant approaches have been reported in the Mendelian randomization context,43 but they have not yet exploited the possibilities provided by large numbers of independent variants, which allow very large numbers of combinations to serve as separate instruments.
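The two ideas in this closing passage, allele scores as stronger instruments and independent variant sets as mutually checking instruments, can be sketched in the same toy framework; the effect sizes, allele frequencies, and 20/20 variant split below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 100_000, 40                        # individuals, variants

freqs = rng.uniform(0.1, 0.5, size=m)     # allele frequencies
G = rng.binomial(2, freqs, size=(n, m))   # genotype matrix (allele counts)
beta = rng.normal(0.05, 0.02, size=m)     # per-allele effects on the exposure

U = rng.normal(size=n)                    # confounder
X = G @ beta + U + rng.normal(size=n)     # exposure
Y = 0.5 * X + U + rng.normal(size=n)      # outcome; true causal effect 0.5

def wald(z, x, y):
    """Instrumental-variable (Wald ratio) estimate of the x -> y effect."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

score = G @ beta                          # weighted allele score: a stronger instrument
half_a = G[:, :20] @ beta[:20]            # two non-overlapping variant sets,
half_b = G[:, 20:] @ beta[20:]            # used as independent instruments

print(f"allele score: {wald(score, X, Y):.2f}")    # ~0.50
print(f"set A: {wald(half_a, X, Y):.2f}, "
      f"set B: {wald(half_b, X, Y):.2f}")          # agreement argues against pleiotropy
```

If pleiotropy were driving the result through one set of variants, the two estimates would be expected to diverge; their agreement is the multiple-instrument consistency check described above.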

Keywords

Observational study, Confounding, Epidemiology, Randomized controlled trial, Context (archaeology), Medicine, Nutritional epidemiology, Reputation, Internal medicine, History

Publication Info

Year: 2011
Type: Letter
Volume: 22
Issue: 4
Pages: 460-463
Citations: 47 (OpenAlex)
Access: Closed

Cite This

George Davey Smith (2011). Random Allocation in Observational Data. Epidemiology, 22(4), 460-463. https://doi.org/10.1097/ede.0b013e31821d0426

Identifiers

DOI: 10.1097/ede.0b013e31821d0426