Abstract
DNA microarrays open up a broad new horizon for investigators interested in studying the genetic determinants of disease. The high-throughput nature of these arrays, where differential expression for thousands of genes can be measured simultaneously, creates an enormous wealth of information, but also poses a challenge for data analysis because of the large multiple testing problem involved. The solution has generally been to focus on optimizing false-discovery rates while sacrificing power. The drawback of this approach is that more subtle expression differences will be missed that might give investigators more insight into the genetic environment necessary for a disease process to take hold. We introduce a new method for detecting differentially expressed genes based on a high-dimensional model selection technique, Bayesian ANOVA for microarrays (BAM), which strikes a balance between false rejections and false nonrejections. The basis of the new approach involves a weighted average of generalized ridge regression estimates that provides the benefits of using shrinkage estimation combined with model averaging. A simple graphical tool based on the amount of shrinkage is developed to visualize the trade-off between low false-discovery rates and finding more genes. Simulations are used to illustrate BAM's performance, and the method is applied to a large database of colon cancer gene expression data. Our working hypothesis in the colon cancer analysis is that large differential expressions may not be the only ones contributing to metastasis; in fact, moderate changes in expression of genes may be involved in modifying the genetic environment to a sufficient extent for metastasis to occur. A functional biological analysis of gene effects found by BAM, but not other false-discovery-based approaches, lends support to this hypothesis.

KEY WORDS: Bayesian analysis of variance for microarrays; False discovery rate; False nondiscovery rate; Heteroscedasticity; Ridge regression; Shrinkage; Variance stabilizing transform; Weighted regression
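
The abstract describes the estimator only as "a weighted average of generalized ridge regression estimates". As a rough illustration of that idea, and not the BAM procedure itself, the sketch below computes generalized ridge estimates (one penalty per coefficient) and averages them with fixed weights. The function names `generalized_ridge` and `averaged_ridge`, the simulated data, the penalty values, and the weights are all assumptions made for illustration; in a Bayesian treatment the weights would presumably come from a posterior distribution rather than being set by hand.

```python
import numpy as np

def generalized_ridge(X, y, penalties):
    """Solve (X'X + diag(penalties)) beta = X'y, with one penalty per coefficient."""
    penalties = np.asarray(penalties, dtype=float)
    return np.linalg.solve(X.T @ X + np.diag(penalties), X.T @ y)

def averaged_ridge(X, y, penalty_sets, weights):
    """Weighted average of generalized ridge estimates (weights are normalized)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    estimates = np.array([generalized_ridge(X, y, p) for p in penalty_sets])
    return weights @ estimates

# Simulated data (hypothetical): two nonzero effects, two null effects.
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.5, 0.0, 0.0])
y = X @ beta_true + rng.normal(size=n)

# Zero penalties reproduce ordinary least squares.
ols = generalized_ridge(X, y, np.zeros(p))

# Two hand-picked penalty vectors; the second shrinks the last two
# coefficients heavily, mimicking a model that treats them as null.
penalty_sets = [np.full(p, 1.0), np.array([0.1, 0.1, 100.0, 100.0])]
shrunk = averaged_ridge(X, y, penalty_sets, weights=[0.5, 0.5])

print("OLS:     ", np.round(ols, 3))
print("Averaged:", np.round(shrunk, 3))
```

Giving each coefficient its own penalty is what lets the estimate shrink some effects strongly toward zero while leaving others nearly untouched, which is the kind of selective shrinkage the abstract associates with balancing false rejections against false nonrejections.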
Publication Info
- Year: 2003
- Type: article
- Volume: 98
- Issue: 462
- Pages: 438-455
- Citations: 168
- Access: Closed
Identifiers
- DOI: 10.1198/016214503000224