Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

Jie Huang, Bryan Howie, Shane McCarthy, Yasin Memari, Klaudia Walter, Josine L. Min, Petr Danecek, Giovanni Malerba, Elisabetta Trabetti, Hou‐Feng Zheng, Saeed Al Turki, Antoinette Amuzu, Carl A. Anderson, Richard Anney, Dinu Antony, María Soler Artigas, Muhammad Ayub, Senduran Bala, Jeffrey C. Barrett, Inês Barroso, Phil Beales, Marianne Benn, Jamie Bentham, Shoumo Bhattacharya, Ewan Birney, Douglas Blackwood, Martin Bobrow, Elena G. Bochukova, Patrick Bolton, Rebecca Bounds, Chris Boustred, Gerome Breen, Mattia Calissano, Keren Carss, Juan P. Casas, John C. Chambers, Ruth Charlton, Krishna Chatterjee, Lu Chen, Antonio Ciampi, Sebahattin Çırak, Peter Clapham, Gail Clement, Guy Coates, Massimiliano Cocca, David Collier, Catherine Cosgrove, Tony Cox, Nick Craddock, Lucy Crooks, Sarah Curran, David Curtis, Allan Daly, Ian N.M. Day, Aaron Day-Williams, George Dedoussis, Thomas A. Down, Yuanping Du, Cornelia M. van Duijn, Ian Dunham, Sarah Edkins, Rosemary Ekong, Peter Ellis, David M. Evans, I. Sadaf Farooqi, David Fitzpatrick, Paul Flicek, James Floyd, A. Reghan Foley, Christopher S. Franklin, Marta Futema, Louise Gallagher, Paolo Gasparini, Tom R. Gaunt, Matthias Geihs, Daniel H. Geschwind, Celia M.T. Greenwood, Heather Griffin, Detelina Grozeva, Xiaosen Guo, Xueqin Guo, Hugh Gurling, Deborah Hart, Audrey E. Hendricks, Peter Holmans, Jie Huang, Tim Hubbard, Steve E. Humphries, Matthew E. Hurles, Pirro G. Hysi, Valentina Iotchkova, Aaron Isaacs, David K. Jackson, Yalda Jamshidi, Jon Johnson, Christopher Joyce, Konrad J. Karczewski, Jane Kaye, Thomas Keane, John P. Kemp
2015 Nature Communications 369 citations

Abstract

Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants.
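
The panel sizes quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes diploid genomes (two haplotypes per individual), and the helper names `haplotypes_from_genomes` and `expected_minor_copies` are hypothetical. The expected number of minor-allele copies is a simple mean (haplotypes × frequency), given as a rough sense of how many copies of a 0.1% variant such a panel would carry.

```python
# Back-of-the-envelope check of the panel sizes quoted in the abstract.
# All constants come from the abstract; the helper functions are hypothetical.

UK10K_GENOMES = 3781       # whole genomes sequenced by UK10K Cohorts
UK10K_HAPLOTYPES = 7562    # UK10K haplotypes used in the combined panel
KG_HAPLOTYPES = 2184       # 1000 Genomes Project haplotypes
TARGET_MAF = 0.001         # 0.1% minor allele frequency


def haplotypes_from_genomes(n_genomes: int, ploidy: int = 2) -> int:
    """Number of haplotypes, assuming each genome contributes `ploidy` copies."""
    return n_genomes * ploidy


def expected_minor_copies(n_haplotypes: int, maf: float) -> float:
    """Mean number of minor-allele copies in a panel at frequency `maf`."""
    return n_haplotypes * maf


# 3,781 diploid genomes account for the 7,562 UK10K haplotypes.
assert haplotypes_from_genomes(UK10K_GENOMES) == UK10K_HAPLOTYPES

# Size of the combined UK10K + 1000 Genomes panel, in haplotypes.
combined_haplotypes = UK10K_HAPLOTYPES + KG_HAPLOTYPES

# Roughly how many copies of a 0.1% variant each panel is expected to hold.
uk10k_copies = expected_minor_copies(UK10K_HAPLOTYPES, TARGET_MAF)
combined_copies = expected_minor_copies(combined_haplotypes, TARGET_MAF)
```

Under these assumptions the combined panel holds 9,746 haplotypes, and a variant at 0.1% frequency is expected to appear on only a handful of them, which is why panel size matters so much for imputing rare variants.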

Keywords

Haplotype, Imputation (statistics), Allele frequency, Genetics, Computational biology, Computer science, Biology, Genotype, Gene, Missing data, Machine learning

Publication Info

Year: 2015
Type: article
Volume: 6
Issue: 1
Pages: 8111-8111
Citations: 369
Access: Closed

Citation Metrics

369 citations (OpenAlex)

Cite This

Jie Huang, Bryan Howie, Shane McCarthy et al. (2015). Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel. Nature Communications, 6(1), 8111-8111. https://doi.org/10.1038/ncomms9111

Identifiers

DOI
10.1038/ncomms9111