Abstract

Genome sequence assemblies provide the basis for our understanding of biology. Generating error-free assemblies is therefore the ultimate, but sadly still unachieved, goal of a multitude of research projects. Despite ever-advancing improvements in data generation, assembly algorithms and pipelines, no automated approach has so far reliably generated near error-free genome assemblies for eukaryotes. Whilst working towards improved datasets and fully automated pipelines, assembly evaluation and curation are actively used to bridge this shortcoming and significantly reduce the number of assembly errors. In addition to this increase in product value, the insights gained from assembly curation are fed back into the automated assembly strategy and contribute to notable improvements in genome assembly quality. We describe our tried and tested approach for assembly curation using gEVAL, the genome evaluation browser. We outline the procedures applied to genome curation using gEVAL and also our recommendations for assembly curation in a gEVAL-independent context to facilitate the uptake of genome curation in the wider community.

Keywords

Data curation; Context (archaeology); Genome; Computer science; Computational biology; Data science; Quality (philosophy); Biology; Genetics; Gene

MeSH Terms

Algorithms; Eukaryota; Genome; Genomics; Software

Publication Info

Year: 2021
Type: article
Volume: 10
Issue: 1
Citations: 1650
Access: Closed

Citation Metrics

OpenAlex: 1650
CrossRef: 1623

Cite This

Kerstin Howe, William Chow, Joanna Collins et al. (2021). Significantly improving the quality of genome assemblies through curation. GigaScience, 10(1). https://doi.org/10.1093/gigascience/giaa153

Identifiers

DOI: 10.1093/gigascience/giaa153
PMID: 33420778
PMCID: PMC7794651
