Assessment of Genome-Wide Protein Function Classification for Drosophila melanogaster

Abstract

The functional classification of genes on a genome-wide scale is now in its infancy, and we make a first attempt to assess existing methods and identify sources of error. To this end, we compared two independent efforts for associating proteins with functions, one implemented by FlyBase and the other by PANTHER at Celera Genomics. Both methods make inferences based on sequence similarity and the available experimental evidence. However, they differ considerably in methodology and process. Overall, assuming that the systematic error across the two methods is relatively small, we find the protein-to-function association error rate of both the FlyBase and PANTHER methods to be <2%. The primary source of error for both methods appears to be simple human error. Although homology-based inference can certainly cause errors in annotation, our analysis indicates that the frequency of such errors is relatively small compared with the number of correct inferences. Moreover, these homology errors can be minimized by careful tree-based inference, such as that implemented in PANTHER. Often, functional associations are made by one method and not the other, indicating that one of the greatest challenges lies in improving the completeness of available ontology associations.
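The comparison described in the abstract can be illustrated with a minimal sketch. It assumes each method's output is available as a mapping from protein identifier to a set of function terms. The function name, the dictionary layout, and the toy data are all hypothetical, and this is not the procedure used by the authors. It only separates the cases the abstract mentions: associations made by both methods, associations made by one method and not the other, and proteins where the two methods disagree outright.

```python
from typing import Dict, Set

# Hypothetical representation: protein ID -> set of function terms.
Annotations = Dict[str, Set[str]]


def compare_annotations(a: Annotations, b: Annotations) -> Dict[str, set]:
    """Partition protein-to-function associations from two methods."""
    # Flatten each method's output into (protein, term) pairs.
    pairs_a = {(p, t) for p, terms in a.items() for t in terms}
    pairs_b = {(p, t) for p, terms in b.items() for t in terms}

    # Proteins annotated by both methods but with no term in common
    # are candidates for manual review; disagreement alone does not
    # say which method (if either) is wrong.
    both_annotated = {p for p in a.keys() & b.keys() if a[p] and b[p]}
    conflicts = {p for p in both_annotated if not (a[p] & b[p])}

    return {
        "shared": pairs_a & pairs_b,    # made by both methods
        "only_a": pairs_a - pairs_b,    # completeness gap in method B
        "only_b": pairs_b - pairs_a,    # completeness gap in method A
        "conflicting_proteins": conflicts,
    }


# Toy data, invented for illustration only.
method_a = {"P1": {"kinase"}, "P2": {"transporter"}, "P3": {"protease"}}
method_b = {"P1": {"kinase"}, "P2": {"receptor"}, "P4": {"ligase"}}

result = compare_annotations(method_a, method_b)
for name, members in result.items():
    print(name, sorted(members))
```

Exact term matching is a simplification: ontology terms are typically hierarchical, so a real comparison would need to treat a term and its more general parent as compatible rather than as a conflict.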

Keywords

Inference, Biology, Computational biology, Annotation, Genomics, Drosophila melanogaster, Genome, Computer science, Genetics, Artificial intelligence, Gene

Publication Info

Year: 2003
Type: article
Volume: 13
Issue: 9
Pages: 2118-2128
Citations: 46
Access: Closed

Cite This

APA Style

Huaiyu Mi, Jody Vandergriff, Michael J. Campbell et al. (2003). Assessment of Genome-Wide Protein Function Classification for Drosophila melanogaster. Genome Research, 13(9), 2118-2128. https://doi.org/10.1101/gr.771603

Identifiers

DOI: 10.1101/gr.771603