Abstract

Pfam is a widely used database of protein families, currently containing more than 13 000 manually curated protein families as of release 26.0. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/). Here, we report on changes that have occurred since our 2010 NAR paper (release 24.0). Over the last 2 years, we have generated 1840 new families and increased coverage of the UniProt Knowledgebase (UniProtKB) to nearly 80%. Notably, we have taken the step of opening up the annotation of our families to the Wikipedia community, by linking Pfam families to relevant Wikipedia pages and encouraging the Pfam and Wikipedia communities to improve and expand those pages. We continue to improve the Pfam website and add new visualizations, such as the ‘sunburst’ representation of taxonomic distribution of families. In this work we additionally address two topics that will be of particular interest to the Pfam community. First, we explain the definition and use of family-specific, manually curated gathering thresholds. Second, we discuss some of the features of domains of unknown function (also known as DUFs), which constitute a rapidly growing class of families within Pfam.

Keywords

UniProt, Annotation, Biology, Protein family, Computational biology, Function (biology), Database, Bioinformatics, Genetics, Computer science

Publication Info

Year
2011
Type
article
Volume
40
Issue
D1
Pages
D290-D301
Citations
3728
Access
Closed

Citation Metrics

3728
OpenAlex

Cite This

Marco Punta, Penny Coggill, Ruth Y. Eberhardt et al. (2011). The Pfam protein families database. Nucleic Acids Research, 40(D1), D290-D301. https://doi.org/10.1093/nar/gkr1065

Identifiers

DOI
10.1093/nar/gkr1065