UniProt: the Universal Protein Knowledgebase in 2023

Alex Bateman, María Martin, Sandra Orchard, Michele Magrane, Shadab Ahmad, Emanuele Alpi, Emily Bowler-Barnett, Ramona Britto, Hema Bye‐A‐Jee, Austra Cukura, Paul Denny, Tunca Doğan, ThankGod E. Ebenezer, Jun Fan, Penelope Garmiri, Leonardo Jose da Costa Gonzales, Emma Hatton-Ellis, Abdulrahman Hussein, Alexandr Ignatchenko, Giuseppe Insana, Rizwan Ishtiaq, Vishal Joshi, Dushyanth Jyothi, Swaathi Kandasaamy, Antonia Lock, Aurélien Luciani, Marija Lugaric, Jie Luo, Yvonne Lussi, Alistair MacDougall, Fábio Madeira, Mahdi Mahmoudy, Alok Mishra, Katie Moulang, Andrew Nightingale, Sangya Pundir, Guoying Qi, Shriya Raj, Pedro Raposo, Daniel L Rice, Rabie Saidi, Rafael Santos, Elena Speretta, James Stephenson, Prabhat Totoo, E. B. Turner, Nidhi Tyagi, Preethi Vasudev, Kate Warner, Xavier Watkins, Rossana Zaru, Hermann Zellner, Alan Bridge, Lucila Aimo, Ghislaine Argoud‐Puy, Andrea H Auchincloss, Kristian B. Axelsen, Parit Bansal, Delphine Baratin, Teresa M Batista Neto, Marie-Claude Blatter, Jerven Bolleman, Emmanuel Boutet, Lionel Breuza, Blanca Cabrera Gil, Cristina Casals‐Casas, Kamal Chikh Echioukh, Elisabeth Coudert, Beatrice Cuche, Edouard de Castro, Anne Estreicher, Maria Livia Famiglietti, Marc Feuermann, Elisabeth Gasteiger, Pascale Gaudet, Sébastien Géhant, Vivienne Baillie Gerritsen, Arnaud Gos, Nadine Gruaz, Chantal Hulo, Nevila Hyka‐Nouspikel, Florence Jungo, Arnaud Kerhornou, Philippe Le Mercier, Damien Lieberherr, Patrick Masson, Anne Morgat, Venkatesh Muthukrishnan, Salvo Paesano, Ivo Pedruzzi, Sandrine Pilbout, Lucille Pourcel, Sylvain Poux, Monica Pozzato, Manuela Pruess, Nicole Redaschi, Catherine Rivoire, Christian J A Sigrist, Karin Sonesson, Shyamala Sundaram
2022 Nucleic Acids Research 5,996 citations

Abstract

The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this publication we describe enhancements made to our data processing pipeline and to our website to adapt to an ever-increasing information content. The number of sequences in UniProtKB has risen to over 227 million and we are working towards including a reference proteome for each taxonomic group. We continue to extract detailed annotations from the literature to update or create reviewed entries, while unreviewed entries are supplemented with annotations provided by automated systems using a variety of machine-learning techniques. In addition, the scientific community continues their contributions of publications and annotations to UniProt entries of their interest. Finally, we describe our new website (https://www.uniprot.org/), designed to enhance our users’ experience and make our data easily accessible to the research community. This interface includes access to AlphaFold structures for more than 85% of all entries as well as improved visualisations for subcellular localisation of proteins.

Keywords

UniProt; Pipeline (software); Biology; Proteome; Human proteome project; Computer science; Set (abstract data type); Interface (matter); World Wide Web; Information retrieval; Data science; Computational biology; Bioinformatics; Proteomics; Genetics

MeSH Terms

Amino Acid Sequence; Databases, Protein; Knowledge Bases; Machine Learning; Proteome

Publication Info

Year: 2022
Type: article
Volume: 51
Issue: D1
Pages: D523-D531
Citations: 5996
Access: Closed

Citation Metrics

OpenAlex: 5996
Influential: 217
CrossRef: 5415

Cite This

Alex Bateman, María Martin, Sandra Orchard et al. (2022). UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Research, 51(D1), D523-D531. https://doi.org/10.1093/nar/gkac1052

Identifiers

DOI: 10.1093/nar/gkac1052
PMID: 36408920
PMCID: PMC9825514
