Abstract

Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein-protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.
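The abstract does not say which statistic STRING uses for its enrichment analysis. As a rough illustration of the general idea only, the sketch below uses a one-sided hypergeometric over-representation test, a common choice for asking whether a gene set contains more members of a category (for example a Gene Ontology term) than expected by chance. The function name and parameters are hypothetical and are not part of STRING.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """Return P(X >= hits) for a hypergeometric draw.

    population: number of genes in the background (assumed, e.g. the whole genome)
    annotated:  background genes carrying the category of interest
    sample:     size of the user's gene set
    hits:       genes in the user's set that carry the category
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie between 0 and population")

    hits = max(hits, 0)                # a negative count behaves like zero
    max_hits = min(annotated, sample)  # cannot observe more hits than either bound

    # Number of ways to draw the sample with exactly k annotated genes,
    # summed over the upper tail k = hits .. max_hits.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / comb(population, sample)
```

A small p-value suggests the category is over-represented in the gene set. In practice the p-values for many categories would also need a multiple-testing correction, which this sketch omits.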

Keywords

String (physics); KEGG; Biology; Computational biology; Computer science; Cluster analysis; Genome; Data mining; Interaction network; Gene ontology; Set (abstract data type); Gene; Machine learning; Genetics

MeSH Terms

Animals; Databases, Genetic; Gene Ontology; Genomics; Humans; Protein Interaction Mapping; Software

Publication Info

Year: 2018
Type: article
Volume: 47
Issue: D1
Pages: D607-D613
Citations: 18024
Access: Closed

Citation Metrics

OpenAlex: 18024
Influential: 1413
CrossRef: 14601

Cite This

Damian Szklarczyk, Annika L Gable, David Lyon et al. (2018). STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Research, 47(D1), D607-D613. https://doi.org/10.1093/nar/gky1131

Identifiers

DOI: 10.1093/nar/gky1131
PMID: 30476243
PMCID: PMC6323986
