Abstract

Large‐scale DNA sequencing is creating a sequence infrastructure of great benefit to protein biochemistry. Concurrent with the application of large‐scale DNA sequencing to whole genome analysis, mass spectrometry has attained the capability to determine the molecular weights and amino acid sequences of peptides rapidly and with remarkable sensitivity. Computer algorithms have been developed to use the two different types of data generated by mass spectrometers to search sequence databases. When a protein is digested with a site‐specific protease, the molecular weights of the resulting collection of peptides, known as the mass map or fingerprint, can be determined using mass spectrometry. The molecular weights of the set of peptides derived from the digestion of a protein can then be used to identify the protein. Several different approaches have been developed. Protein identification using peptide mass mapping is an effective technique when studying organisms with completed genomes. A second method is based on the use of data created by tandem mass spectrometers. Tandem mass spectra contain highly specific information in the fragmentation pattern as well as sequence information. This information has been used to search databases of translated protein sequences as well as nucleotide databases such as expressed sequence tag (EST) sequences. The ability to search nucleotide databases is an advantage when analyzing data obtained from organisms whose genomes are not yet completed but for which a large amount of expressed gene sequence is available (e.g., human and mouse). Furthermore, a strength of using tandem mass spectra to search databases is the ability to identify proteins present in fairly complex mixtures.
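The peptide mass mapping idea described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the review and does not reproduce any particular search engine. It assumes trypsin as the site‐specific protease (cleavage after K or R, except before P), uses standard monoisotopic residue masses, and scores each database entry simply by counting observed masses that fall within a tolerance of a theoretical peptide mass. The function names, the toy sequences, and the 0.5 Da tolerance are all hypothetical choices made for illustration.

```python
# Illustrative sketch of peptide mass mapping (assumptions: trypsin digest,
# no missed cleavages, no modifications, simple match-count scoring).

WATER = 18.01056  # monoisotopic mass of H2O, added once per peptide

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def tryptic_digest(sequence):
    """Split a protein after K or R, unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def count_matches(observed, sequence, tolerance=0.5):
    """Count observed masses within `tolerance` Da of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(m - t) <= tolerance for t in theoretical) for m in observed
    )


def rank_candidates(observed, database, tolerance=0.5):
    """Return (name, match_count) pairs, best-matching entry first."""
    scores = [
        (name, count_matches(observed, seq, tolerance))
        for name, seq in database.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy database; real searches use full sequence databases.
    toy_db = {"PROT_A": "MKWVTFISLLLLFSSAYSR", "PROT_B": "GAVLKDEFRPHTK"}
    measured = [peptide_mass(p) for p in tryptic_digest(toy_db["PROT_A"])]
    print(rank_candidates(measured, toy_db))
```

Counting matches is the simplest possible score. Practical tools typically weight matches by factors such as mass accuracy and protein size, which this sketch deliberately leaves out.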

Keywords

Sequence database, Tandem mass spectrometry, Database search engine, Mass spectrometry, Computational biology, Genome, Database, Peptide sequence, Bottom-up proteomics, Tandem mass tag, Protein sequencing, Peptide mass fingerprinting, DNA sequencing, Reference genome, Biology, Chemistry, Data mining, Computer science, Proteomics, Genetics, Protein mass spectrometry, Gene, Search engine, Information retrieval, Quantitative proteomics, Chromatography

Publication Info

Year: 1998
Type: review
Volume: 19
Issue: 6
Pages: 893-900
Citations: 231
Access: Closed

Cite This

John R. Yates (1998). Database searching using mass spectrometry data. Electrophoresis, 19(6), 893-900. https://doi.org/10.1002/elps.1150190604

Identifiers

DOI: 10.1002/elps.1150190604