Support Vector Machine Classification of Microarray Data

Abstract

The Problem: Use the learning-from-examples paradigm to make class predictions from DNA microarray expression data and to infer which genes are involved in these predictions. Specifically, we use a Support Vector Machine (SVM) classifier [6] to predict cancer morphologies and treatment success and to determine the genes relevant to the inference.

Motivation:

Previous Work: A generic approach to classifying two types of acute leukemia was introduced in Golub et al. [3]. SVMs have been applied to this problem [5] and also to the problem of predicting functional roles of uncharacterized yeast ORFs [1].

Approach: We used an SVM classifier to discriminate between two types of leukemia. The output of a classical SVM is a class designation, ±1. In this particular application it is important to be able to reject points for which the classifier is not confident enough. We introduced a confidence interval on the output of the SVM that allows us to reject points with low confidence values. It is also important in this application to infer which genes are important for the classification. We have preliminary results for a feature selection algorithm for SVM classifiers. The SVM was trained on the 38 points in the training set and tested on the 34 points in the test set. Our results (see Table 2 and Figure 1) are the best reported so far for this dataset.

Table 2 (column headings): genes, rejects, errors, confidence level, |d|
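The reject option described in the abstract can be illustrated with a minimal sketch. It assumes a linear decision function d = w·x + b and a fixed threshold on |d|. The abstract does not say how a confidence level is mapped to a threshold on |d|, so the threshold here is a free parameter. The function names, decision values, and labels are all hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def decision_value(w: Sequence[float], x: Sequence[float], b: float) -> float:
    """Signed output d = w.x + b of a linear classifier (assumed form)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify_with_reject(d: float, threshold: float) -> int:
    """Return +1 or -1 when |d| >= threshold, and 0 (reject) otherwise."""
    if abs(d) < threshold:
        return 0
    # A value of exactly 0 that is not rejected falls into the -1 class.
    return 1 if d > 0 else -1


def tally(d_values: Sequence[float], labels: Sequence[int],
          threshold: float) -> Tuple[int, int]:
    """Count rejected points and errors among the accepted points."""
    rejects = 0
    errors = 0
    for d, y in zip(d_values, labels):
        pred = classify_with_reject(d, threshold)
        if pred == 0:
            rejects += 1
        elif pred != y:
            errors += 1
    return rejects, errors


# Hypothetical decision values and true labels, for illustration only.
d_values = [1.8, -0.2, 0.05, -2.1, 0.6]
labels = [1, 1, -1, -1, 1]

print(tally(d_values, labels, threshold=0.0))  # no points rejected
print(tally(d_values, labels, threshold=0.5))  # low-|d| points rejected
```

Raising the threshold trades errors for rejects: in this toy data the two misclassified points both have small |d|, so rejecting them removes both errors at the cost of two unclassified points.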

Keywords

Support vector machine, Artificial intelligence, Classifier (UML), Machine learning, Computer science, Inference, Pattern recognition (psychology), Feature selection, Margin classifier, Structured support vector machine, Gene selection, Data mining, Microarray analysis techniques, Gene, Gene expression, Biology

Related Publications

Multiclass cancer diagnosis using tumor gene expression signatures

Sridhar Ramaswamy, Pablo Tamayo, Ryan Rifkin +12 more

The optimal treatment of patients with cancer depends on establishing accurate diagnoses by using a complex combination of clinical and histopathological data. In some instances...

2001 Proceedings of the National Academy o... 2033 citations

Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy

Hanchuan Peng, Fuhui Long, Chen Ding

Feature selection is an important problem for pattern classification systems. We study how to select good features according to the maximal statistical dependency criterion base...

2005 IEEE Transactions on Pattern Analysis... 10050 citations

Kernel Logistic Regression and the Import Vector Machine

Ji Zhu, Trevor Hastie

The support vector machine (SVM) is known for its good performance in two-class classification, but its extension to multiclass classification is still an ongoing research issue...

2001 136 citations

Fast Training of Support Vector Machines Using Sequential Minimal Optimization

John Platt

This chapter describes a new algorithm for training Support Vector Machines: Sequential Minimal Optimization, or SMO. Training a Support Vector Machine (SVM) requires the soluti...

1998 The MIT Press eBooks 5457 citations

Object Detection with Discriminatively Trained Part-Based Models

Pedro F. Felzenszwalb, Ross Girshick, David McAllester +1 more

We describe an object detection system based on mixtures of multiscale deformable part models. Our system is able to represent highly variable object classes and achieves state-...

2009 IEEE Transactions on Pattern Analysis... 9911 citations

Publication Info

Year: 2001
Type: article
Citations: 181
Access: Closed

Cite This

APA Style

Sayan Mukherjee, Ryan Rifkin (2001). Support Vector Machine Classification of Microarray Data.