Abstract
A novel technique for similarity searching is introduced. Molecules are represented by atom environments, which are fed into an information-gain-based feature selection. A naïve Bayesian classifier is then employed for compound classification. The new method is tested by its ability to retrieve five sets of active molecules seeded in the MDL Drug Data Report (MDDR). In comparison experiments, the algorithm outperforms all current retrieval methods assessed here using two- and three-dimensional descriptors and offers insight into the significance of structural components for binding.
Keywords
Affiliated Institutions
Related Publications
Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy
Feature selection is an important problem for pattern classification systems. We study how to select good features according to the maximal statistical dependency criterion base...
Statistical pattern recognition: a review
The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated...
Publication Info
- Year
- 2003
- Type
- article
- Volume
- 44
- Issue
- 1
- Pages
- 170-178
- Citations
- 246
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1021/ci034207y