Abstract

A novel technique for similarity searching is introduced. Molecules are represented by atom environments, which are fed into an information-gain-based feature selection. A naïve Bayesian classifier is then employed for compound classification. The new method is tested by its ability to retrieve five sets of active molecules seeded in the MDL Drug Data Report (MDDR). In comparison experiments, the algorithm outperforms all current retrieval methods assessed here using two- and three-dimensional descriptors and offers insight into the significance of structural components for binding.
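The pipeline described in the abstract (feature selection by information gain, followed by naïve Bayesian scoring) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each molecule is already reduced to a set of opaque feature strings standing in for atom environments. The function names, the Laplace smoothing, the choice to sum log-odds over present features only, and the toy data are all illustrative assumptions.

```python
import math


def entropy(pos, neg):
    """Shannon entropy (bits) of a two-class split with the given counts."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(feature, molecules, labels):
    """Reduction in class entropy from knowing whether `feature` is present."""
    n = len(molecules)
    n_pos = sum(labels)
    gain = entropy(n_pos, n - n_pos)
    with_f = [y for m, y in zip(molecules, labels) if feature in m]
    without_f = [y for m, y in zip(molecules, labels) if feature not in m]
    for subset in (with_f, without_f):
        if subset:
            pos = sum(subset)
            gain -= len(subset) / n * entropy(pos, len(subset) - pos)
    return gain


def select_features(molecules, labels, k):
    """Keep the k features with the highest information gain."""
    all_features = sorted(set().union(*molecules))
    ranked = sorted(
        all_features,
        key=lambda f: information_gain(f, molecules, labels),
        reverse=True,
    )
    return ranked[:k]


def train(molecules, labels, features):
    """Per-feature log-odds weights; Laplace smoothing is an assumption here."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    weights = {}
    for f in features:
        in_pos = sum(1 for m, y in zip(molecules, labels) if y and f in m)
        in_neg = sum(1 for m, y in zip(molecules, labels) if not y and f in m)
        p_pos = (in_pos + 1) / (n_pos + 2)
        p_neg = (in_neg + 1) / (n_neg + 2)
        weights[f] = math.log(p_pos / p_neg)
    return weights


def score(molecule, weights):
    """Sum the log-odds of selected features present in the molecule."""
    return sum(w for f, w in weights.items() if f in molecule)


# Hypothetical toy data: letters stand in for atom-environment features.
actives = [{"a", "b", "c"}, {"a", "b", "d"}]
inactives = [{"c", "d", "e"}, {"d", "e", "f"}]
molecules = actives + inactives
labels = [1, 1, 0, 0]

selected = select_features(molecules, labels, k=3)
weights = train(molecules, labels, selected)
ranking = sorted([{"a", "b", "x"}, {"e", "f"}],
                 key=lambda m: score(m, weights), reverse=True)
```

In this sketch, a database would be ranked by `score`, with higher values indicating molecules that share more features enriched among the known actives. The sign and size of each weight also give a rough indication of which features favor or disfavor activity.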

Keywords

Feature selection, Pattern recognition (psychology), Classifier (UML), Artificial intelligence, Computer science, Bayesian probability, Naive Bayes classifier, Similarity (geometry), Training set, Data mining, Machine learning, Support vector machine

Publication Info

Year: 2003
Type: article
Volume: 44
Issue: 1
Pages: 170-178
Citations: 246
Access: Closed

Cite This

Andreas Bender, Hamse Y. Mussa, Robert C. Glen et al. (2003). Molecular Similarity Searching Using Atom Environments, Information-Based Feature Selection, and a Naïve Bayesian Classifier. Journal of Chemical Information and Computer Sciences, 44(1), 170-178. https://doi.org/10.1021/ci034207y

Identifiers

DOI: 10.1021/ci034207y