Abstract
In this study, we present a comprehensive evaluation framework for comparing combinations of artificial intelligence (AI) methods in the context of explainable AI (XAI) for variable selection in experimental biological and biomedical data. Our goal was to assess the efficiency, computational cost, and accuracy of different method combinations across six simulated scenarios, each replicated ten times. These scenarios span classification and regression complexities, including variance differences, bimodal distributions, eXclusive-OR (XOR) interactions, concentric circles, and nonlinear relationships such as parabolic and sinusoidal functions. We tested several machine learning algorithms, including Decision Trees (DT), Random Forests (RF), Support Vector Machines (SVM), and Multi-Layer Perceptrons (MLP), and combined these models with diverse feature-importance methods such as Gini importance, accuracy decrease, SHAP (SHapley Additive exPlanations) values, and Olden's method. We further applied significance-thresholding approaches, namely PIMP (Permutation IMPortance), mProbes, and simThresh, a novel method developed for this study. Additionally, we explored different dataset sizes to evaluate the scalability of these methods. Our analysis revealed substantial differences in computational demands, ranging from very rapid evaluations (e.g., DT combined with Gini importance and simThresh, averaging 0.15 seconds) to extensive computations (e.g., MLP combined with SHAP and PIMP, exceeding 7 hours). Among the tested combinations, RF/Accuracy/PIMP achieved the best overall performance, correctly identifying the relevant variables in 59 of 60 replicates in our benchmark study. However, its computational demands raise concerns about scalability to large-scale omics datasets in real-world settings. Decision Tree or Random Forest models combined with Gini importance and simThresh ranked second, with 50 of 60 detections. While less accurate, these methods require far fewer computational resources, making them promising candidates for scalable applications in omics data analysis. The proposed evaluation framework thus serves as a valuable tool for method selection, particularly when dealing with large-scale omics datasets where both computational resources and accuracy are critical considerations.
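To make the benchmarked pipeline concrete, the sketch below chains one combination from each layer of the framework described above: a simulated XOR classification scenario, a Random Forest with Gini importance, and a PIMP-style significance threshold obtained by refitting the model on permuted responses. This is a minimal illustration assuming scikit-learn, not the authors' code: the sample size, noise-feature count, and permutation count are arbitrary choices rather than the study's settings, and simThresh and mProbes are not reproduced here.

```python
# Hypothetical sketch of one benchmark combination:
# RF / Gini importance / PIMP-style permutation threshold.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Simulate an XOR scenario: two informative features whose interaction
# (not either marginal effect) determines the class, plus pure noise.
n = 500
X_inf = rng.normal(size=(n, 2))
y = ((X_inf[:, 0] > 0) ^ (X_inf[:, 1] > 0)).astype(int)
X = np.hstack([X_inf, rng.normal(size=(n, 8))])  # 8 uninformative features

# Observed Gini importances from a fitted Random Forest.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
observed = rf.feature_importances_

# PIMP-style null: refit on permuted responses many times; any importance
# a feature earns under permuted labels is attributable to chance alone.
n_perm = 50
null = np.empty((n_perm, X.shape[1]))
for b in range(n_perm):
    y_perm = rng.permutation(y)
    null[b] = RandomForestClassifier(
        n_estimators=200, random_state=b
    ).fit(X, y_perm).feature_importances_

# Empirical p-value per feature: fraction of null importances >= observed.
p_values = (null >= observed).mean(axis=0)
selected = np.where(p_values < 0.05)[0]
print("selected features:", selected)  # ideally [0, 1], the XOR pair
```

The sketch also hints at why the runtime differences reported above are so large: a permutation-based threshold such as PIMP refits the underlying model once per permutation, so its cost scales with both the model's training time and the number of permutations.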
Publication Info
- Year: 2025
- Type: article
- Citations: 0
- Access: Closed
Identifiers
- DOI: 10.64898/2025.12.06.692723