Abstract
Background: Causal structure learning offers a promising approach to studying gene regulation in cells, aiming to provide deeper mechanistic insights than purely association-based methods. Theoretical groundwork indicating that interventions improve the identifiability of causal structure motivates the use of causal structure learning methods in scenarios with interventional information.
Results: This benchmark investigates the ability of existing causal structure learning algorithms to leverage the information revealed by targeted interventions to infer gene regulatory networks (GRNs). The study uses both synthetic and experimental single-cell CRISPR perturbation data and evaluates a suite of causal structure learning algorithms on metrics tailored to synthetic ground truth and to real biological data, respectively. On synthetic data, accurate recovery of GRNs is achieved under favourable conditions: strong interventions, large sample sizes, and low measurement noise. On real data, however, performance remains unreliable, limited by technical and biological noise as well as by algorithmic scalability. This highlights a current gap between the theoretical potential and the practical application of causal structure learning for GRNs.
Conclusions: The benchmark provides insight into the strengths and limitations of the algorithms and offers groundwork for further methodological development. We also provide an accessible software package for applying modern causal structure learning to custom datasets, thereby fostering future exploration of its potential.
Publication Info
- Year: 2025
- Type: article
- Citations: 0
- Access: Closed
Identifiers
- DOI: 10.64898/2025.12.05.692565