Towards Trustworthy Machine Learning: A Causal Lens on Learning Non-Spuriousness
Hengrui Cai, PhD
Assistant Professor of Statistics
Donald Bren School of Information
and Computer Sciences | UCI
WHEN: Wednesday, January 8, 2025, from 3:30 to 4:30 p.m.
WHERE: Hybrid | 2001 McGill College Avenue, Room 1140
NOTE: Hengrui Cai will be presenting virtually
Abstract
The causal revolution has spurred interest in understanding complex relationships across various fields. Most existing methods aim to discover causal relationships among variables within a complex, large-scale system. In practice, however, only a small number of variables are relevant to the outcomes of interest. As a result, causal estimation using the full causal representation, especially with limited data, can lead to many falsely discovered, spurious variables that are highly correlated with the target outcome but have no causal impact on it. We propose learning a class of necessary and sufficient causal graphs that contain only causally relevant variables, utilizing probabilities of causation. The framework is further extended to natural language processing models to open up the 'black box' by identifying true rationales when two or more snippets are highly inter-correlated and thus contribute similarly to prediction accuracy. We leverage two causal desiderata, non-spuriousness and efficiency, and establish their theoretical identification as the main component of learning necessary and sufficient rationales in language models. The superior performance of our proposed methods is demonstrated through extensive experiments on real-world review and medical datasets.
Speaker bio
Dr. Hengrui Cai is an Assistant Professor in the Department of Statistics in the Donald Bren School of Information and Computer Sciences at the University of California, Irvine. She obtained her Ph.D. degree in Statistics at North Carolina State University in 2022. Cai has broad research interests in methodology and theory in causal inference, reinforcement learning, and graphical modeling, to establish reliable, powerful, and interpretable solutions to real-world problems. Currently, her research focuses on causal inference and causal structure learning, natural language processing and explainable deep learning, and policy optimization and evaluation in reinforcement learning and bandits, with applications in precision medicine. Her work has been published in journals including the Journal of the American Statistical Association, Journal of Machine Learning Research, and Statistics in Medicine, as well as conferences including NeurIPS, ICML, and ICLR. Please visit her personal website: .