Michael Harhay

A Bayesian Machine Learning Approach for Estimating Heterogeneous Survivor Causal Effects

Michael Harhay, Epidemiology

Michael Harhay, PhD, MPH is an Assistant Professor of Epidemiology and Medicine at the University of Pennsylvania. He is also a core faculty member in Penn’s Palliative and Advanced Illness Research (PAIR) Center, where he directs the PAIR Center Clinical Trials Methods and Outcomes Lab.


M Harhay (1), F Li (2), G Tong (2), E Chen (3)

  1. Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania
  2. Department of Biostatistics, Yale School of Public Health
  3. Department of Mathematics and Statistics, Mississippi State University


Assessing heterogeneity in treatment effects has become an increasingly active topic in causal inference and carries important implications for clinical decision-making. While an extensive literature exists on treatment effect heterogeneity when outcomes are fully observed, few tools have been developed for estimating heterogeneous causal effects when patient-centered outcomes are truncated by a terminal event, such as death. Because some patients die during study follow-up, the outcomes of interest are unobserved and undefined for some subgroups of patients, and drawing valid causal conclusions requires the principal stratification framework. Motivated by the Acute Respiratory Distress Syndrome Network ARMA trial, we developed a flexible Bayesian semiparametric approach to estimate the average causal effect and heterogeneous causal effects in the always-survivor stratum (patients who would survive under either treatment assignment) when clinical outcomes are subject to truncation. Under the proposed approach, we adopted Bayesian additive regression trees (BART) to flexibly specify separate models for the potential outcomes and for latent stratum membership. In the analysis of the ARMA trial, we found that the low tidal volume treatment had an overall benefit for patients with acute lung injury, but that treatment effects varied substantially among the always-survivors. We also demonstrated through a simulation study that the proposed Bayesian semiparametric approach outperforms parametric alternatives in reducing estimation bias, both for the average causal effect and for heterogeneous causal effects for individual patients identified as always-survivors.
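The truncation-by-death problem described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the BART-based model from the poster: the covariate `x`, the logistic survival probabilities, the outcome model, and the function name `simulate_truncation_by_death` are all hypothetical choices made for illustration. The sketch assumes monotonicity (treatment never causes a death) and compares the true average effect among always-survivors with the naive difference in mean outcomes among observed survivors in each arm.

```python
import numpy as np


def simulate_truncation_by_death(n=200_000, seed=0):
    """Toy simulation: naive survivor comparison vs. always-survivor effect.

    All distributions and coefficients are hypothetical illustration values.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical baseline severity covariate (higher = sicker).
    x = rng.normal(size=n)

    # Potential survival under control (s0) and treatment (s1).
    # A shared uniform draw with p1 > p0 enforces monotonicity: s1 >= s0.
    u = rng.uniform(size=n)
    p0 = 1.0 / (1.0 + np.exp(-(0.0 - x)))
    p1 = 1.0 / (1.0 + np.exp(-(1.0 - x)))
    s0 = u < p0
    s1 = u < p1

    # Potential outcomes with a covariate-dependent (heterogeneous) effect.
    y0 = 10.0 - 2.0 * x + rng.normal(size=n)
    y1 = y0 + 1.0 + 0.5 * x

    # Always-survivors: would survive under either assignment.
    always = s0 & s1
    true_sace = (y1[always] - y0[always]).mean()

    # Randomized assignment; outcomes are only defined for observed survivors.
    z = rng.integers(0, 2, size=n).astype(bool)
    naive = y1[z & s1].mean() - y0[~z & s0].mean()

    return {
        "true_sace": true_sace,
        "naive_survivor_diff": naive,
        "share_always_survivors": always.mean(),
    }


if __name__ == "__main__":
    for name, value in simulate_truncation_by_death().items():
        print(f"{name}: {value:.3f}")
```

In this toy setup the treated survivors include sicker patients who would have died under control, so the two survivor groups are not comparable and the naive difference is a biased estimate of the always-survivor effect. Principal stratification targets the always-survivor effect directly; in real data stratum membership is latent and has to be modeled, which is the role the abstract assigns to the stratum-membership model.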


Bayesian additive regression trees, causal inference, heterogeneity of treatment effects, principal stratification, truncation by death

© 2023 Trustees of the University of Pennsylvania. All rights reserved.