An Approach for Estimating and Evaluating Optimal Treatment Decision Rules when Data are Missing at Random
Jenny Shen is a third-year Biostatistics PhD student at the University of Pennsylvania advised by Dr. Kristin Linn and Dr. Rebecca Hubbard. Her research interests include statistical methods for personalized medicine, clinical trials, and high-dimensional data.
Individual patients, care providers, and other stakeholders can benefit from the development and implementation of data-driven optimal treatment strategies. Optimal treatment regimes can improve outcomes by maximizing a population-level distributional summary such as the expected value of a clinical outcome. Guidance for estimating optimal decision rules in the presence of missing data is fairly limited, as the majority of existing methods require complete, fully observed data. We propose to combine multiple imputation with model averaging to estimate and evaluate optimal decision rules from data that are missing at random. We use simulations to study the performance of our proposed framework. To illustrate our methods, we perform a secondary analysis of data from the Social Incentives to Encourage Physical Activity and Understand Predictors (STEP UP) trial. The STEP UP trial was a randomized trial comparing a control to three active interventions that aimed to increase daily step counts among employees at a large professional services company. Due to missingness in the longitudinal step count outcome, the main trial analysis applied multiple imputation to estimate the average treatment effect of each intervention. In our secondary analysis, we estimate and evaluate an optimal decision rule from multiply imputed data that aims to maximize the average step count over a twelve-week period. Using our analysis of STEP UP data as an example, we provide guidance on reproducible inference for optimal decision rules estimated from data with missingness.
Keywords: Individualized Treatment Regime, Decision Rule, Multiply Imputed Data
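The workflow described in the abstract (impute, estimate a rule on each imputed data set, combine, then evaluate) can be sketched roughly as follows. This is a minimal illustration on simulated data, not the authors' implementation. The data-generating model, the bootstrap-based stochastic regression imputation, the choice to combine rules by averaging regression coefficients across imputations, and the inverse-probability-weighted Value estimator are all assumptions made for the example, as are all function and variable names.

```python
import numpy as np

rng = np.random.default_rng(0)

def design(x, a):
    """Design matrix for y ~ 1 + x + a + a*x (a linear-in-predictors rule class)."""
    return np.column_stack([np.ones_like(x), x, a, a * x])

def fit_ols(x, a, y):
    """Least-squares coefficients for the outcome model."""
    beta, *_ = np.linalg.lstsq(design(x, a), y, rcond=None)
    return beta

def impute_once(x, a, y, rng):
    """One stochastic regression imputation of missing outcomes.

    Assumption: a bootstrap of the complete cases stands in for drawing the
    imputation-model parameters, as a rough approximation to proper imputation.
    """
    obs = ~np.isnan(y)
    idx = rng.choice(np.flatnonzero(obs), size=obs.sum(), replace=True)
    beta = fit_ols(x[idx], a[idx], y[idx])
    sigma = (y[idx] - design(x[idx], a[idx]) @ beta).std(ddof=4)
    y_imp = y.copy()
    miss = ~obs
    y_imp[miss] = design(x[miss], a[miss]) @ beta + rng.normal(0, sigma, miss.sum())
    return y_imp

def rule(beta, x):
    """Assign arm 1 when the estimated treatment contrast beta_a + beta_ax * x is positive."""
    return (beta[2] + beta[3] * x > 0).astype(int)

def ipw_value(a, y, d, prop=0.5):
    """Inverse-probability-weighted Value of rule d under known randomization."""
    p = np.where(a == 1, prop, 1 - prop)
    return np.mean(y * (a == d) / p)

# Simulated two-arm randomized trial (hypothetical; not STEP UP data).
n = 2000
x = rng.normal(size=n)                      # baseline covariate
a = rng.integers(0, 2, size=n)              # randomized arm, P(a = 1) = 0.5
y = 1.0 + x + a * (2.0 * x) + rng.normal(size=n)  # true best rule: a = 1 iff x > 0

# Missing at random: the missingness probability depends only on observed x.
p_miss = 1.0 / (1.0 + np.exp(-(x - 1.0)))
y_obs = np.where(rng.random(n) < p_miss, np.nan, y)

# Multiple imputation. Simplification: the imputation model uses all rows,
# including those later held out for evaluation.
M = 20
imputed = [impute_once(x, a, y_obs, rng) for _ in range(M)]

# Estimate the rule on the training half of each imputed data set,
# then average the coefficients across imputations.
train_mask = np.arange(n) < n // 2
test_mask = ~train_mask
betas = np.array([fit_ols(x[train_mask], a[train_mask], yi[train_mask]) for yi in imputed])
beta_avg = betas.mean(axis=0)

# Evaluate the averaged rule and the two static rules on the held-out half,
# averaging the Value estimates over the imputed data sets.
d_test = rule(beta_avg, x[test_mask])
all_one = np.ones(test_mask.sum(), dtype=int)
all_zero = np.zeros(test_mask.sum(), dtype=int)

def pooled_value(d):
    return np.mean([ipw_value(a[test_mask], yi[test_mask], d) for yi in imputed])

value_tailored = pooled_value(d_test)
value_all_one = pooled_value(all_one)
value_all_zero = pooled_value(all_zero)

print("averaged coefficients:", np.round(beta_avg, 2))
print("Value (tailored, all arm 1, all arm 0):",
      round(value_tailored, 2), round(value_all_one, 2), round(value_all_zero, 2))
```

The sketch reports only pooled point estimates. Variance estimation for the Value (for example, combining within- and between-imputation variability) is deliberately left out.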
Can you provide some context for the Value estimates in Figure 3? Is it the expected step count under a particular decision rule? What does it mean when the Value under the "optimal" rule is less than under the competitive rule?
The Value estimates for the optimal rule were computed using a decision rule based on a selected set of predictors and a binary treatment (competitive or collaborative, the most relevant arms based on previous analyses of the STEP UP trial). The Value estimates for "Competitive" and "Collaborative" are the expected step counts if we were to assign everyone to the competitive arm or everyone to the collaborative arm, respectively. Individuals assigned to the competitive arm in the STEP UP trial had the greatest number of steps, which is reflected in the high expected step count for "Competitive." One of the variables we considered in estimating our decision rule was baseline step count, so it's likely that this variable (and perhaps others) is driving the lower Value estimates under the optimal rule, which is something we're exploring. Furthermore, since we restricted our class of possible decision rules to be linear in the predictors, the true optimal rule may be more complicated and poorly approximated by a linear rule. These results may also suggest that the features available to us for tailoring do not capture important signal about differential response to the treatments. In that case, the estimated rule would overfit to noise in the training data and generalize poorly to the test data, leading to suboptimal performance of the tailored rule compared with simply assigning everyone to the competitive arm.
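The quantities discussed in this answer can be written compactly in standard treatment-regime notation. The symbols below are assumptions introduced for illustration and are not taken from the poster: X denotes the tailoring variables, Y^*(a) the potential step-count outcome under arm a, d a decision rule mapping X to an arm, and \hat{d} the rule estimated from training data.

```latex
% Value of a decision rule d, and of the static rule "assign everyone to competitive"
V(d) = \mathbb{E}\left[ Y^{*}\{ d(X) \} \right],
\qquad
V(\text{competitive}) = \mathbb{E}\left[ Y^{*}(\text{competitive}) \right].

% The optimal rule within a restricted class \mathcal{D} (e.g., rules linear in X)
d^{\mathrm{opt}} = \arg\max_{d \in \mathcal{D}} V(d)
\quad \Longrightarrow \quad
V(d^{\mathrm{opt}}) \ge V(d) \ \ \text{for every } d \in \mathcal{D}.

% An estimated rule carries no such guarantee
V(\hat{d}) \le V(d^{\mathrm{opt}}),
\qquad
\text{and } V(\hat{d}) < V(\text{competitive}) \text{ is possible.}
```

If the class \mathcal{D} contains the static rules, the true optimal rule can never have lower Value than assigning everyone to one arm. An estimated "optimal" rule can fall below a static rule only through estimation error, for instance when the rule fits noise in the training data.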