DBEI

Daniel Baer

A Bayesian Linear Mixed Model for Bi-Level Feature Selection

Presenter

baerd@upenn.edu

Daniel Baer is a postdoctoral fellow in Dr. Sharon Xie's lab. His research interests include developing models for longitudinal data analysis, feature selection, and measurement error, motivated by complexities that arise in the study of neurodegenerative diseases.

Authors

D Baer¹, A Lawson², Y Park³, S Xie¹, A Benitez⁴

¹ University of Pennsylvania, Department of Biostatistics, Epidemiology and Informatics
² Medical University of South Carolina, Department of Public Health Sciences
³ University of Wisconsin–Madison, Department of Biostatistics & Medical Informatics
⁴ Medical University of South Carolina, Department of Neurology

Abstract

Alzheimer’s disease (AD) is a neurodegenerative disease with increasing prevalence in the United States. There is currently no cure for AD, so there is interest in characterizing AD in order to develop disease-modifying therapies. Characterization of AD can be facilitated by Bayesian feature selection models, which allow us to identify (possibly high-dimensional) patient feature data that are associated with longitudinal AD outcome data (e.g., cognitive scores over time). However, current Bayesian feature selection models are limited by salient complexities arising in the study of longitudinal AD outcome data. In particular, there are no Bayesian feature selection models that can simultaneously account for irregularly spaced longitudinal outcome data, account for feature data group structure, and specify time-varying feature parameters. Accounting for these complexities can lead to a feature selection model with superior performance. We therefore developed a Bayesian linear mixed model for feature selection that addresses these complexities. We applied our novel approach to analyze longitudinal cognitive scores in Alzheimer's Disease Neuroimaging Initiative (ADNI) participants with multimodal feature data, including neuroimaging, cerebrospinal fluid biomarkers, genetic markers, neurological diagnoses, and demographics. We found that our model identifies a parsimonious subset of patient feature data associated with rate of cognitive decline and, by accounting for these complexities, provides improved precision of feature parameter estimates. Our model therefore represents an effective tool that researchers can use to perform feature selection given the complexities arising in the longitudinal study of AD.
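
For readers unfamiliar with this model class, the following is a generic sketch of how a linear mixed model with bi-level (group and within-group) selection and time-varying coefficients is often written. It is not taken from the poster: every symbol here (the indicators γ_g and δ_gk, the coefficient functions θ_gk, the exponential correlation with range φ) is an assumption chosen for illustration, and the authors' actual specification may differ.

```latex
% Outcome j of participant i, observed at an irregular time t_{ij};
% features are arranged in G groups, with K_g features in group g.
y_{ij} = \sum_{g=1}^{G} \sum_{k=1}^{K_g} x_{igk}\, \beta_{gk}(t_{ij}) + b_i + \varepsilon_{ij},
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2)

% Bi-level selection: a group indicator times a within-group indicator.
\beta_{gk}(t) = \gamma_g\, \delta_{gk}\, \theta_{gk}(t),
\qquad \gamma_g \in \{0, 1\}, \quad \delta_{gk} \in \{0, 1\}

% Residual correlation that depends on the actual time separation.
\operatorname{Corr}(\varepsilon_{ij}, \varepsilon_{ik}) = \exp\!\left( -\frac{|t_{ij} - t_{ik}|}{\phi} \right)
```

Under this illustrative form, a feature contributes only when both its group indicator and its own indicator equal 1, which is what makes the selection operate at two levels.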

Keywords

Feature selection, longitudinal data analysis, Bayesian model, Alzheimer's disease.

Comments

Daniel, thank you for the presentation. Nice work!
Questions: In your comparison model that does not account for irregularly spaced outcomes, does it force a fixed time or is there loss of data? How do the features your model identified compare to the literature? Were there new features?

Posted by Knashawn Morales on Wed, 04/27/2022 - 10:45am

Hi Knashawn,

Thank you for your thoughtful questions.

The competing model discretizes the continuous measurement times associated with the longitudinal outcome data, so there is definitely a loss of data. This is salient because the correlation between longitudinal outcome measurements is a function of their time separation. Therefore, a feature selection model that can account for irregularly spaced longitudinal outcome data is advantageous.
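
To make the point about time separation concrete, here is a minimal sketch in Python. It assumes an exponential correlation function and rounding to the nearest whole year as the discretization rule. Both are illustrative assumptions, not necessarily what either model in the poster uses, and the visit times are made up.

```python
import numpy as np


def exponential_correlation(times, range_param=1.0):
    """Correlation matrix with entries exp(-|t_j - t_k| / range_param).

    The exponential form and range_param are illustrative assumptions.
    """
    times = np.asarray(times, dtype=float)
    gaps = np.abs(times[:, None] - times[None, :])  # pairwise time separations
    return np.exp(-gaps / range_param)


# Hypothetical, irregularly spaced visit times (in years) for one participant.
visit_times = np.array([0.0, 0.4, 1.1, 1.3, 2.6])

# Assumed discretization: snap each visit to the nearest whole year.
binned_times = np.round(visit_times)

exact_corr = exponential_correlation(visit_times)
binned_corr = exponential_correlation(binned_times)

print("Correlation using actual visit times:")
print(np.round(exact_corr, 3))
print("Correlation after discretizing visit times:")
print(np.round(binned_corr, 3))
```

In this toy example the visits at 0.0 and 0.4 years fall into the same bin, so the discretized version treats them as perfectly correlated, whereas the actual separation implies a correlation of exp(-0.4), roughly 0.67.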

The selected features from our analysis of the ADNI data were consistent with the AD literature. For instance, hippocampus volume and CSF tau were selected as the features most associated with longitudinal measures of AD risk.

No new features were selected when modeling the ADNI data. However, we found that our model was advantageous in providing improved precision for the selected feature parameter estimates.

Posted by Daniel Baer on Wed, 04/27/2022 - 11:53am

About Us

To understand health and disease today, we need new thinking and novel science, the kind we create when multiple disciplines work together from the ground up. That is why this department has put forward a bold vision in population-health science: a single academic home for biostatistics, epidemiology and informatics.
