PCA Structured laSSO (PCASSO)
Presenter: Flash Talk Presenter
I am a second-year Biostatistics PhD student working under the guidance of Dr. Jarcy Zee. My primary research project, which is described in this presentation, is developing a novel scalar-on-matrix regression method to predict clinical outcomes from computer-extracted features of image-segmented objects. Concurrently, I am working on a second project: developing an algorithm to automatically identify potential functional forms of continuous predictors for machine learning models.
Nephrotic syndrome (NS) characterizes a group of rare diseases that can cause chronic kidney disease and kidney failure. Whole slide images (WSIs) of kidney biopsies from NS patients were collected through the Nephrotic Syndrome Study Network (NEPTUNE). These images offer greater insight into disease prognosis through pathomic feature extraction for biomarker discovery. Pathomic features are computer-generated quantitative measurements, calculated from segmented histological objects, that quantify the objects' heterogeneity. For each subject, we can construct a matrix in which each row corresponds to one segmented histological object from that subject's WSI and each column corresponds to one of a common set of pathomic features. We propose the principal component analysis (PCA) Structured laSSO (PCASSO), a novel scalar-on-matrix regression technique that allows for varying numbers of segmented histological objects across subjects, to predict scalar clinical outcomes from the pathomic feature matrices. Specifically, we consider the setting in which the number of segmented histological objects per subject is large relative to the number of pathomic features. Simulation study results indicate that PCASSO identifies the pathomic features that truly affect clinical outcomes more accurately than naive regression modelling strategies. The application of PCASSO for pathomic feature-based prediction of clinical outcomes contributes to the ultimate goal of personalized care for NS patients based on individual characteristics.
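To make the data structure concrete, here is a minimal sketch of the scalar-on-matrix setting using simulated data. It is not PCASSO itself, whose details are not given in this abstract. Instead it illustrates one possible naive strategy: collapse each subject's matrix to per-feature means and standard deviations, then fit an ordinary lasso. All names (`summarize`, `lasso_cd`), sizes, and the data-generating model are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_subjects = 50   # hypothetical number of subjects
n_features = 4    # hypothetical number of pathomic features (columns)

# One matrix per subject: rows are segmented objects, columns are features.
# The number of rows varies across subjects and is large relative to n_features.
object_counts = rng.integers(200, 500, size=n_subjects)
subject_shift = rng.normal(size=(n_subjects, n_features))
matrices = [
    shift + rng.normal(size=(m, n_features))
    for m, shift in zip(object_counts, subject_shift)
]

def summarize(matrix):
    """Collapse an (objects x features) matrix to per-feature means and SDs."""
    return np.concatenate([matrix.mean(axis=0), matrix.std(axis=0, ddof=1)])

# Fixed-length design matrix: one row per subject, 2 * n_features columns.
X = np.vstack([summarize(m) for m in matrices])

# Simulated scalar outcome driven only by the mean of feature 0 (assumption).
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=n_subjects)

def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent on standardized columns.

    Minimizes (1 / 2n) * ||y - Xs @ beta||^2 + lam * ||beta||_1,
    where Xs is X with centered, unit-variance columns and y is centered.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    n, p = Xs.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual that excludes the contribution of column j.
            r = yc - Xs @ beta + Xs[:, j] * beta[j]
            rho = Xs[:, j] @ r / n
            # Columns have unit variance, so the update needs no rescaling.
            beta[j] = soft_threshold(rho, lam)
    return beta

beta = lasso_cd(X, y, lam=0.3)
print(np.round(beta, 3))
```

Summarizing each matrix this way handles the varying number of objects per subject, but it discards any structure beyond the first two moments of each feature, which is one reason such an approach counts as naive.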
Keywords: Kidney disease, high-dimensional data analysis, image analysis, computational pathology, scalar-on-matrix regression