Predicting Who’s at Risk: Electronic Health Records and a Polygenic Approach May Be Key

April 2020

Why do some people carry a disease like COVID-19 and remain asymptomatic, while others who have no underlying conditions may die from it? Genetics may be essential to understanding these differences, and research on genetic variations that make us susceptible to disease and influence how our bodies handle it has grown over the past two decades. But most of the statistically significant genetic variants we have found via genome-wide association studies (GWAS), the most-used tool, have only small effects on disease risk. Indeed, many diseases are associated with hundreds or thousands of genetic variants. The polygenic risk score (PRS)—an aggregate measure of many genetic variants, weighted by their individual effects on a given phenotype—presents great opportunities to develop genetic insights we can apply to patients in the clinical setting, especially given that electronic health records (EHRs) are increasingly being linked to patient genetic data in biobanks. But we still don’t have full answers as to whether and how EHR data can be used to predict risk.
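To make the definition above concrete, here is a minimal sketch of how a PRS is commonly computed for one person: a weighted sum of risk-allele counts across variants. The function name, the effect sizes, and the genotype values below are hypothetical illustrations and are not taken from the paper; real scores draw their weights from GWAS summary statistics and typically involve many more variants plus quality-control steps not shown here.

```python
def polygenic_risk_score(dosages, weights):
    """Return the weighted sum of risk-allele counts for one person.

    dosages: number of risk alleles carried at each variant (0, 1, or 2).
    weights: per-variant effect sizes, e.g. as estimated by a GWAS.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same variants")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical effect sizes for three variants (illustrative values only).
weights = [0.10, 0.05, -0.02]

# Hypothetical genotypes: risk-allele counts at the same three variants.
person_a = [2, 1, 0]
person_b = [0, 0, 2]

score_a = polygenic_risk_score(person_a, weights)
score_b = polygenic_risk_score(person_b, weights)

# A higher score means more weighted risk alleles under this toy model.
print(f"Person A: {score_a:.2f}, Person B: {score_b:.2f}")
```

In this toy setup each variant contributes only a little, which mirrors the point that individual GWAS variants have small effects and the score becomes informative only in aggregate.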

A new paper in Nature Reviews Genetics lays out a path for researchers to follow as they seek to capitalize on the potential of PRS to predict risk, using data on genotype (an individual’s collection of genes) and phenotype (their observable characteristics) from biobank-linked EHRs. “This is an exciting time, because we can measure variability in the human genome, which encompasses all genes and their inter-relationships, and model the disease risk associated with those variations,” said senior author Jason Moore, PhD. “Once coronavirus risk in human populations has been assessed, a PRS could be used to identify which individuals are at highest risk of infection, and which infected people are most likely to be hospitalized or die. With access to clinical data from EHRs and genetic data from biobanks, it will be possible to deliver these results to the patients and their caregivers.”

The opportunities at Penn are especially promising, given the growing resources of the Penn Medicine BioBank (PMBB), which houses many annotated blood and tissue samples. “The PMBB was established precisely for this reason,” said co-author Marylyn Ritchie. “Our goal is to measure genomic variability in our patients and link those genetic testing results to their clinical records stored in the EHR. This will not only advance cutting-edge genetics research at Penn Medicine, but will also make it easier to deliver those results.” Further, in parallel with the PMBB effort, Penn researchers have created new software within PennChart, the Penn Medicine EHR, specifically to deliver genetic testing results to clinicians so that they can make more informed decisions about patient care. “We’ve invested heavily over the past 10 years in PennChart, PMBB, and numerous other information technologies that make these exciting studies possible,” said Michael Restuccia, Penn Medicine’s senior vice president and chief information officer (CIO). “It’s very rewarding to see all of our hard work and planning pay off to advance key research discoveries that improve the health of our patients.”

The authors acknowledge that population-based studies have also yielded insights via PRS, but note that these studies are expensive and complicated to run. The EHR, they add, makes data more readily available and provides a wider array of phenotypes that embrace all diagnoses, tests and images for each patient. The EHR also includes some environmental variables: Does the patient smoke, for instance? Researchers can derive some additional environmental data from the patient’s home address, and the authors suggest that linking to census data would enable more, including the patient’s socio-economic status and the quality of the air they breathe.

Next steps, they say, include validating PRS in independent data sets. “We are at a good place to have an impact here at Penn Medicine, but we still have more work to do,” said Ruowang Li, PhD, the study’s lead author. “An important challenge we face is how to share data and research results with other medical centers around the country, to replicate our findings and to make new discoveries. The good news is that we have a team of biomedical informaticians and biostatisticians in our Department of Biostatistics, Epidemiology and Informatics (DBEI), including our co-author Yong Chen, who are working on computer algorithms and statistical methods for combining results across institutions, while preserving the privacy of our patients.” A PRS could also be used against the patient, the authors write, and regulation is needed. They add that it will be important to consider how ethnic disparities in PRS may limit how broadly it can be generalized.

They also seek to enlist clinicians who can evaluate PRS in the healthcare setting: When does the PRS contradict the clinician’s diagnosis, and why? The process will be dynamic, said Dr. Moore, and it illustrates why this is an exciting time to be a researcher at Penn Medicine. “My goal for the next 10 years is to automate much of the EHR-based discovery process, so that our investigators can greatly accelerate moving research results into the clinic,” he commented. “We’re all part of a learning health system that is continuously improving the health of our patients and communities.”

Read the article in Nature Reviews Genetics.

