Andrew Boyd co-I on project awarded $419,357 NIH R21 grant

MPIs: Yang Dai (Engineering) and Beatriz Peñalver Bernabé (Engineering and Medicine)

co-Is: Pauline Maki (Medicine), Sage Kim (Public Health), Andrew Boyd (Applied Health Sciences)

Funding agency: NICHD (R21)

Total awarded amount: $419,357

Project period: 09/2023-08/2025

Title: Integration of electronic medical records and neighborhood contextual indicators into machine learning strategies for identifying pregnant individuals at risk of depression in underserved communities

Abstract: The goal of this proposal is to optimize the use of computational methods, such as machine learning (ML) models, applied to electronic medical records (EMRs) to predict depression during pregnancy and the first year postpartum (perinatal depression, PND) in Minoritized Women of Color. Most ML models forecast postpartum depression (PPD) based on EMRs from middle-class Non-Hispanic White individuals. However, our results show that Non-Hispanic Black Women (NHBW) have higher rates of depression (23% versus the 12% US average), and that depression during early pregnancy in NHBW is far more common than PPD. Here, we propose to optimize the application of ML models to PND in three key ways. First, we will use bias-mitigation approaches to limit what is called model prediction performance bias, defined as the disparate model prediction outcome with respect to certain socio-demographic variables, such as race/ethnicity or age. Second, we will develop ML models that can offer interpretable outcomes and provide insights for clinical interventions. ML models are often “black boxes”, making it difficult to know the direction and magnitude of variables associated with the model outcome. Third, we will address the fact that current EMR-based ML models to predict PND rarely include community social determinants of health (SDoH). SDoH at both the individual level (e.g., racial minority, poverty) and the neighborhood level (e.g., violence, access to care) have been linked with increased risk of PND. NHBW are disproportionately affected by the negative health impacts of SDoH, including higher risk of PND and preterm birth. Despite their importance, SDoH have rarely been considered in assessing risk of PND using ML models, particularly among Minoritized Women of Color, who experience a disproportionate burden of social and economic hardship. This limits the model prediction performance in women who are exposed to higher contextual risks.
We hypothesize that interpretable ML models that are trained on sufficient numbers of EMRs from Minoritized Women of Color and that integrate neighborhood-level contextual factors (a proxy for community-level stressors) can substantially improve the prediction of PND in women at higher risk. We aim to establish a robust and interpretable ML framework that combines individual- and community-level SDoH to predict PND for Minoritized Women of Color, who have rarely been represented in data modeling. Our long-term vision is to integrate our interpretable ML model into routine clinical care for early detection, diagnosis, and treatment of PND. We will capitalize on large urban OB/GYN clinics (>70,000 patients) primarily serving Minoritized Women of Color (50% NHBW, 30% Latinas) living in the Chicago area. Neighborhood-level contextual information will be obtained from the US Census Bureau and the Chicago Health Atlas. In Aim 1, we will develop interpretable ML models to predict PND in at-risk women using EMRs. In Aim 2, we will also incorporate neighborhood-level SDoH factors into model building. Our innovative and interpretable prediction models, informed by EMR and neighborhood contextual data, could be leveraged in clinical care to more accurately identify women at greatest risk of PND and to inform preventive interventions.