Evaluating the impact of gender on performance of automated population identification from electronic health records

2019 Santiago

Levinson R¹, Malinowski J², Bielinski S³, Rasmussen L⁴, Wiley L⁵

¹Vanderbilt University Medical Center

²Write Inscite, LLC

³Mayo Clinic

⁴Northwestern University

⁵University of Colorado Anschutz Medical Campus

Background: an important challenge in using electronic health records (EHRs) for population health research and clinical evidence generation is the identification of the patient population in which to conduct a study. To address this, methods of automated population identification (computational phenotyping, natural language processing, etc.) are often used to identify patient groups. The performance of these methods is evaluated by statistical comparison to manual review of patient records. It is well known that women have different patterns of healthcare utilization and can present with different disease symptomatology than men, but the extent to which these differences are represented in EHR data is unclear. It therefore also remains unknown whether automated population identification methods propagate the effects of gender biases in clinical care.

Objectives: the objective of this systematic review is to determine the extent to which investigators consider the effect of gender in automated population identification methods, and to assess its potential impact on algorithm performance within each gender.

Methods: we will evaluate the presence and quality of population descriptions in EHR-based, computational approaches to patient identification, specifically those related to disease, disease subtypes, and disease symptoms. Outcomes of interest include characteristics of the patient populations used for algorithm development and validation. Specifically, we will investigate the proportion of studies that evaluate how accurately their algorithm captures the disease phenotype of interest in women. We will further evaluate the impact of the computational method used on algorithm performance. This evidence review will identify potential gender biases in automated population identification, allowing future EHR-based studies to generate improved clinical evidence for women.
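
As an illustration of what evaluating algorithm performance within each gender could look like, the following minimal Python sketch compares algorithm output with manual review separately for each gender group. It is not the method of any reviewed study: the function name, the record layout, the choice of metrics (sensitivity, specificity, and positive predictive value), and the toy data are all assumptions made for this example.

```python
from collections import defaultdict


def stratified_performance(records):
    """Compute per-gender performance of a phenotyping algorithm.

    `records` is an iterable of (gender, algorithm_label, manual_review_label)
    tuples with boolean labels. Manual review is treated as the reference
    standard. This layout is hypothetical, chosen only for illustration.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for gender, predicted, reference in records:
        if predicted and reference:
            key = "tp"  # algorithm and manual review both say "case"
        elif predicted and not reference:
            key = "fp"  # algorithm says "case", manual review disagrees
        elif reference:
            key = "fn"  # algorithm missed a manually confirmed case
        else:
            key = "tn"  # both say "not a case"
        counts[gender][key] += 1

    def ratio(numerator, denominator):
        # Return None when a metric is undefined for a group.
        return numerator / denominator if denominator else None

    return {
        gender: {
            "n": sum(c.values()),
            "sensitivity": ratio(c["tp"], c["tp"] + c["fn"]),
            "specificity": ratio(c["tn"], c["tn"] + c["fp"]),
            "ppv": ratio(c["tp"], c["tp"] + c["fp"]),
        }
        for gender, c in counts.items()
    }


# Fabricated toy data for illustration only; not drawn from any study.
toy_records = [
    ("female", True, True),
    ("female", False, True),
    ("female", False, True),
    ("female", False, False),
    ("male", True, True),
    ("male", True, True),
    ("male", False, True),
    ("male", True, False),
]

results = stratified_performance(toy_records)
for gender, metrics in sorted(results.items()):
    print(gender, metrics)
```

Reporting metrics per group rather than pooled is the point of the sketch: a single overall sensitivity can hide a gap between groups, which is the kind of omission this review looks for in published validations.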

Patient or healthcare consumer involvement: patients were not involved in this study.