Screening for depression with the Patient Health Questionnaire-2 (PHQ-2) alone and in combination with the PHQ-9: individual participant data meta-analysis

2020 Abstracts

Levis B¹, Sun Y¹, He C¹, Wu Y¹, Krishnan A¹, Bhandari PM¹, Neupane D¹, Imran M¹, Brehaut E¹, Negeri ZF¹, Fischer FH², Benedetti A¹, Thombs BD¹

¹McGill University

²Charité - Universitätsmedizin Berlin

Background: The Patient Health Questionnaire-2 (PHQ-2) depression screening tool consists of two items that assess the frequency of depressed mood and anhedonia in the past two weeks. It can be used alone or as a first step to identify patients for evaluation with the full Patient Health Questionnaire-9 (PHQ-9). Meta-analyses of PHQ-2 accuracy have been limited by including only published data and by not examining accuracy for different reference standards, in participant subgroups, or in combination with the PHQ-9, as it is commonly used in practice. Individual participant data meta-analysis (IPDMA), which synthesizes participant-level data from primary studies rather than summary results from study reports, has the potential to overcome these limitations.

Objectives: To use IPDMA to evaluate the accuracy of the PHQ-2 alone and in combination with the PHQ-9 for screening to detect major depression.

Methods: Medline, Medline In-Process & Other Non-Indexed Citations, PsycINFO, and Web of Science were searched from January 1, 2000, to May 9, 2018, for datasets that compared PHQ scores with major depression classification based on a validated diagnostic interview. Bivariate random-effects meta-analysis was used to estimate sensitivity and specificity separately for semi-structured, fully structured (Mini International Neuropsychiatric Interview [MINI] excluded), and MINI diagnostic interviews, and in participant subgroups based on age, sex, country human development index, and recruitment setting.
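As an illustration of the participant-level inputs to such an analysis, the sketch below computes sensitivity and specificity for one study at one cutoff. It is a minimal sketch under stated assumptions, not the bivariate random-effects model itself, which pools these quantities across studies. The function name and argument layout are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[int],
    is_case: Sequence[bool],
    cutoff: int,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a screen that is positive when score >= cutoff.

    scores  : questionnaire total score for each participant (hypothetical input layout)
    is_case : True if the diagnostic interview classified the participant as a case
    """
    if len(scores) != len(is_case):
        raise ValueError("scores and is_case must have the same length")

    tp = fn = tn = fp = 0
    for score, case in zip(scores, is_case):
        positive = score >= cutoff
        if case and positive:
            tp += 1
        elif case:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")

    return tp / (tp + fn), tn / (tn + fp)
```

In an IPDMA, per-study counts of this kind would be computed for each cutoff and then passed to the pooling model.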

Results: Individual participant data were obtained from 100 of 136 eligible studies (44,318 participants; 4,572 major depression cases). Among studies that used semi-structured interviews, PHQ-2 sensitivity and specificity were 0.91 and 0.67 for cutoff ≥2 and 0.72 and 0.85 for cutoff ≥3. Sensitivity was significantly greater for semi-structured versus fully structured interviews. Specificity was not significantly different across interview types. There were no significant differences in accuracy across subgroups. For semi-structured interviews, sensitivity for PHQ-2 ≥2 followed by PHQ-9 ≥10 was not significantly different from that for PHQ-9 ≥10 alone (0.82 versus 0.86); specificity was significantly but minimally higher (0.87 versus 0.85). The combination reduced the number of participants needing to complete the full PHQ-9 by 57%.
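The two-step rule evaluated here (PHQ-2 ≥2 followed by PHQ-9 ≥10) can be sketched as follows. This is an illustrative sketch, not the study's analysis code; it assumes all nine item responses are available, each scored 0 to 3, with the two PHQ-2 items listed first. The function name is hypothetical.

```python
from typing import Sequence, Tuple


def two_step_screen(
    items: Sequence[int],
    phq2_cutoff: int = 2,
    phq9_cutoff: int = 10,
) -> Tuple[bool, bool]:
    """Apply PHQ-2 first, then the full PHQ-9 only if the PHQ-2 is positive.

    items : nine item scores (0-3); assumed ordering puts the two PHQ-2 items first
    Returns (screen_positive, needed_full_phq9).
    """
    if len(items) != 9 or any(not 0 <= x <= 3 for x in items):
        raise ValueError("expected nine item scores, each between 0 and 3")

    phq2_score = items[0] + items[1]
    if phq2_score < phq2_cutoff:
        # Negative on the first step: the remaining seven items are not needed.
        return False, False

    phq9_score = sum(items)
    return phq9_score >= phq9_cutoff, True
```

Averaging the second returned value over a sample gives the proportion of participants who would need to complete the full PHQ-9 under this rule.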

Conclusions: PHQ-2 ≥2 followed by PHQ-9 ≥10 had accuracy similar to that of PHQ-9 ≥10 alone and reduced the proportion of participants needing to complete the full PHQ-9 by 57%.

Patient or healthcare consumer involvement: There was no direct patient or consumer involvement in this project. However, clinicians considering screening for depression with the PHQ-2 alone or in combination with the PHQ-9 can refer to our web-based knowledge translation tool (depressionscreening100.com/phq-2), which estimates expected numbers of positive screens and of true and false screening outcomes based on results from the present IPDMA.
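The kind of calculation such a tool performs can be approximated with basic arithmetic. The sketch below is not the tool's implementation; it is a minimal illustration that converts a sensitivity, a specificity, and an assumed prevalence into expected screening outcomes. The function name and the 10% prevalence in the usage note are hypothetical.

```python
from typing import Dict


def expected_outcomes(
    n: int,
    prevalence: float,
    sensitivity: float,
    specificity: float,
) -> Dict[str, float]:
    """Expected screening outcomes among n people screened.

    prevalence is an assumption supplied by the user, not a result of the IPDMA.
    """
    for name, value in (("prevalence", prevalence),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    cases = n * prevalence
    non_cases = n - cases

    true_positive = cases * sensitivity
    false_negative = cases - true_positive
    true_negative = non_cases * specificity
    false_positive = non_cases - true_negative

    return {
        "true_positive": true_positive,
        "false_negative": false_negative,
        "true_negative": true_negative,
        "false_positive": false_positive,
        "positive_screens": true_positive + false_positive,
    }
```

For example, with the combination's estimates for semi-structured interviews (sensitivity 0.82, specificity 0.87) and an assumed prevalence of 10%, screening 1,000 people would be expected to yield about 199 positive screens: roughly 82 true positives and 117 false positives.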