Treatment effect sizes vary in randomized trials depending on type of outcome measure

2019 Santiago

Berthelsen DB¹, Ginnerup-Nielsen E¹, Juhl C², Lund H³, Henriksen M⁴, Hróbjartsson A⁵, Nielsen SM¹, Voshaar M⁶, Christensen R⁷

¹The Parker Institute, Bispebjerg and Frederiksberg Hospital, Copenhagen

²Research Unit of Musculoskeletal Function and Physiotherapy, Institute of Sports Science and Clinical Biomechanics, University of Southern Denmark, Odense and Dep. of Physiotherapy and Occupational Therapy, University Hospital of Copenhagen, Gentofte

³Centre for Evidence-Based Practice, Western Norway University of Applied Sciences, Bergen

⁴The Parker Institute, Bispebjerg and Frederiksberg Hospital, Copenhagen and Department of Physical and Occupational Therapy, Bispebjerg and Frederiksberg Hospital, Copenhagen

⁵Center for Evidence-Based Medicine, University of Southern Denmark/Odense University Hospital, Odense

⁶Department of Psychology, Health and Technology, University of Twente, Enschede

⁷The Parker Institute, Bispebjerg and Frederiksberg Hospital, Copenhagen and Research Unit of Rheumatology, Department of Clinical Research, University of Southern Denmark, Odense University Hospital

Background: Patient-Reported Outcome Measures (PROMs) yield insightful information when assessing treatment effect in clinical practice; however, they may overestimate effects, particularly in non-pharmacological studies.

Objectives: to compare estimated treatment effects of physical therapy (PT) between PROMs and outcomes measured in other ways.

Methods: we selected randomized trials of PT that reported both a PROM and a non-PROM and were included in Cochrane Systematic Reviews (CSRs). Two review authors independently extracted data and 'Risk of bias' assessments. Our primary outcome was the ratio of odds ratios (ROR), used to quantify how effects vary between PROMs and non-PROMs; an ROR greater than 1 indicates a larger effect when assessed by PROMs. We used restricted maximum likelihood (REML) methods to estimate associations of trial characteristics with effects and between-trial heterogeneity.
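As an illustration of the ROR described above, the following minimal Python sketch computes a single trial's ROR and its confidence interval on the log scale. Everything here is an assumption for illustration, not the authors' analysis code: the function name `ror_with_ci`, the coding of odds ratios so that values above 1 favour the intervention, and in particular the treatment of the two outcomes as independent (outcomes measured in the same trial are usually correlated, so the true standard error may differ).

```python
import math

def ror_with_ci(or_prom, se_log_or_prom, or_non_prom, se_log_or_non_prom, z_crit=1.96):
    """Ratio of odds ratios (PROM / non-PROM) with an approximate 95% CI.

    Hypothetical helper. Assumes both odds ratios are coded so that values
    above 1 favour the intervention, and that the two log odds ratios are
    independent (a simplification for outcomes from the same trial).
    """
    # Work on the log scale, where the ratio becomes a difference.
    log_ror = math.log(or_prom) - math.log(or_non_prom)
    # Under independence, the variances of the two log odds ratios add.
    se_log_ror = math.sqrt(se_log_or_prom ** 2 + se_log_or_non_prom ** 2)
    lower = math.exp(log_ror - z_crit * se_log_ror)
    upper = math.exp(log_ror + z_crit * se_log_ror)
    return math.exp(log_ror), lower, upper, se_log_ror

# Made-up inputs: the PROM shows OR 2.0, the non-PROM shows OR 1.0.
ror, lower, upper, se = ror_with_ci(2.0, 0.2, 1.0, 0.2)
print(f"ROR {ror:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these made-up inputs the ROR is above 1, which under the stated coding would mean the effect looks larger when assessed by the PROM. The per-trial log RORs and their standard errors would then feed into a random-effects model; that pooling step is not shown here.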

Results: from 90 relevant CSRs, we included 205 PT trials. The summary ROR across all comparisons was not statistically significant (ROR 0.88, 95% confidence interval (CI) 0.70 to 1.12; P = 0.30); however, heterogeneity was substantial (I² = 88.1%). When non-PROMs were stratified further into clearly objective non-PROMs (e.g. biomarkers) and other, less objective non-PROMs (e.g. aerobic capacity), PROMs appeared more favourable than clearly objective non-PROMs (ROR 1.92, 95% CI 0.99 to 3.72; P = 0.05). Conversely, patients’ own reports of treatment effects appeared less favourable than less objective non-PROMs (ROR 0.80, 95% CI 0.62 to 1.02; P = 0.07). When outcomes reflected the same construct, PROMs appeared less favourable than comparable non-PROMs (ROR 0.29, 95% CI 0.15 to 0.55; P < 0.001).
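The reported intervals and P values can be related to one another with a standard back-calculation: the standard error of the log ROR is recovered from the width of the CI on the log scale, and a Wald z statistic follows from it. The sketch below is a rough consistency check under a normal approximation, not the authors' method; the function name `z_and_p_from_ci` is hypothetical, and small differences from the reported P values are expected because of rounding and because the original model may not have used a normal reference distribution.

```python
import math
from statistics import NormalDist

def z_and_p_from_ci(estimate, lower, upper, level=0.95):
    """Recover SE(log ratio), Wald z, and two-sided P from a ratio and its CI.

    Hypothetical helper. Assumes the CI is symmetric on the log scale and
    was built from a normal approximation.
    """
    std_normal = NormalDist()
    # Critical value for the given confidence level (about 1.96 for 95%).
    z_crit = std_normal.inv_cdf(0.5 + level / 2)
    # CI width on the log scale equals 2 * z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    p = 2 * (1 - std_normal.cdf(abs(z)))
    return se, z, p

# Ratios and intervals as reported in the Results paragraph.
reported = {
    "all comparisons": (0.88, 0.70, 1.12),
    "clearly objective non-PROMs": (1.92, 0.99, 3.72),
    "less objective non-PROMs": (0.80, 0.62, 1.02),
    "same construct": (0.29, 0.15, 0.55),
}
results = {name: z_and_p_from_ci(*values) for name, values in reported.items()}
for name, (se, z, p) in results.items():
    print(f"{name}: SE(log ROR) = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

A negative z corresponds to an ROR below 1, that is, a smaller effect when assessed by PROMs than by the comparator outcome.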

Conclusions: estimated treatment effects based on PROMs are generally comparable to treatment effects measured in other ways. However, in our study, PROMs indicated a more favourable treatment effect than clearly objective outcomes, and a less favourable treatment effect than less objective non-PROMs. Likewise, PROMs indicated a less favourable treatment effect when both outcomes reflected the same construct. Patients' and clinicians' different perspectives on a disease may influence estimates of treatment effects in randomized trials. Including other instruments or measures alongside PROMs should therefore be considered in clinical practice and when developing core outcome measurement sets for various conditions.

Patient or healthcare consumer involvement: three patient research partners (PRPs) were involved in designing the study and discussing the results. During protocol development, the PRPs participated in discussions of relevance, content, and ethics. Each PRP wrote a hypothesis of expected results, and all three hypotheses agreed with the original hypothesis. During subsequent discussions of the results, the PRPs gave their perspectives on the findings, based on their experiences of the pros and cons of assessing treatment effects using PROMs and other measurements.