To share or not to share data: how valid are copious randomized controlled trials?

Authors
Bordewijk E1, Wang R2, van Wely M1, Costello M3, Norman R4, Teede H2, Gurrin L5, Mol B2, Li W2
1Universitair Medisch Centrum (UMC), Amsterdam
2Monash University
3The University of New South Wales
4The University of Adelaide
5The University of Melbourne
Abstract
Background: Individual participant data (IPD) from randomized controlled trials (RCTs) are increasingly being shared and integrated for systematic reviews and other legitimate purposes. Granting open access to data promotes the fair and transparent conduct of RCTs, which is crucial for reproducibility in research. It is, however, still common for authors to withhold IPD, which limits the impact of and confidence in the results of RCTs and of systematic reviews based on aggregate data. In our recent IPD meta-analysis evaluating the effectiveness of first-line ovulation induction for polycystic ovary syndrome (PCOS), IPD were available from only 20 RCTs and unavailable from 34 RCTs. We found that the summary effect sizes of meta-analyses of RCTs not providing IPD differed from those of RCTs that provided IPD. Several aggregate data meta-analyses have been performed on this topic.

Objectives: To assess whether RCTs that did not share IPD have lower quality and more methodological issues than those that shared IPD in an IPD meta-analysis evaluating first-line ovulation induction for PCOS.

Methods: We assessed and compared the RCTs that shared IPD and those that did not on the following criteria: risk of bias; GRADE approach; adequacy of trial registration; statistical issues (description of statistical methods and reproducibility of univariable statistical analysis); excessive similarity or difference in baseline characteristics that is not compatible with chance (Monte Carlo simulations and Kolmogorov-Smirnov test); and miscellaneous methodological issues.
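The baseline check can be illustrated with a minimal sketch. It assumes each trial reports a mean, standard deviation and sample size per arm, and that under proper randomization both arms are drawn from one population. The function name `simulated_baseline_p`, its parameters, and the rounding step are illustrative assumptions, not the exact procedure used in this study.

```python
import random


def simulated_baseline_p(mean_a, sd_a, n_a, mean_b, sd_b, n_b,
                         decimals=1, n_sim=10000, seed=0):
    """Monte Carlo p-value for the difference in a reported baseline mean.

    Hypothetical helper: simulates arm means under the null hypothesis that
    both arms come from the same population, rounds them to the reported
    precision, and returns the share of simulated differences at least as
    large as the observed one.
    """
    rng = random.Random(seed)
    # Pooled mean and SD under the null hypothesis of a common population.
    pooled_mean = (mean_a * n_a + mean_b * n_b) / (n_a + n_b)
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    pooled_sd = pooled_var ** 0.5
    observed = abs(round(mean_a, decimals) - round(mean_b, decimals))
    hits = 0
    for _ in range(n_sim):
        # Sample means vary around the pooled mean with standard error SD / sqrt(n).
        sim_a = round(rng.gauss(pooled_mean, pooled_sd / n_a ** 0.5), decimals)
        sim_b = round(rng.gauss(pooled_mean, pooled_sd / n_b ** 0.5), decimals)
        # Small tolerance guards against floating-point noise after rounding.
        if abs(sim_a - sim_b) >= observed - 1e-9:
            hits += 1
    return hits / n_sim


# Illustrative values only (not taken from any trial in the review).
p_bmi = simulated_baseline_p(27.1, 4.0, 50, 27.4, 4.2, 50)
```

Repeating this for every continuous baseline variable yields one p-value per variable; under proper randomization these p-values should be roughly uniform on [0, 1], while p-values clustered near 1 suggest arms that are more similar than chance allows.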

Results: Overall, the non-shared RCTs performed worse than the shared RCTs on the risk of bias assessment and the GRADE approach. Adequate trial registration was found in 33% of the shared RCTs versus 0% of the non-shared RCTs (p=0.012). In total, 7/17 (41%) shared RCTs and 19/28 (68%) non-shared RCTs had issues with the statistical methods described (p=0.079). The median (range) inconsistency rate of univariable statistical results for the outcome(s) was 0 (0-0.63) in the shared group (14 RCTs applicable) and 0.44 (0-1) in the non-shared group (24 RCTs applicable) (p=0.0033).
The distribution of simulation-generated p-values from all continuous baseline variables did not deviate significantly from the expected uniform distribution in the shared group (p=0.1626), suggesting that these baseline characteristics are likely to be the result of proper randomization. However, it deviated significantly in the non-shared group (p=4.535×10^-8) (Figure 1).
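The uniformity comparison can be sketched as a one-sample Kolmogorov-Smirnov test of the pooled p-values against the uniform distribution on [0, 1]. The abstract names the test but not its exact variant. The helper `ks_uniform` below is a hypothetical name, and its asymptotic p-value with Stephens' small-sample correction is an assumption made for illustration.

```python
import math


def ks_uniform(p_values):
    """One-sample KS test of p_values against Uniform(0, 1).

    Returns (D, p), where D is the largest gap between the empirical CDF and
    the uniform CDF, and p is an asymptotic approximation (assumed here, not
    necessarily the variant used in the study).
    """
    xs = sorted(p_values)
    n = len(xs)
    # Largest distance above and below the uniform CDF F(x) = x.
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    d = max(d_plus, d_minus)
    # Stephens' correction improves the asymptotic formula for moderate n.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        # The series converges slowly here; the true value is 1 to many decimals.
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)


# Illustrative inputs only: evenly spread p-values versus p-values piled at 0.5.
d_even, p_even = ks_uniform([(i + 0.5) / 100 for i in range(100)])
d_clustered, p_clustered = ks_uniform([0.5] * 100)
```

A small p-value from this test indicates that the baseline p-values are not uniformly distributed, which is the pattern reported for the non-shared group.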

Conclusions: The IPD meta-analysis evaluating first-line ovulation induction for PCOS has better validity than meta-analyses using aggregate data. The availability of IPD might be a good indicator of the quality and methodological soundness of RCTs when performing systematic reviews.

Patient or healthcare consumer involvement: None.