Assessing methodological quality in a large research literature: the case of distance education comparison studies

Authors
Bernard R, Abrami P, Wade A
Abstract
It goes without saying that the quality of a systematic review turns on the quality of the studies included in it. In strong literatures, where randomized trials are both possible and endemic to the research culture, it makes perfect sense to select only studies of the highest methodological quality and to exclude the rest. Conclusions then derive from strong evidence. However, in a practitioner-oriented field such as Education, studies are often conducted in less-than-ideal experimental circumstances (e.g., intact classrooms) with less-than-ideal measures and controls for violations of external validity. In fact, when fewer than 10% of studies in a literature include random assignment as a baseline condition, and only an additional 15% include a pretest, as is the case with our recent meta-analysis of the comparative literature of distance education and classroom instruction, the question arises whether to exclude 75% to 90% of the studies or to include all studies that meet basic conditions of comparison and effect size calculation, and to code for methodological quality. The literature that we reviewed contained over 5,000 usable abstracts, retrieved through targeted searches of 14 electronic bibliographic databases, Internet searches, hand searches of selected journals and conference proceedings, and branching. Examination of 862 full-text manuscripts netted 232 acceptable studies of distance education compared with classroom instruction. Effect sizes were calculated or estimated, and 13 methodological study features were coded, along with an additional 40 study features related to demographics, pedagogy, media use and institutional conditions. One of our first assessments concerned the completeness of the literature: how much codable information the studies contained. Overall, we found that about 35% of the data relating to methodological quality were missing. 
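The abstract does not specify which effect-size index was calculated; a common choice for standardized mean differences of this kind is Cohen's d with Hedges' small-sample correction. A minimal sketch under that assumption (all group statistics below are invented for illustration, not drawn from the reviewed studies):

```python
import math

def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference between a treatment group (e.g.,
    distance education) and a control group (classroom instruction),
    scaled by the pooled standard deviation."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                          / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

def hedges_g(d, n_t, n_c):
    """Hedges' small-sample bias correction applied to Cohen's d."""
    df = n_t + n_c - 2
    return d * (1 - 3 / (4 * df - 1))

# One hypothetical study: invented means, SDs, and group sizes.
d = cohens_d(78.0, 74.0, 10.0, 12.0, 30, 30)
g = hedges_g(d, 30, 30)
```

When a study reports only a t-statistic or exact means are unavailable, effect sizes must be estimated by other conversions, which is one reason the abstract distinguishes "calculated or estimated."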
In particular, large amounts of information about control for selection bias and the equivalence of materials used, student time on task, student ability, gender, student attrition and class size were missing. Missing data were especially troublesome in conducting multiple regression analyses of study features. In order to retain a sufficient number of variables to reasonably estimate the effects of methodological quality on effect sizes, we recoded missing values to zero and recoded equivalence or non-equivalence as +2 or −2 to reflect these extremes. Even after reducing the data in this way, we found that methodology accounted for as much as 50% of the variance in effect sizes. This led us to surmise that methodological quality in this literature is weak and is directly implicated in assessing other blocks of substantive study features such as pedagogy and media use. We concluded that it is reasonable to choose study-feature coding of methodology over exclusion based on methodology when a substantial portion of the literature under consideration lacks reasonable control for various forms of validity. However, the success of this approach depends in large measure on accurate and comprehensive reporting of the methodological characteristics of studies. We discuss the costs and benefits of this approach and its possible impact on researchers, policy-makers and practitioners.