Inter-reviewer agreement: an analysis of the degree to which agreement occurs when using tools for the appraisal, extraction and meta-synthesis of qualitative research findings

2005 Melbourne

Florence Z, Schulz T, Pearson A

Background: Methods for the systematic review of qualitative studies are currently emerging. Within the evidence review community, there is scepticism about the extent to which the results of such reviews will be reproducible (and some would question whether they should be reproducible). To date, few 'qualitative' reviews have been published, and it is difficult to judge the integrity of this emerging methodology. Inter-reviewer agreement (in quantitative work referred to as inter-rater reliability) is about the consistency (or otherwise) of interpretation and thus the authenticity (validity in quantitative work) of the synthesis. The extraction and synthesis of qualitative research findings are arrived at through an interpretive process. Interpretive authenticity is best demonstrated when multiple reviewers from different cultures and settings generate synthesised findings from the same studies with clear similarity in meaning.

Objectives: The Joanna Briggs Institute has conducted training programs in systematic review methods, including the review of qualitative studies using the Joanna Briggs Institute Qualitative Assessment and Review Instrument (JBI QARI) software, in the UK, Spain, the USA, Canada, Thailand, Hong Kong, China and Australia. As part of the training, participants worked in pairs on two qualitative studies: conducting blinded critical appraisal, followed by a process of reaching agreement through conferring; extracting qualitative findings; and conducting a process of meta-synthesis. These studies were reviewed by 18 pairs of reviewers from diverse cultures and contexts. The results of the meta-synthesis exercise were analysed to identify the degree to which inter-reviewer agreement was achieved between these 18 pairs.

Methods: The authors assembled the findings, categories and synthesised findings generated by all 18 paired participants using JBI QARI and compared them in terms of similarity in meaning.

Results: In spite of the differences in background, the similarity in meaning of the synthesised findings across the participant pairs was striking. There was remarkable consistency within and between groups.

Conclusions: Although the methods of conducting reviews of qualitative data are only emerging, it is encouraging that this preliminary analysis suggests that the use of a systematic process of extracting, categorising and synthesising the findings of qualitative studies led to results that appeared to be reproducible.
