The Endocrine Society Guidelines: when the confidence cart goes before the evidence horse?

2013 Québec City

Brito JP¹, Domecq JP¹, Murad MH², Guyatt GH³, Montori VM¹

¹Division of Endocrinology, Diabetes, Metabolism, Nutrition, Mayo Clinic, Rochester, MN

²Division of Preventive, Occupational and Aerospace Medicine, Mayo Clinic, MN

³Department of Medicine, McMaster University, Hamilton, Ontario L8N 3Z5, Canada

Background: In 2005, the Endocrine Society (TES) adopted the GRADE system for developing clinical practice guidelines. This system rates the panel’s confidence in the estimates of effect of the available options (from high to very low confidence) and in the value of following a recommendation (strong or conditional recommendation). GRADE working group guidance suggests that strong recommendations based on low or very low (l/vl) confidence may often be inappropriate.

Objectives: We sought to characterize TES strong recommendations based on l/vl confidence evidence.

Methods: We identified all strong recommendations based on l/vl confidence evidence published in TES guidelines between 2005 (when TES started using GRADE) and 2011. We applied a taxonomy of paradigmatic situations in which strong recommendations based on l/vl confidence estimates may be appropriate. Independently and in duplicate, reviewers judged, for each recommendation, whether a strong recommendation was appropriate and, if so, which paradigmatic situation applied.

Results: Of the 357 TES recommendations issued, 206 (58%) were strong; of these, 121 (59%) were based on l/vl confidence evidence. Of these 121, we classified 43 (36%) as ‘best practice’ recommendations, for which sensible alternatives do not exist and which do not require grading. In 5 (4%), we concluded that moderate or high confidence in estimates was warranted, and in another 5 (4%) that the recommendations were for ‘additional research’. Of the remaining 68, 33 (27% of the original 121) were judged inappropriate. Of the 35 appropriate ones, the most common situation (13, 37%) was low confidence evidence for benefit and high confidence evidence for harm, thus warranting a strong recommendation against the intervention.

Conclusions: Guideline panels should beware of formulating inappropriately strong recommendations when confidence in estimates is low. Our taxonomy of paradigmatic situations in which such recommendations are appropriate may be helpful.