Systems for grading the quality of evidence and the strength of recommendations I: Critical appraisal of existing approaches The GRADE Working Group

David C. Atkins (Agency for Healthcare Research and Quality), Martin Eccles (Newcastle University), Signe Flottorp (Nasjonalt Kunnskapssenter for Helsetjenesten), Gordon Guyatt (McMaster University), David Henry (Calvary Mater Newcastle Hospital), Suzanne Hill (Calvary Mater Newcastle Hospital), Alessandro Liberati (Azienda Ospedaliero-Universitaria di Modena), Dianne L. O’Connell (Cancer Council NSW), Andrew D. Oxman (Nasjonalt Kunnskapssenter for Helsetjenesten), Bob Phillips (Warneford Hospital), Holger J. Schünemann (University at Buffalo, State University of New York), Tessa Tan-Torres Edejer (World Health Organization), Gunn Elisabeth Vist (Nasjonalt Kunnskapssenter for Helsetjenesten), John W. Williams (Duke Medical Center), The GRADE Working Group
BMC Health Services Research
December 22, 2004
Cited by 1,226
Open Access

Abstract

BACKGROUND: A number of approaches have been used to grade levels of evidence and the strength of recommendations. The use of many different approaches detracts from one of the main reasons for having explicit approaches: to concisely characterise and communicate this information so that it can easily be understood and thereby help people make well-informed decisions. Our objective was to critically appraise six prominent systems for grading levels of evidence and the strength of recommendations as a basis for agreeing on characteristics of a common, sensible approach to grading levels of evidence and the strength of recommendations.

METHODS: Six prominent systems for grading levels of evidence and strength of recommendations were selected and someone familiar with each system prepared a description of each of these. Twelve assessors independently evaluated each system based on twelve criteria to assess the sensibility of the different approaches. Systems used by 51 organisations were compared with these six approaches.

RESULTS: There was poor agreement about the sensibility of the six systems. Only one of the systems was suitable for all four types of questions we considered (effectiveness, harm, diagnosis and prognosis). None of the systems was considered usable for all of the target groups we considered (professionals, patients and policy makers). The raters found low reproducibility of judgements made using all six systems. Systems used by 51 organisations that sponsor clinical practice guidelines included a number of minor variations of the six systems that we critically appraised.

CONCLUSIONS: All of the currently used approaches to grading levels of evidence and the strength of recommendations have important shortcomings.

