Levels of Evidence

From the Centre for Evidence-Based Medicine, Oxford

For the most up-to-date levels of evidence, see http://www.cebm.net/levels_of_evidence.asp

Therapy/Prevention/Etiology/Harm:

1a: Systematic reviews (with homogeneity) of randomized controlled trials
1a-: Systematic review of randomized trials displaying worrisome heterogeneity
1b: Individual randomized controlled trials (with a narrow confidence interval)
1b-: Individual randomized controlled trials (with a wide confidence interval)
1c: All or none randomized controlled trials
2a: Systematic reviews (with homogeneity) of cohort studies
2a-: Systematic reviews of cohort studies displaying worrisome heterogeneity
2b: Individual cohort study or low-quality randomized controlled trials (<80% follow-up)
2b-: Individual cohort study or low-quality randomized controlled trials (<80% follow-up / wide confidence interval)
2c: 'Outcomes' research; ecological studies
3a: Systematic review (with homogeneity) of case-control studies
3a-: Systematic review of case-control studies with worrisome heterogeneity
3b: Individual case-control study
4: Case-series (and poor-quality cohort and case-control studies)
5: Expert opinion without explicit critical appraisal, or based on physiology, bench research or 'first principles'

Diagnosis:

1a: Systematic review (with homogeneity) of Level 1 diagnostic studies; or a clinical rule validated on a test set
1a-: Systematic review of Level 1 diagnostic studies displaying worrisome heterogeneity
1b: Independent blind comparison of an appropriate spectrum of consecutive patients, all of whom have undergone both the diagnostic test and the reference standard; or a clinical decision rule not validated on a second set of patients
1c: Absolute SpPins and SnNouts (an Absolute SpPin is a diagnostic finding whose Specificity is so high that a Positive result rules in the diagnosis; an Absolute SnNout is a diagnostic finding whose Sensitivity is so high that a Negative result rules out the diagnosis)
2a: Systematic review (with homogeneity) of Level >2 diagnostic studies
2a-: Systematic review of Level >2 diagnostic studies displaying worrisome heterogeneity
2b: Any of: 1) independent blind or objective comparison; 2) study performed in a set of non-consecutive patients, or confined to a narrow spectrum of study individuals (or both), all of whom have undergone both the diagnostic test and the reference standard; 3) a diagnostic clinical rule not validated in a test set
3a: Systematic review (with homogeneity) of case-control studies
3a-: Systematic review of case-control studies displaying worrisome heterogeneity
4: Any of: 1) reference standard was unobjective, unblinded or not independent; 2) positive and negative tests were verified using separate reference standards; 3) study was performed in an inappropriate spectrum of patients
5: Expert opinion without explicit critical appraisal, or based on physiology, bench research or 'first principles'
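
The SpPin and SnNout definitions under level 1c above can be illustrated with a small worked example. The sketch below is not part of the CEBM scheme: the class name TwoByTwo and all patient counts are hypothetical, chosen only so that one table has perfect specificity and the other perfect sensitivity. It uses the standard 2x2 definitions of sensitivity, specificity and predictive values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing a diagnostic test against the reference standard."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    def sensitivity(self) -> float:
        # Share of diseased patients the test detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of disease-free patients the test correctly clears.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Probability of disease given a positive result.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Probability of no disease given a negative result.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts, for illustration only.
# SpPin-like finding: no false positives, so specificity is 1.0.
sppin = TwoByTwo(tp=30, fp=0, fn=20, tn=150)
# SnNout-like finding: no false negatives, so sensitivity is 1.0.
snnout = TwoByTwo(tp=50, fp=40, fn=0, tn=110)

print(f"SpPin:  specificity={sppin.specificity():.2f}, PPV={sppin.ppv():.2f}, "
      f"sensitivity={sppin.sensitivity():.2f}")
print(f"SnNout: sensitivity={snnout.sensitivity():.2f}, NPV={snnout.npv():.2f}, "
      f"specificity={snnout.specificity():.2f}")
```

In the first table every positive result belongs to a patient with the disease, so a positive result rules the diagnosis in even though the test misses 20 of 50 cases. In the second table every negative result belongs to a disease-free patient, so a negative result rules the diagnosis out even though 40 disease-free patients test positive.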

Prognosis:

1a: Systematic review (with homogeneity) of inception cohort studies; or a clinical rule validated on a test set
1a-: Systematic review of inception cohort studies displaying worrisome heterogeneity
1b: Individual inception cohort study with >80% follow-up; or a clinical rule not validated on a second set of patients
1c: All or none case-series
2a: Systematic review (with homogeneity) of either retrospective cohort studies or untreated control groups in RCTs
2a-: Systematic review of either retrospective cohort studies or untreated control groups in RCTs displaying worrisome heterogeneity
2b: Retrospective cohort study or follow-up of untreated control patients in an RCT; or clinical rule not validated in a test set
2c: 'Outcomes' research
4: Case-series (and poor-quality prognostic cohort studies)
5: Expert opinion without explicit critical appraisal, or based on physiology, bench research or 'first principles'

Key to interpretation of practice guidelines

Agency for Healthcare Research and Quality:

A: There is good research-based evidence to support the recommendation.
B: There is fair research-based evidence to support the recommendation.
C: The recommendation is based on expert opinion and panel consensus.
X: There is evidence of harm from this intervention.

USPSTF Guide to Clinical Preventive Services:

A: There is good evidence to support the recommendation that the condition be specifically considered in a periodic health examination.
B: There is fair evidence to support the recommendation that the condition be specifically considered in a periodic health examination.
C: There is insufficient evidence to recommend for or against the inclusion of the condition in a periodic health examination, but recommendations may be made on other grounds.
D: There is fair evidence to support the recommendation that the condition be excluded from consideration in a periodic health examination.
E: There is good evidence to support the recommendation that the condition be excluded from consideration in a periodic health examination.

University of Michigan Practice Guideline:

A: Randomized controlled trials.
B: Controlled trials, no randomization.
C: Observational trials.
D: Opinion of the expert panel.

Other guidelines:

A: There is good research-based evidence to support the recommendation.
B: There is fair research-based evidence to support the recommendation.
C: The recommendation is based on expert opinion and panel consensus.
X: There is evidence that the intervention is harmful.

GRADE (Grading of Recommendations Assessment, Development and Evaluation)

Code: Quality of Evidence. Definition

A: High. Further research is very unlikely to change our confidence in the estimate of effect.

  • Several high-quality studies with consistent results
  • In special cases: one large, high-quality multi-centre trial

B: Moderate. Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.

  • One high-quality study
  • Several studies with some limitations

C: Low. Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.

  • One or more studies with severe limitations

D: Very Low. Any estimate of effect is very uncertain.

  • Expert opinion
  • No direct research evidence
  • One or more studies with very severe limitations

Source: GRADE (Grading of Recommendations Assessment, Development and Evaluation) Working Group 2007 (modified by the EBM Guidelines Editorial Team)