Date: December 20, 2024

Reference: Kotani et al. Positive single-center randomized trials and subsequent multicenter randomized trials in critically ill patients: a systematic review. Crit Care. 2023 

Guest Skeptic: Dr. Scott Weingart is an ED Intensivist from New York. He did fellowships in Trauma, Surgical Critical Care, and ECMO. He is a physician coach concentrating on the promotion of eudaimonia and optimal performance. Scott is best known for talking to himself about Resuscitation and Critical Care on a podcast called EMCrit, which has been downloaded more than 50 million times.  

Case: A 40-year-old male presents to the emergency department (ED) with severe respiratory failure from bilateral pneumonia. After a trial of Non-Invasive Positive Pressure Ventilation (NIPPV), you’ve decided to intubate him. Should your first pass attempt be done with a bougie or a styletted tube?


Background: The role of single-center randomized controlled trials (sRCTs) in advancing medical knowledge is significant, especially in the field of emergency medicine (EM). These trials often serve as the initial foundation for exploring interventions, providing a focused and controlled environment to test hypotheses.

However, the applicability of their findings to broader clinical settings can be limited due to their localized context. Multi-center randomized controlled trials (mRCTs) are often seen as a necessary step to validate these findings across diverse patient populations and healthcare settings. This process of validation is critical, as it addresses external validity, a cornerstone of evidence-based practice.

Historically, the need to move from sRCTs to mRCTs arises from the recognition that different institutions have varied patient demographics, resources, and protocols that might influence outcomes. While sRCTs provide essential insights, their ability to reflect real-world complexities is inherently restricted. Emergency physicians, who operate in unpredictable environments, often rely on evidence that is robust across multiple settings to guide clinical decisions effectively.

Despite the apparent hierarchical superiority of mRCTs, there are debates about whether they consistently confirm the results of sRCTs. This discussion is pivotal in understanding how findings can be generalized and integrated into clinical guidelines. As emergency physicians, evaluating the interplay between sRCTs and mRCTs not only helps in assessing the reliability of evidence but also in shaping the way we approach the implementation of interventions in our practice.


Clinical Question: How often are single-centred RCTs of critically ill patients reporting a mortality benefit confirmed in a multi-centred RCT?


Reference: Kotani et al. Positive single-center randomized trials and subsequent multicenter randomized trials in critically ill patients: a systematic review. Crit Care. 2023 

  • Population: sRCTs published in high-impact journals (NEJM, JAMA, or Lancet) that reported statistically significant mortality reductions in critically ill patients.
    • Exclusions: Quasi-randomized or non-randomized methodologies, multicentric trials, pediatric populations, and studies lacking mortality data.
  • Intervention: sRCTs
  • Comparison: mRCTs
  • Outcome:
    • Primary Outcome: Mortality assessed at specified time points such as hospital discharge or predefined follow-up periods.
    • Secondary Outcomes: Use of sRCT results in guidelines, and subsequent guideline changes based on mRCTs
  • Type of Study: Systematic review that followed the PRISMA guidelines and was registered in the PROSPERO International Prospective Register of Systematic Reviews

Authors’ Conclusions: “Mortality reduction shown by sRCTs is typically not replicated by mRCTs. The findings of sRCTs should be considered hypothesis-generating and should not contribute to guidelines.”

Quality Checklist for Systematic Reviews:

  1. Was the main question being addressed clearly stated? Yes
  2. Was the search for studies detailed and exhaustive? Yes
  3. Were the criteria used to select articles for inclusion appropriate? Yes
  4. Were the included studies sufficiently valid for the type of question asked? Yes
  5. Were the results similar from study to study? Unsure 
  6. Were there any financial conflicts of interest? No
  7. Who funded the study? The review was funded by academic institutions. 

Results: The review included 19 sRCTs and 24 subsequent mRCTs. Sixteen of the sRCTs were followed up by at least one mRCT. The majority of mRCTs found no mortality difference, in contrast to the statistically significant findings of the sRCTs.


Key Result: Single-centred RCTs often do not replicate in multi-centred RCTs.


  • Primary Outcome: Only one of 16 sRCTs (6%) had its findings confirmed by a subsequent mRCT.
  • Secondary Outcomes: 14 sRCTs were referenced at least once in international guidelines. Of those, 43% (6/14) have since been either suggested against or removed in the most recent versions of the guidelines.
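
Both of these percentages rest on small denominators. As a rough illustration (this is not part of the review), here is a minimal Python sketch that reproduces the percentages from the reported counts and adds an approximate 95% Wilson score interval to each. The choice of the Wilson method, the function name `wilson_interval`, and the z value of 1.96 are our assumptions; Kotani et al. did not report these intervals.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Counts as reported in the review; the labels are ours.
results = {
    "sRCTs confirmed by an mRCT": (1, 16),
    "guideline-cited sRCTs later removed or suggested against": (6, 14),
}

for label, (k, n) in results.items():
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.0%} (approx. 95% CI {low:.0%} to {high:.0%})")
```

The intervals come out wide, which is a reminder that 16 and 14 are small numbers and that the exact percentages should be held loosely.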

1) PRISMA: Kotani et al. adhered to several essential PRISMA checklist items but did fall short on key areas such as providing the full search strategy, reporting bias, certainty assessment, and detailed risk of bias assessment. The study does not fully satisfy the PRISMA 2020 quality criteria.[1] (see attached table).

2) Publication Bias: This occurs when the likelihood of research results being published is influenced by the nature and direction of the findings. Studies with “positive”, statistically significant, or novel results are more likely to be published, while those with “negative” or inconclusive outcomes often remain unpublished or are delayed.

It is possible to quantify publication bias. A systematic review found that studies reporting significant outcomes were more likely to be published than those without, with a pooled odds ratio of 2.8 (95% CI: 2.2 to 3.5).[2] This indicates that studies with significant results had 2.8 times higher odds of being published compared to studies with non-significant results.

This imbalance can skew the body of available evidence, leading to overestimation of intervention effects, misrepresentation of true outcomes, and flawed decision-making in clinical practice, policy development, or future research. We should try to move away from thinking of studies as positive or negative. If you have asked a good question and used appropriate methods, then it does not matter whether the results are positive or negative. Science has moved forward, and these results should be part of the medical literature to minimize publication bias.
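
An odds ratio is easy to misread as “2.8 times more likely”. The sketch below is only an illustration: it converts the pooled odds ratio of 2.8 quoted above into publication probabilities for a few baseline probabilities. The baseline values are hypothetical and were chosen purely for illustration (they do not come from the cited review), and the function name is ours.

```python
def probability_after_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    if not 0 < baseline_probability < 1:
        raise ValueError("baseline_probability must be strictly between 0 and 1")
    baseline_odds = baseline_probability / (1 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


POOLED_ODDS_RATIO = 2.8  # pooled estimate quoted in the text

# HYPOTHETICAL baseline chances that a non-significant study gets published.
for baseline in (0.2, 0.5, 0.8):
    significant = probability_after_odds_ratio(baseline, POOLED_ODDS_RATIO)
    print(
        f"non-significant: {baseline:.0%} -> significant: {significant:.0%} "
        f"(probability ratio {significant / baseline:.2f})"
    )
```

In every scenario the ratio of probabilities is smaller than 2.8, and it shrinks as the baseline rises. The odds ratio describes odds, not how many times more likely publication is.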

3) Heterogeneity in Study Populations: Variability in patient demographics, settings, and interventions in mRCTs vs. sRCTs might contribute to conflicting results.

4) sRCTs: Single-center studies often have unique settings or expertise that may not be generalizable to multicenter trials. They also often have smaller sample sizes, increasing the risk of Type I errors compared to larger mRCTs. Here are some examples to discuss this topic area:

  • LEUVEN Trial – van den Berghe et al. Intensive insulin therapy in critically ill patients. N Engl J Med. 2001 Nov (Type I Error?)
  • Early Goal-Directed Therapy – Rivers et al. Early goal-directed therapy in the treatment of severe sepsis and septic shock. N Engl J Med. 2001 Nov (Hidden Confounders & Control Group Evolution?) SGEM#69 and SGEM#92
  • PREOXI (note this was mRCT) – Gibbs et al. Noninvasive Ventilation for Preoxygenation during Emergency Intubation. N Engl J Med. 2024 Jun (Hidden Confounder?) SGEM#447
  • BEAM – Driver et al. Effect of Use of a Bougie vs Endotracheal Tube and Stylet on First-Attempt Intubation Success Among Patients With Difficult Airways Undergoing Emergency Intubation: A Randomized Clinical Trial. JAMA. 2018 Jun (Different Skill Sets?) SGEM#271

5) Guidelines: They are meant to guide care and should not be treated as GODlines. Research indicates that the validity of guideline recommendations diminishes over time. A study published in the Canadian Medical Association Journal (CMAJ) analyzed the lifespan of clinical guideline recommendations and found that approximately 90% remained valid after one year.[3] However, this validity decreased to about 81% after three years and 78% after four years.

These data suggest that a significant proportion of recommendations may become outdated within a few years of publication. In the study we are reviewing today, of the 14 sRCTs referenced at least once in international guidelines, six (43%) have since been either removed or suggested against in the most recent versions of the relevant guidelines.

This informs my position that we should be skeptical of the push to blindly follow guidelines when we are pressured by organizations like the American Heart Association to “get with the guidelines”.[4] The recommendations are often not based on high-quality evidence.[5] How closely should we adhere to a specific recommendation (25%, 50%, 75% or 100%)? We know that in the EBM framework, the literature is only one of three pillars. We still need to use our clinical judgement and ask the patient about their preferences and values.

Comment on Authors’ Conclusion Compared to SGEM Conclusion: We generally agree with the authors’ conclusions that the results of sRCTs should be considered hypothesis-generating and should not contribute to clinical practice guidelines.


SGEM Bottom Line: Be skeptical of accepting the conclusions of an sRCT unless you can precisely duplicate the conditions that led to its positive results.


Case Resolution: You choose to use the bougie on your first pass because you have trained extensively with the device and believe you are more akin to the BEAM trial clinicians than the BOUGIE trial clinicians.

Clinical Application: sRCTs reporting a mortality benefit in critically ill patients often do not replicate in mRCTs and should be considered hypothesis-generating rather than definitive. They probably should only be used to change clinical practice if you know exactly what went into the intervention, especially behind the scenes, and can precisely replicate those methods.

Keener Kontest: Back-to-back wins for Brian Caldwell. He knew the opioid epidemic was declared a public health emergency in 2017 and that Taylor Swift’s favourite number is 13.

Listen to the SGEM podcast for this week’s question. If you know, then send an email to thesgem@gmail.com with “keener” in the subject line. The first correct answer will receive a shoutout on the next episode. 


Remember to be skeptical of anything you learn, even if you heard it on the Skeptics’ Guide to Emergency Medicine.


References:

  1. PRISMA 2020 checklist https://www.prisma-statement.org/prisma-2020-checklist  Accessed Dec 11, 2024
  2. DeVito NJ, Goldacre B. Catalogue of bias: publication bias. BMJ Evidence-Based Medicine. 2019;24:53-54.
  3. Martínez García L, et al. Updating Guidelines Working Group. The validity of recommendations from clinical guidelines: a survival analysis. CMAJ. 2014 Nov 4;186(16):1211-9. doi: 10.1503/cmaj.140547. Epub 2014 Sep 8. Erratum in: CMAJ. 2017 Jan 9;189(1):E33. doi: 10.1503/cmaj.161414. PMID: 25200758; PMCID: PMC4216254.
  4. American Heart Association Get With the Guidelines. https://www.heart.org/en/professional/quality-improvement/get-with-the-guidelines  Accessed December 11, 2024
  5. Fanaroff AC et al. Levels of Evidence Supporting American College of Cardiology/American Heart Association and European Society of Cardiology Guidelines, 2008-2018. JAMA. 2019 Mar 19;321(11):1069-1080. doi: 10.1001/jama.2019.1122. PMID: 30874755; PMCID: PMC6439920.