Date: November 13, 2024

Reference: Lee WH, et al. Study of Pediatric Appendicitis Scores and Management Strategies: A Prospective Observational Feasibility Study. Academic Emergency Medicine. Dec 2024

Guest Skeptic: Dr. Dennis Ren is a pediatric emergency medicine physician at Children’s National Hospital in Washington, DC. He’s also the host of SGEMPeds.

Case: A 10-year-old boy presents to the community emergency department (ED) with abdominal pain. It started last night but the pain seemed to worsen this morning. He tells you that it hurts right around his belly button. On examination, he looks uncomfortable but lets you examine his stomach. He winces a little as you press around his belly button and right lower quadrant but is not guarding. He has not had any fevers. His mother asks you, “I had something like this happen to me when I was a child. By the time they figured it out, the doctors told me that my appendix had almost burst! Do you think this could be appendicitis?”

Background: Pediatric appendicitis is the most common surgical emergency in children, accounting for a significant proportion of ED visits. Appendicitis occurs when the appendix becomes inflamed, often because of a blockage, leading to infection and potentially life-threatening complications such as perforation. Although the condition is more common in children between the ages of 10 and 20, it can present at any age, making accurate diagnosis in younger populations especially challenging [1].

The clinical presentation of pediatric appendicitis can vary widely. Classical symptoms include right lower quadrant abdominal pain, fever, and vomiting, but these can be absent or altered in younger children, making clinical diagnosis difficult. Furthermore, the differential diagnosis is broad, including conditions such as gastroenteritis, urinary tract infections, and other causes of abdominal pain like constipation. In one study, almost half of these pediatric patients (45%) with abdominal pain were discharged home with “non-specific abdominal pain” [2].  

Traditionally, diagnosis relies on a good history, followed by a directed physical examination and appropriate use of diagnostic tests (lab and imaging). Ultrasound is commonly used due to its non-invasive nature, while computed tomography (CT) scans, although more definitive, are often avoided in children due to radiation concerns [3]. Some centers are using rapid MRI clinical diagnostic pathways in suspected pediatric appendicitis [4,5].

Outcomes of appendicitis largely depend on early recognition and treatment. If left untreated, the appendix may rupture, leading to peritonitis, abscess formation, or sepsis, which significantly increases morbidity. On the other hand, early surgical intervention, typically via laparoscopic appendectomy, results in low complication rates and rapid recovery for most pediatric patients.

Clinical prediction scores (CPS) exist to help diagnose appendicitis in children. They often consider aspects of the history, physical exam and laboratory values. However, these CPSs are not universally used or validated. Three of these CPSs are the Alvarado score [6], Pediatric Appendicitis Score (PAS) [7], and pediatric Appendicitis Risk Calculator for pediatric EDs (pARC-ED) [8].  We also don’t know how they compare to our clinical gestalt.

I remember a case I saw as a resident of a young girl who was sent to the ED for belly pain and to be evaluated for appendicitis. Her exam was unremarkable. She didn’t have a fever. She didn’t look sick. I pressed all over her stomach. I had her jump in the air. I looked for Rovsing’s sign and the psoas sign. Everything was negative. Her caretaker also told me that her belly pain was like when she had constipation in the past. It was my attending, whose Spidey senses were tingling, who ordered an ultrasound…ruptured appendix.


Clinical Question: Can pediatric appendicitis clinical prediction scores accurately diagnose appendicitis in children and outperform clinician gestalt?


Reference: Lee WH, et al. Study of Pediatric Appendicitis Scores and Management Strategies: A Prospective Observational Feasibility Study. Academic Emergency Medicine. Dec 2024

  • Population: Children aged 5 to 15 years presenting with right-sided or generalized abdominal pain and suspected appendicitis
    • Excluded: abdominal trauma within 7 days of presentation, history of prior abdominal surgery, chronic illness affecting the abdomen (inflammatory bowel disease, chronic pancreatitis, cystic fibrosis, sickle cell anemia), pregnancy, inability to obtain accurate history.
  • Intervention: Use of clinical prediction scores (Alvarado, PAS, pARC-ED)
  • Comparison: Clinician gestalt
  • Outcome: Diagnostic accuracy (AUC, sensitivity, and specificity)
  • Type of Study: Prospective observational study

This is an #SGEMHOP, and we are pleased to have the lead author, Dr. Wei Hao Lee, on the show. Dr. Lee is a pediatric advanced trainee currently working at Perth Children’s Hospital in Western Australia. He is interested in pediatric emergency medicine and clinical research. Dr. Lee is investigating clinical prediction scores in pediatric appendicitis, for which he is undertaking a PhD at the University of Western Australia. 

Authors’ Conclusions: “The study identified 30 clinical prediction scores that could be validated in a majority of patients to compare their ability to assess risk of pediatric appendicitis. The pARC-ED had the highest predictive accuracy and can potentially assist in risk stratification of children with suspected appendicitis in pediatric EDs. A multicenter study is now under way to evaluate the potential of these CPSs in a broader range of EDs to aid clinical decision making in more varied settings.”

Quality Checklist for Clinical Decision Tools:

  1. The study population included or focused on those in the ED. Yes
  2. The patients were representative of those with the problem. Yes 
  3. All important predictor variables and outcomes were explicitly specified. Yes
  4. This is a prospective, multicenter study including a broad spectrum of patients and clinicians (level II). No
  5. Clinicians interpret individual predictor variables and score the clinical decision rule reliably and accurately. Unsure
  6. This is an impact analysis of a previously validated CDR (level I). No
  7. For Level I studies, impact on clinician behavior and patient-centric outcomes is reported.  N/A
  8. The follow-up was sufficiently long and complete. Yes
  9. The effect was large enough and precise enough to be clinically significant. Yes
  10. Funding: Channel 7 Telethon Trust. CAHS Telethon Research Scholarship
  11. Conflicts of Interest: None declared

Results: They enrolled 481 patients. The median age was around 10 years old, and just over half were male. Most of the children (93%) had bloodwork obtained. Around 75% had ultrasound imaging. Only 3.5% had a CT scan. Around 30% (150/481) had appendicitis, with three children (2%) having a normal appendix on histopathology.

They identified 30 CPSs in the literature search. Their success in collecting the variables required for these CPSs ranged from 52% to 93%.


Key Results: The pediatric Appendicitis Risk Calculator for pediatric EDs (pARC-ED) was the best-performing CPS, and clinician gestalt after blood test results performed similarly.


  • Primary Outcome: Diagnostic accuracy
    • pARC-ED had an AUC of 0.9 (95% CI 0.86-0.94), an accuracy of 97.5% (95% CI 95.1%-98.7%), and a specificity of 99%
    • Clinician gestalt after blood test results had an AUC of 0.88 (95% CI 0.81-0.94) and an accuracy of 91.6% (95% CI 86.1%-95%)
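For readers who want to see how sensitivity, specificity, and accuracy relate to one another, here is a minimal Python sketch that computes them from a 2x2 table. The counts are hypothetical and chosen only for illustration; they are not taken from the study, and the function name `diagnostic_metrics` is our own.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, and accuracy from 2x2 table counts."""
    sensitivity = tp / (tp + fn)  # share of diseased patients the test flags
    specificity = tn / (tn + fp)  # share of non-diseased patients the test clears
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all calls that are correct
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only; NOT data from Lee et al.
metrics = diagnostic_metrics(tp=40, fp=5, fn=10, tn=145)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that accuracy blends both classes together, so it shifts with prevalence even when sensitivity and specificity stay the same.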

Listen to the SGEM podcast to hear Dr. Lee’s response to our five nerdy questions.

1. Excluded Patients: Your study team decided to exclude patients with “chronic illnesses affecting the abdomen.” That included inflammatory bowel disease, chronic pancreatitis, cystic fibrosis, and sickle cell anemia. You also excluded pregnant patients. Why did you decide to exclude these patients from your study?

We have not reviewed all the CPSs your study team did. Were these populations excluded from those studies as well? While these patients may have other reasons for abdominal pain, they can get appendicitis too!

2. Hawthorne Effect: The treating clinicians filled out a standardized clinical report form looking for data points related to the diagnosis of appendicitis. They were also asked how likely it was that the patient had appendicitis twice during the ED course. How do you think this could have impacted the accuracy of clinical gestalt?

3. Missing Variables: Based on supplemental Table 2, it looks like you were trying to collect 29 variables represented across the various CPSs. A highly challenging and ambitious task! Your success in collecting these variables ranged from 52% to 93%. How do you think these missing variables could have impacted your results?

4. Diagnosis of Appendicitis: There were many ways that the study team determined the diagnosis of appendicitis. 

  • Histopathological report of acute appendicitis within 30 days of index presentation
  • Operative findings of appendicitis or appendiceal abscess report requiring percutaneous drainage
  • Interval appendectomy performed within 60 days of presentation

This can lead to a partial verification bias (referral bias, work-up bias). This happens when only a certain set of patients who underwent the index test is verified by the reference standard (surgery with histopathological report). This can increase sensitivity but decrease specificity.
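The direction of this bias can be shown with simple arithmetic. The sketch below uses a hypothetical cohort (not data from the study) and assumes every test-positive child gets the reference standard while only 20% of test-negative children do. Both the counts and the 20% figure are assumptions made for illustration.

```python
def sens_spec(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from 2x2 table counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical full cohort in which every child gets the reference standard.
full = {"tp": 80, "fn": 20, "fp": 30, "tn": 270}

# Assumption: all test-positive children are verified, but only 20% of
# test-negative children are (for example, few of them go to surgery).
verified_fraction_of_negatives = 0.2
partial = {
    "tp": full["tp"],
    "fp": full["fp"],
    "fn": full["fn"] * verified_fraction_of_negatives,
    "tn": full["tn"] * verified_fraction_of_negatives,
}

full_sens, full_spec = sens_spec(**full)
part_sens, part_spec = sens_spec(**partial)

print(f"Full verification:    sensitivity {full_sens:.2f}, specificity {full_spec:.2f}")
print(f"Partial verification: sensitivity {part_sens:.2f}, specificity {part_spec:.2f}")
```

Under these assumptions, false negatives and true negatives are undercounted, so sensitivity looks better and specificity looks worse than they really are.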

We are curious about the definitions you used, especially the last one with interval appendectomy performed within 60 days. Why did you choose this length of time? It seems long.

You also report that the family of one patient diagnosed with appendicitis declined surgery, and the patient was discharged on antibiotics without re-presentation. Around 20% of patients were lost to follow-up at 30 and 60 days. Could it be possible that some of them were also treated non-operatively?

5. Management of Appendicitis: We found it interesting that while most patients (93%) had blood work obtained, only 80% had imaging (ultrasound or CT) performed. This is a bit different from my practice. Many times, I am making the diagnosis of appendicitis with ultrasound first without any additional lab studies.

Can you talk about the typical workflow at your institution and compare it to what typically would happen at a community emergency department?

I can’t remember the last time a surgeon took a patient to the operating room for appendectomy without imaging-confirmed appendicitis. Mr. Ross Fisher, pediatric surgeon, can you comment on what happens in the UK?

Comment on Authors’ Conclusion Compared to SGEM Conclusion: We agree with the authors’ conclusion and look forward to the multicenter study.


SGEM Bottom Line: The pARC-ED CPS in conjunction with clinician gestalt can be helpful in risk-stratifying patients with suspected appendicitis.


Case Resolution: You order some basic blood work on the boy and give him a dose of ibuprofen. The labs come back demonstrating a normal white blood cell count. Based on the pARC-ED CPS, this lands him in the low-risk group. On re-evaluation, he states that his abdominal pain has improved, and he is eager to go home. The repeat examination of his abdomen is unremarkable. You provide anticipatory guidance and return precautions to the family before they leave the ED.

Clinical Application: Clinical prediction scores can be used in conjunction with clinician gestalt to risk stratify patients presenting with suspected appendicitis. Ultrasound is the preferred first-line imaging for diagnosing appendicitis in children. If ultrasound imaging is equivocal or there remains high suspicion for appendicitis, CT imaging is reasonable, and an MRI may be available in some places.

As always, consider the three pillars of EBM (clinical judgement, scientific literature, and patient/family values and preferences) and engage in shared decision-making with families and children, as sometimes the final decision will be based on the risk tolerances of individual clinicians and families. In the case that a patient is discharged from the ED with some uncertainty, providing appropriate anticipatory guidance and return precautions is important.   

What Do I Tell the Patient/Parent? I’m sorry your child is having belly pain. I am glad you brought up your concern for appendicitis. In the ED, we are trained to consider a wide range of diagnoses. Appendicitis is something I am thinking about as well, based on his story and physical exam. His labs are reassuring, his pain has improved, and his belly exam is reassuring. I have low suspicion right now that he has appendicitis. Let’s talk about some options for the next steps.

Keener Kontest: There was no winner last week. The question was: what chemical element is feared the most by data scientists? The answer is sodium (Na), because it symbolizes “Not Available” (NA) values in datasets. These missing values are a common headache in data analysis, as they can disrupt calculations, skew results, or require imputation strategies to handle them.

Listen to the SGEM podcast for this week’s question. If you know, then send an email to thesgem@gmail.com with keener in the subject line. The first correct answer will receive a shoutout on the next episode.

SGEMHOP: Now it is your turn, SGEMers. What do you think of this episode on CPS for diagnosing pediatric appendicitis? Tweet your comments using #SGEMHOP. What questions do you have for Dr. Lee and his team? Ask them on the SGEM blog. The best social media feedback will be published in AEM.

Other SGEM Episodes:

  • SGEM #384: Take Me Out Tonight, I Don’t Want to Perforate My Appendix Alright
  • SGEM #180: The First Cut is the Deepest- N.O.T. for Paediatric Appendicitis
  • SGEM #155: Girls Just Want To Have Fun – Not Appendicitis

REMEMBER TO BE SKEPTICAL OF ANYTHING YOU LEARN, EVEN IF YOU HEARD IT ON THE SKEPTICS’ GUIDE TO EMERGENCY MEDICINE.


References:

  1. Krzyzak M, Mulrooney SM. Acute appendicitis review: background, epidemiology, diagnosis, and treatment. Cureus. 2020;12(6):e8562. doi:10.7759/cureus.8562
  2. Lee WH, O’Brien S, Skarin D, et al. Pediatric abdominal pain in children presenting to the emergency department. Pediatr Emerg Care. 2019;37(12):593-598. doi:10.1097/PEC.0000000000001789
  3. Bhangu A, Søreide K, Di Saverio S, Assarsson JH, Drake FT. Acute appendicitis: modern understanding of pathogenesis, diagnosis, and management. Lancet. 2015;386(10000):1278-1287.
  4. Schuh S, Man C, Marie E, Alhashmi GHA, Halevy D, Wales PW, Singer-Harel D, Finkelstein A, Sweeney J, Doria AS. Properties of ultrasound-rapid MRI clinical diagnostic pathway in suspected pediatric appendicitis. Am J Emerg Med. 2023 Sep;71:217-224. doi: 10.1016/j.ajem.2023.06.026. Epub 2023 Jun 16. PMID: 37453161.
  5. Ata NA, Trout AT, Dillman JR, Tkach JA, Ayyala RS. Technical and Diagnostic Performance of Rapid MRI for Evaluation of Appendicitis in a Pediatric Emergency Department. Acad Radiol. 2024 Mar;31(3):1102-1110. doi: 10.1016/j.acra.2023.09.040. Epub 2023 Oct 19. PMID: 37863782.
  6. Alvarado A. A practical score for the early diagnosis of acute appendicitis. Ann Emerg Med. 1986;15(5):557-564.
  7. Samuel M. Pediatric appendicitis score. J Pediatr Surg. 2002;37(6):877-881.
  8. Kharbanda AB, Vazquez-Benitez G, Ballard DW, et al. Development and validation of a novel pediatric appendicitis risk calculator (pARC). Pediatrics. 2018;141(4):e20172699. doi:10.1542/peds.2017-2699

Area Under the Curve:

Sensitivity and specificity are often more appropriate for balanced datasets, whereas the AUC ROC is typically better suited for imbalanced datasets. Sensitivity measures the proportion of true positives correctly identified, and specificity measures the proportion of true negatives correctly identified. These metrics work well when the dataset has a balanced distribution of positive and negative cases. This is because sensitivity and specificity consider only the correct classification rates within each class, irrespective of the overall class distribution.

In an imbalanced dataset (where one class, such as the negative class, vastly outnumbers the positive class), a single sensitivity or specificity value can be misleading. For example, in a rare-disease scenario a test could achieve very high specificity simply by calling almost everyone negative, yet it would miss most true cases (low sensitivity) and have poor clinical utility.
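A quick sketch makes the rare-disease problem concrete. The cohort below is invented for illustration (1,000 children, 1% prevalence), and the "test" simply calls every child negative.

```python
# Hypothetical cohort: 1,000 children, 10 with the disease (1% prevalence).
labels = [1] * 10 + [0] * 990

# A useless "test" that calls every child negative.
predictions = [0] * len(labels)

pairs = list(zip(labels, predictions))
tp = sum(1 for y, p in pairs if y == 1 and p == 1)
fn = sum(1 for y, p in pairs if y == 1 and p == 0)
tn = sum(1 for y, p in pairs if y == 0 and p == 0)
fp = sum(1 for y, p in pairs if y == 0 and p == 1)

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / len(labels)

print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} accuracy={accuracy:.2f}")
```

Specificity and accuracy look excellent here, yet the test finds none of the children who are actually sick.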

The AUC ROC represents the ability of the test to discriminate between positive and negative cases over all possible classification thresholds. The ROC curve plots sensitivity (the true positive rate) against 1 − specificity (the false positive rate) at each threshold, and the AUC summarizes that trade-off in a single number.
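One way to build intuition is that the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way by comparing every positive/negative pair. The risk scores are hypothetical, and `auc_pairwise` is our own helper, not a function from the study or from any library.

```python
def auc_pairwise(pos_scores, neg_scores):
    """AUC as the share of positive/negative pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical risk scores (0 to 1) for illustration only.
appendicitis = [0.9, 0.8, 0.6, 0.4]
no_appendicitis = [0.7, 0.5, 0.3, 0.2, 0.1]

auc = auc_pairwise(appendicitis, no_appendicitis)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 means the score ranks pairs no better than a coin flip, and 1.0 means every child with appendicitis scored higher than every child without it.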

The AUC ROC is better suited to imbalanced datasets because it evaluates the classifier’s performance across a range of thresholds, providing a more nuanced assessment of how well the test separates positive and negative cases, even when one class dominates the other. Unlike a single sensitivity or specificity value at one cutoff, the AUC reflects both false positives and false negatives across all cutoffs, allowing it to remain informative even when class distribution is skewed.
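To illustrate why the AUC holds up when one class dominates, the sketch below computes it on a small hypothetical dataset and then again after copying the negative cases 20 times. The scores and the helper `auc_pairwise` are assumptions for illustration only.

```python
def auc_pairwise(pos_scores, neg_scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties = 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical risk scores for illustration only.
positives = [0.9, 0.8, 0.6, 0.4]
negatives = [0.7, 0.5, 0.3, 0.2, 0.1]

# Roughly balanced: 4 positives vs 5 negatives.
balanced_auc = auc_pairwise(positives, negatives)

# Heavily imbalanced: the same negatives repeated 20 times (4 vs 100).
imbalanced_auc = auc_pairwise(positives, negatives * 20)

print(f"balanced AUC:   {balanced_auc:.2f}")
print(f"imbalanced AUC: {imbalanced_auc:.2f}")
```

Because the AUC depends only on how positives rank against negatives, changing the ratio of the two classes in this way leaves it unchanged.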