All Sepsis Is Not the Same

This is a fairly dense informatics evaluation of sepsis, but it boils down to a general hypothesis with some face validity: all sepsis is not the same! This is abundantly obvious from the various clinical manifestations of response to infection, with a spectrum ranging from Group A Streptococcal pharyngitis to gram-negative bacteremia and distributive shock.

This analysis uses gene expression profiling from whole blood to perform unsupervised machine learning and clustering, and the authors identify three subtypes they term “Inflammopathic, Adaptive, and Coagulopathic”. Whether these labels are terribly illustrative of the underlying pathology is unclear, but, if you want to be in one of these clusters, you want to be in “Adaptive” with its 8.1% mortality – compared with 29.8% in Inflammopathic and 25.4% in Coagulopathic.

Validity of this specific analysis aside, it’s an interesting example of what may ultimately be a useful approach to treating sepsis – targeting the specific gene expression patterns associated with dysregulated immune response or underlying end-organ dysfunction. The best thing about this paper, however, is the acronyms reported for some of the statistical methods: “COmbined Mapping of Multiple clUsteriNg ALgorithms”, or COMMUNAL, and “COmbat CO-Normalization Using conTrols”, or COCONUT.

“Unsupervised Analysis of Transcriptomics in Bacterial Sepsis Across Multiple Datasets Reveals Three Robust Clusters”
https://www.ncbi.nlm.nih.gov/pubmed/29537985

Yet More Love for the Alvarado Score?

The Alvarado Score for appendicitis has been around a long time – since 1986, to be precise. Initially proposed, in a different surgical and observation culture, to aid the diagnosis of appendicitis before advanced imaging, it has effectively been supplanted by CT on grounds of cost-effectiveness and timeliness. These authors, however, want to resurrect it in the face of increasing CT overuse.

These authors performed a simple retrospective review of CTs obtained at their single institution to evaluate patients for abdominal pain, then calculated Alvarado scores from medical record review. Methods are incompletely described, but, effectively, about 20% of their 492 cases had either an Alvarado score ≥9 – all of whom had appendicitis – or an Alvarado score ≤2 – nearly all of whom did not. They suggest patients with scores at those extremes should not receive imaging, thereby reducing ED length-of-stay and radiation exposure.
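For reference, the score itself is a simple sum of clinical findings – the MANTRELS components from the 1986 derivation, which this paper does not restate, so treat this sketch as a reminder rather than a restatement of their methods:

```python
# Reference sketch: MANTRELS point values from the 1986 Alvarado
# derivation (not restated in this paper). Maximum score is 10.
ALVARADO_POINTS = {
    "migration_to_rlq": 1,
    "anorexia": 1,
    "nausea_vomiting": 1,
    "rlq_tenderness": 2,       # tenderness in the right lower quadrant
    "rebound_pain": 1,
    "elevated_temperature": 1,
    "leukocytosis": 2,
    "left_shift": 1,           # neutrophil left shift
}

def alvarado(findings):
    """Sum the points for the clinical findings present."""
    return sum(ALVARADO_POINTS[f] for f in findings)

# Per the authors' proposal: >=9 -> surgical consult without CT;
# <=2 -> discharge with return precautions; anything between gets imaged.
print(alvarado(["rlq_tenderness", "leukocytosis", "migration_to_rlq",
                "anorexia", "rebound_pain"]))  # 7 -> still gets imaged
```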

It probably says something that a clinical tool has been around for over 30 years without truly gaining traction. I don’t think their proposal is unreasonable – whether using Alvarado or gestalt – to consult prior to imaging for cases with high clinical likelihood, or to discharge with return instructions for cases inconsistent with the diagnosis. Cultural factors need to change on both ends of the spectrum, however, to support imaging-reduction practice change. Finally, despite the commanding nature of their article title, this is hardly the level of evidence or statistical power needed to truly describe the safety or effectiveness of this strategy – there are more patients here than in many prospective studies, but this does not replace a well-designed trial, or even some sort of pre-/post-intervention report.

Also: “This study was presented at the 75th meeting of the American Association for the Surgery of Trauma, September 14-17th, 2016 in Waikoloa, Hawaii.” Ah, nice.

“The Alvarado Score Should Be Used To Reduce Emergency Department Length of Stay and Radiation Exposure in Select Patients with Abdominal Pain”

https://www.ncbi.nlm.nih.gov/pubmed/29521805

Useful Pregnancy-Related Guidance

So, I might be alone here, but in my canvassing the interesting literature this morning, I stumbled across this Clinical Expert Series in Obstetrics & Gynecology and thought: “This is great! I wish we’d (myself and my wife) had this four years ago!”

It’s a concise summary of the evidence (mostly lack thereof) and recommendations for things pregnant women “should and should not routinely do during pregnancy.” There’s so much nonsense floating around on the internet, and so many dark wombat burrows full of sinister imaginings, I almost feel this would be a good document to hand out to pregnant patients with adequate levels of health literacy.

A few highlights:

  • Prenatal vitamins are unlikely to be harmful, but potentially unnecessary for women already consuming balanced diets.
  • Alcohol consumption up to 7-9 drinks per week does not appear to be harmful, but no specific threshold for safety is known.
  • Up to 300 mg/d of caffeine is probably safe.
  • Examples of good fish are anchovies, Atlantic herring, Atlantic mackerel, mussels, oysters, wild salmon, sardines, snapper, and trout.
  • Appropriately prepared sushi is unlikely to give you a tapeworm.
  • “Pregnant women should avoid foods that are being recalled for possible Listeria contamination.”
  • Using toxic insect repellent is probably safer than the risk of insect-borne illness, where appropriate.

… and many more!  Useful!

“Dos and Don’ts in Pregnancy – Truths and Myths”
https://www.ncbi.nlm.nih.gov/pubmed/29528917

Wake Up And Smell the Isopropyl

Why? It’s just as good or better than the sweet, sweet taste of ondansetron dissolving under your tongue.

This is a rather small, but quite interesting, trial building upon prior work evaluating inhaled isopropyl alcohol for nausea. It’s better than saline placebo, yes, but what about those actual doctor-type medicines we use so we can bill as a Level 3 or Level 4 visit?

This three-arm trial randomized 40 patients each to inhaled isopropyl + ondansetron oral dissolving tablet, inhaled isopropyl + oral placebo, and inhaled placebo + ondansetron oral dissolving tablet. Even with the limitations of sample size with regard to statistical significance, the isopropyl arms are the clear winners on the primary outcome of nausea score reduction at 30 minutes. Objective outcome measures were mixed – receipt of rescue antiemetics mirrored the primary outcome, but measures of ED length-of-stay and admission disposition could not demonstrate a difference.

Some fun tidbits here – patients were allowed an unlimited supply of alcohol prep pads to use throughout their ED stay, not just on initial arrival, though the authors did not quantify how many pads were actually used. The authors also evaluated the effectiveness of blinding and, as expected, found it’s hard to miss the distinctive scent of isopropyl alcohol – which introduces a potential source of bias to these results.

Overall, at least, it certainly seems reasonable to use isopropyl alcohol pads as adjunctive therapy for nausea in the ED – and as an inexpensive, over-the-counter option for patients (well, and doctors) at home.

“Aromatherapy Versus Oral Ondansetron for Antiemetic Therapy Among Adult Emergency Department Patients: A Randomized Controlled Trial”
https://www.ncbi.nlm.nih.gov/pubmed/29463461

A Non-Non-Answer in Airway Management in OHCA

In a lovely demonstration of the statistical inanity of non-inferiority trials, these authors present a simultaneously insightful and illogical data set examining airway management strategies during CPR in out-of-hospital arrest.

This is a clinical trial from France, randomizing patients in out-of-hospital cardiac arrest to either bag-valve mask ventilation or placement of an advanced airway during CPR. Patients with significant challenges associated with BVM could cross over to ETI, and any patient achieving return of spontaneous circulation was subsequently intubated as well. All emergency response and airway management was supervised by an emergency physician. Groups were fairly well matched by their center-level randomization, and about 10% of the cohort crossed over from BVM to ETI due to failure of ventilation or gastric regurgitation.

Without wallowing too much in the statistical underpinnings, these authors defined a 1% absolute difference in favorable neurologic outcome at 30 days as the non-inferiority margin for their primary outcome. Then, if non-inferiority could not be demonstrated, a test of difference would be performed to assess for inferiority.

And, so, after all this, CPC 1-2 survival was: 4.3% in the BVM group and 4.2% in the ETI group, for a difference of 0.11% (1-sided 97.5% CI, −1.64% to infinity). It should be abundantly obvious – considering the near-identical results and the gross failure to meet their statistical threshold – that their sample size was completely inadequate. It then unsurprisingly follows that their test of difference does not demonstrate inferiority of BVM as compared with ETI.
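The arithmetic behind that confidence interval is worth seeing once. Here is a minimal sketch of a Wald-style one-sided 97.5% CI for the risk difference, using the published event rates and arm sizes of roughly 1,000 apiece (the arm sizes are assumed for illustration, not taken from the paper):

```python
import math

# Minimal sketch: Wald one-sided 97.5% CI for a risk difference.
# Event rates are from the published result; arm sizes of 1,000
# apiece are ASSUMED for illustration, not taken from the paper.
def risk_difference_ci(p_exp, n_exp, p_ctrl, n_ctrl, z=1.96):
    """Return (difference, one-sided lower confidence bound)."""
    diff = p_exp - p_ctrl
    se = math.sqrt(p_exp * (1 - p_exp) / n_exp + p_ctrl * (1 - p_ctrl) / n_ctrl)
    return diff, diff - z * se

diff, lower = risk_difference_ci(0.043, 1000, 0.042, 1000)
# The lower bound (roughly -1.7% here) falls below the -1%
# non-inferiority margin, so non-inferiority is NOT demonstrated.
print(f"difference {diff:+.2%}, lower bound {lower:+.2%}")
```

With event rates this low and a margin this tight, the required sample size balloons far beyond what was enrolled – which is the complaint above in numeric form.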

So, yes, BVM is not inferior, but also not non-inferior, to ETI. This is why everyone hates Journal Club.

In the bigger picture, these data generally support deferring advanced airway placement during CPR. These authors did not observe any differences in low-flow time, even in their ETI group – but I would expect this might be a “best case scenario” with respect to minimal interruption and successful airway management.  Considering they have an EP assisting in the resuscitation and airway management, I would probably expect other settings are more prone to airway management failure and interruptions – and, really, the onus should be on those doing more to find solid data to back up their prehospital intervention.

“Effect of Bag-Mask Ventilation vs Endotracheal Intubation During Cardiopulmonary Resuscitation on Neurological Outcome After Out-of-Hospital Cardiorespiratory Arrest: A Randomized Clinical Trial”
https://jamanetwork.com/journals/jama/article-abstract/2673550

The Elephant in the PECARN/CHALICE/CATCH Room

A few months ago, I wrote about the main publication from this study group – a publication in The Lancet detailing a robust performance comparison between the major pediatric head injury decision instruments. Reading between the lines, as I mentioned then, it seemed as though the important unaddressed result was how well physician judgment performed – only 8.3% of the entire cohort underwent CT.

This, then, is the follow-up publication in Annals of Emergency Medicine focusing on the superiority of physician judgment. Just to recap, this study included 18,913 patients assessed as having mild head injury. Of these, 160 had a clinically important traumatic brain injury (ciTBI) and 24 underwent neurosurgery. The diagnostic performance of these decision instruments is better detailed in the other article but, briefly, for ciTBI:

  • PECARN – ~99% sensitive, 52 to 59.1% specific
  • CHALICE – 92.5% sensitive, 78.6% specific
  • CATCH – 92.5% sensitive, 70.4% specific

These rules, given their specificity, would commit patients to CT scan rates of 20-30% in the case of CHALICE and CATCH, and then an observation or CT rate of ~40% for PECARN. But how did physician judgment perform?

  • Physicians – 98.8% sensitive, 92.4% specific

Which is to say, physicians missed two injuries – each detected a week later in follow-up for persistent headaches – while performing CTs in only 8.3% of the population. As I highlighted in this past month’s ACEPNow, clinical decision instruments are frequently placed on a pedestal based on their own performance characteristics in a vacuum, and rarely compared with clinician judgment – and, frequently, clinician judgment is as good or better. It’s fair to say these head injury decision instruments, depending on the prevalence of injury and the background level of advanced imaging, may actually be of little value.
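The imaging burden each strategy implies can be reproduced from first principles – a rule’s test-positive rate is just sensitivity × prevalence plus (1 − specificity) × (1 − prevalence). A quick sketch using the figures above (PECARN’s specificity is taken at a midpoint of its reported range):

```python
# Back-of-the-envelope: the fraction of children a rule commits to
# CT or observation follows directly from its sensitivity, specificity,
# and the prevalence of injury. Figures from the article; prevalence
# is 160 ciTBI among 18,913 children.
def positive_rate(sens, spec, prev):
    return sens * prev + (1 - spec) * (1 - prev)

prev = 160 / 18913  # ~0.85% ciTBI prevalence
for name, sens, spec in [
    ("PECARN", 0.99, 0.55),      # midpoint of the 52-59.1% range
    ("CHALICE", 0.925, 0.786),
    ("CATCH", 0.925, 0.704),
    ("Clinicians", 0.988, 0.924),
]:
    print(f"{name:10s} ~{positive_rate(sens, spec, prev):.0%} flagged")
```

Reassuringly, the clinician line lands right at the observed 8.3% CT rate, while the rules land in the 20-45% range quoted above.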

“Accuracy of Clinician Practice Compared With Three Head Injury Decision Rules in Children: A Prospective Cohort Study”
http://www.annemergmed.com/article/S0196-0644(18)30028-3/fulltext

Only Anesthesiology Knows Sedation

“These guidelines are intended for use by all providers who perform moderate procedural sedation and analgesia in any inpatient or outpatient setting …”

That is to say, effectively by fiat, if you perform procedural sedation, these guidelines apply to YOU.

This is a publication by the American Society of Anesthesiologists, and sponsored by various dental and radiology organizations. This replaces a 2012 version of this document – and it has changed for both better and worse.

Falling into the “better” column, this guideline no longer perpetuates the myth of requiring a period of fasting prior to an urgent or emergent procedure. Their new recommendation:

“In urgent or emergent situations where complete gastric emptying is not possible, do not delay moderate procedural sedation based on fasting time alone”

However, some things are definitely “worse”. By far the largest problem with these guidelines – reflecting the exclusion of emergency medicine and critical care specialties from the writing or approving group – is their classification of propofol and ketamine as agents intended for general anesthesia. They specifically differentiate practice with these agents from the use of benzodiazepines or adjunctive opiates by stating:

“When moderate procedural sedation with sedative/analgesic medications intended for general anesthesia by any route is intended, provide care consistent with that required for general anesthesia.”

These guidelines do not describe the care of patients receiving general anesthesia but, obviously, we are not performing general anesthesia in the Emergency Department – and I expect most hospitals do not credential their Emergency Physicians for general anesthesia. The practical impact of these guidelines on individual health system policy is unclear, particularly in the context of decades of safe use of these medications by EPs, but it’s certainly one more pretentious obstacle to providing safe and effective care for our patients.

“Practice Guidelines for Moderate Procedural Sedation and Analgesia 2018”

http://anesthesiology.pubs.asahq.org/article.aspx?articleid=2670190

“The Newest Threat to Emergency Department Procedural Sedation”

https://www.ncbi.nlm.nih.gov/pubmed/29429580

Using PERC & Sending Home Pulmonary Emboli For Fun and Profit

The Pulmonary Embolism Rule-Out Criteria have been both lauded and maligned, depending on which day the literature is perused. There are case reports of large emboli in patients who are PERC-negative, as well as reports of PE prevalence as high as 5% – in contrast to the <1.8% point of equipoise stated in its derivation. So, the goal here is to be the prospective trial to end all trials, and to most accurately describe the impact of PERC on practice and outcomes.

This is a cluster-randomized trial across 14 Emergency Departments in France. Centers were randomized to either a PERC-based work-up strategy for PE or a “conventional” strategy in which virtually every patient considered for PE was tested with D-dimer. Interestingly, these 14 centers also crossed over to the alternative algorithm approximately halfway through the study period, so every ED was exposed to both interventions – some using PERC first, and vice versa.

Overall, they recruited 1,916 patients across the two enrollment periods, and these authors focused on the 1,749 who received per-protocol testing and were not lost to follow-up. The primary outcome was any new diagnosis of venous thromboembolism at 3-month follow-up – their measure of, essentially, clinically important VTE missed upon exiting the algorithm. The headline result, in the per-protocol population: 1 patient was diagnosed with VTE in follow-up in the PERC group, compared with none in the control cohort. This met their criteria for non-inferiority and, at face value, the PERC-based strategy is clearly reasonable. There were 48 patients lost to follow-up, but, given the overall prevalence of PE in this population, it is unlikely these lost patients would have affected the overall results.

There are a few interesting bits to work through from the characteristics of the study cohort. The vast majority of patients considered for the diagnosis of PE were “low risk” by either Wells or simplified Revised Geneva Score. However, 91% of those in the PERC cohorts were “low risk”, as compared to 78% in the control cohort – which, considering the structure of this trial, seems unlikely to have occurred by chance alone. In the PERC cohort, about half failed to meet PERC and these patients – plus a few protocol violations – moved forward with D-dimer testing. In the conventional cohort, 99% were tested with D-dimer in accordance with their algorithm.

There were, again, more odd descriptive results at this point. D-dimer testing (threshold ≥0.5 µg/mL) was positive in 343 of the PERC cohort and 471 of the controls. However, physicians moved forward with CTPA in only 38% of the PERC cohort and 46% of the conventional cohort. It is left entirely unaddressed why patients entered a PE rule-out pathway and ultimately never received a definitive imaging test after a D-dimer above threshold. For what it’s worth, the smaller number of patients undergoing evaluation for PE in the PERC cohort led to fewer diagnoses of PE, fewer downstream hospital admissions and anticoagulant prescriptions, and a shorter ED length of stay. The absolute numbers are small, but patients in the control cohort undergoing CTPA were more likely to have subsegmental PEs (5 vs. 1), which, again, ought to generally make sense.

So, finally, what is the takeaway here? Should you use a PERC-based strategy? As usual, the answer is: it depends. Firstly, it is almost certainly the case the PERC-based algorithm is safe to use. Then, if your current approach is to carpet bomb everyone with D-dimer and act upon it, yes, you may see dramatic improvements in ED processes and resource utilization. However, as we see here, the prevalence of PE is so low, strict adherence to a PERC-based algorithm is still too clinically conservative. Many elevated D-dimers did not undergo CTPA in this study – and, with three month follow-up, they obviously did fine. Frankly, given the shifting gestalt relating to the work-up of PE, the best cut-off is probably not PERC, but simply stopping the work-up of most patients not intermediate- or high-risk.

“Effect of the Pulmonary Embolism Rule-Out Criteria on Subsequent Thromboembolic Events Among Low-Risk Emergency Department Patients: The PROPER Randomized Clinical Trial”
https://jamanetwork.com/journals/jama/fullarticle/2672630

EDACS vs. HEART – But Why?

The world has been obsessed over the past few years with the novelty of clinical decision rules for the early discharge of chest pain. After several years of battering the repurposed Thrombolysis in Myocardial Infarction (TIMI) score, the History, Electrocardiogram, Age, Risk factors and Troponin (HEART) score became ascendant, but there are several other candidates out there.

One of these is the Emergency Department Assessment of Chest pain Score (EDACS), which is less well-known but has reasonable face validity. It does a good job identifying a “low-risk” cohort, but is more complicated than HEART. There is also a simplified version of EDACS that eliminates some of the complicated subtractive elements of the score. This study pits these various scores head-to-head, in the context of conventional troponin testing as well.

This is a retrospective review of 118,822 patients presenting to Kaiser Northern California Emergency Departments, narrowed to those whose initial Emergency Department evaluation was negative for acute coronary syndrome. The 60-day MACE rate (a composite of myocardial infarction, cardiogenic shock, cardiac arrest, and all-cause mortality) in this cohort was 1.9%, most of which was acute MI. Interestingly, these authors chose to present only the negative predictive values of the instruments, which means – considering such low prevalence – the ultimate rate of MACE in the low-risk cohort defined by each decision instrument was virtually identical. Negative predictive values of all three scores depended primarily on the troponin cut-off used, and were ~99.2% for ≤0.04 ng/mL and ~99.5% for ≤0.02 ng/mL. The largest low-risk cohort was defined by the original EDACS rule, exceeding the HEART score classification by an absolute quantity of about 10% of the total cohort, regardless of the troponin cut-off used.
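The point about NPV at low prevalence is easy to make concrete: with 60-day MACE at 1.9%, even an uninformative test has an NPV of ~98%, so ~99.2-99.5% is only a modest step above baseline. A hedged sketch (the sensitivity and specificity values here are illustrative, not taken from the paper):

```python
# Illustrative only: NPV at the study's 1.9% MACE prevalence for a
# worthless (coin-flip) test versus a reasonably sensitive one. The
# sensitivity/specificity inputs are ASSUMED, not from the paper.
def npv(sens, spec, prev):
    true_neg = spec * (1 - prev)
    false_neg = (1 - sens) * prev
    return true_neg / (true_neg + false_neg)

prev = 0.019  # 60-day MACE prevalence in this cohort
print(f"coin-flip test: {npv(0.50, 0.50, prev):.1%}")  # ~98.1%, the baseline
print(f"sensitive rule: {npv(0.95, 0.50, prev):.1%}")
```

Which is why NPV alone, at this prevalence, tells us very little about which score is actually better.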

The editorial accompanying the article lauds these data as supporting the use of these tools for early discharge from the Emergency Department. However, this is an outdated viewpoint, particularly considering the data showing early non-invasive evaluations are of uncertain value. In reality, virtually all patients who have been ruled out for ACS in the ED can be discharged home, regardless of risk of MACE. The value of these scores probably lies less in determining who can be discharged than in helping triage patients for closer primary care or specialist follow-up. Then, individualized plans can be developed for optimal medical management, or for assessment of the adequacy of the coronary circulation, to prevent whatever MACE is feasibly preventable.

“Performance of Coronary Risk Scores Among Patients With Chest Pain in the Emergency Department”
http://www.onlinejacc.org/content/71/6/606

“Evaluating Chest Pain in the Emergency Department: Searching for the Optimal Gatekeeper.”
http://www.onlinejacc.org/content/71/6/617

The qSOFA Story So Far

What do you do when another authorship group performs the exact same meta-analysis and systematic review you’ve been working on – and publishes first? Well, there really isn’t much choice – applaud their great work and learn from the experience.

This is primarily an evaluation of the quick Sequential Organ Failure Assessment (qSOFA), with a little of the old Systemic Inflammatory Response Syndrome (SIRS) thrown in for contextual comparison. The included studies spanned the Intensive Care Unit, hospital wards, and Emergency Departments. Their primary outcome was mortality, reported in these studies mostly as in-hospital mortality, but also as 28-day and 30-day mortality.

The quick synopsis of their results, pooling 38 studies and 383,333 patients, mostly from retrospective studies, and mostly from ICU cohorts:

  • qSOFA is not terribly sensitive, particularly in the settings in which it is most relevant. Their reported overall sensitivity of 60.8% is inflated by its performance in ICU patients, and in ED patients sensitivity is only 46.7%.
  • Specificity is OK, at 72.0% overall and 81.3% in the ED. However, the incidence of mortality from sepsis is usually low enough in a general ED population that the positive predictive value will be fairly weak.
  • In their comparative cohort for SIRS, which is frankly probably irrelevant because SIRS is already well-described, the expected results of higher sensitivity and lower specificity were observed.
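That weak-PPV point can be made concrete with Bayes’ rule. This sketch uses the ED sensitivity and specificity above, with an assumed 5% mortality prevalence (a round number for illustration, not taken from the meta-analysis):

```python
# Bayes' rule sketch: even with 81.3% specificity, a low event rate
# crushes the positive predictive value. ED sensitivity/specificity
# are from the meta-analysis; the 5% mortality prevalence is an
# ASSUMED round number for illustration.
def ppv(sens, spec, prev):
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)

print(f"qSOFA PPV at 5% prevalence: {ppv(0.467, 0.813, 0.05):.0%}")  # ~12%
```

That is, roughly seven of every eight qSOFA-positive ED patients at that prevalence would not die – hardly a precise trigger for a resource-intensive sepsis bundle.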

Their general conclusion, with which I generally agree, is that qSOFA is not an appropriate general screening tool. They did not add much from a further editorial standpoint – so, rather than let our own draft manuscript for this same meta-analysis and systematic review languish unseen, here is an abridged version of the Discussion section of our manuscript, written by myself, Rory Spiegel, and Jeremy Faust:

This analysis demonstrates qualitatively similar findings as those observed in the original derivation study performed by Seymour et al. We find our pooled AUC, however, to be lower than the 0.81 reported in their derivation and validation cohort, as well as the 0.78 reported in two external validation cohorts. The meaning of this difference is difficult to interpret, as the clinical utility of this instrument is derived from its use as a binary cut-off, rather than an ordinal AUC. Our sensitivity and specificity from our primary analysis, respectively, compare favorably to their reported 55% and 84%. We also found qSOFA’s predictive capabilities remained robust when exposed to our sensitivity analyses. When only studies at low risk for bias were included, qSOFA’s performance improved.

While our evaluation of SIRS is limited by restricting the comparison solely to those studies which contemporaneously reported qSOFA, our results are broadly consistent with those previously reported. The SIRS criteria at the commonly used cut-off benefit from superior sensitivity for mortality in those with suspected infection, while their specificity is clearly lacking, owing to an impaired capability to distinguish between clinically important immune system dysregulation and normal host responses to physiologic stress. The important discussion, therefore, is whether and how to incorporate each of these tools – and others, such as the Modified Early Warning Score or National Early Warning Score – into clinical practice, guidelines, and quality measures.

The current approach to sepsis revolves around the perceived significant morbidity and mortality associated with under-recognized sepsis, favoring screening tools whose purpose is minimizing missed diagnoses. Current sepsis algorithms typically rely upon SIRS, depending on its maximal catchment at the expense of over-triage. Such maximal catchment almost certainly represents a low-value approach to sepsis, considering the in-hospital mortality of patients in our cohort with ≥2 SIRS criteria is not meaningfully different than the overall mortality of the entire cohort. The subsequent fundamental question, however, is whether qSOFA and its role in the new sepsis definitions provides a structure for improvement.

Using qSOFA as designed with its cut-off of ≥2, it should be clear its sensitivity does not support its use as an early screening tool, despite its simplicity and exclusion of laboratory measures. However, in a cohort with suspected infection and some physiologic manifestations of sepsis, e.g., SIRS, the true value of qSOFA may be in prioritizing a subgroup for early clinical evaluation. In a healthcare system with unlimited resources, it may be feasible to give each patient uncompromising evaluation and care. Absent that, we must hew towards an idealized approach, where our resources are directed towards those highest-yield patients for whom time-sensitive interventions modify downstream outcomes.

Less discussed are the direct, patient-oriented harms resulting from falsely-positive screening tools and over-enrollment into sepsis bundles. Recent data suggests benefits from shorter time-to-antibiotics administration intervals are realized primarily in critically ill patients. As such, utilization of overly sensitive tools, such as the SIRS criteria, would lead to over-triage and over-treatment, leading to potential iatrogenic harms in excess of net benefits. These harms include effects on individual and community patterns of antibiotic resistance, as exposure to broad-spectrum antibiotics leads to induction of extended-spectrum beta-lactamase resistance in gram-negative pathogens or vancomycin- and carbapenem-resistance in enterococci. Unnecessary antibiotic exposures lead to excess cases of C. difficile infections. The aggressive fluid resuscitation mandated by sepsis bundles leads to metabolic derangement and potential respiratory impairment. Further research should assess the extent of these harms, and in what measure they counterbalance those benefiting from time-sensitive interventions.

This meta-analysis has several limitations. First, we were limited by the relative dearth of high quality prospective data; most of the studies included in our analysis were retrospective. Second, we restricted our prognostic analyses to mortality alone, rather than diagnosis of sepsis. We chose to analyze only mortality because of competing sepsis definitions among expert bodies and government-issued guidelines. Among them, however, mortality is a common feature, the most objective metric, and manifestly the most important patient-centered outcome. Our analysis would not capture other important sequelae of sepsis, including amputation, loss of neurologic and/or independent function, chronic pain, and prolonged psychiatric effects of substantial critical illness. Third, we do not know whether patients included in these studies were septic on presentation, or developed sepsis later in their hospitalization. This may degrade the accuracy assessment of both SIRS and qSOFA. Fourth, while we know that qSOFA alone may miss some cases of sepsis that SIRS might detect, we do not know how many would, in reality, have been deprived of antibiotics and other necessary treatments. In other words, the fate of “qSOFA negative” patients who were evaluated and treated by physicians qualified to detect and treat critical illness via clinical acumen is not known; nor should it be presumed that all such patients would have necessarily been deprived of timely treatment. Our analysis and comparison of SIRS is admittedly incomplete, and not the most reliable estimate of its diagnostic characteristics, but is provided for incidental comparison.

The prudent clinical role for qSOFA, however, is as yet undefined, and these data do not offer insight regarding its superiority to clinician judgment for determining a cohort at greatest risk for poor outcomes. Compared with SIRS, at least, those patients identified by qSOFA likely better represent the subset of patients for whom aggressive early treatment confers a particular advantage, and may drive high-value care in the sepsis arena. Future research should assist clinicians in further individualizing initial treatment of sepsis for those stratified to differing levels of risk for poor outcome, as well as to account for the iatrogenic harms and system costs.

“Prognostic Accuracy of the Quick Sequential Organ Failure Assessment for Mortality in Patients With Suspected Infection: A Systematic Review and Meta-analysis”
http://annals.org/aim/fullarticle/2671919/prognostic-accuracy-quick-sequential-organ-failure-assessment-mortality-patients-suspected