Blood Cultures Save Lives and Other Pearls of Wisdom

It’s been sixteen years since the introduction of Early Goal-Directed Therapy in the Emergency Department. For the past decade and a half, our lives have been turned upside-down by quality measures tied to the elements of this bundle. Remember when every patient with sepsis was mandated to receive a central line? How great were the costs – in real dollars, in time, and in actual harms from these well-intentioned yet erroneous directives based on a single trial?

Regardless, thanks to the various follow-ups testing strict protocolization against the spectrum of timely recognition and aggressive intervention, we’ve come a long way. However, there are still mandates incorporating the vestiges of such elements of care – such as those introduced by the New York State Department of Health. Patients diagnosed with severe sepsis or septic shock are required to complete protocols consisting of 3-hour and 6-hour bundles including blood cultures, antibiotics, and intravenous fluids, among others.

This article, from the New England Journal, looks retrospectively at the mortality rates associated with completion of these various elements. Stratified by time-to-completion following initiation of the 3-hour bundle within 6 hours of arrival to the Emergency Department, these authors looked at the mortality associations of the bundle elements.

Winners: obtaining blood cultures, administering antibiotics, and measuring serum lactate
Losers: time to completion of a bolus of intravenous fluids

Of course, since blood cultures are obtained prior to antibiotic administration, these outcomes are co-linear – and they don’t actually save lives, as facetiously suggested in the post heading. But, antibiotic administration was associated with a fraction of a percent of increased mortality per hour delay over the first 12 hours after initiation of the bundle. Intravenous fluid administration, however, showed no apparent association with mortality.

These data are fraught with issues, of course, relating to their retrospective nature and the limitations of the underlying data collection. Their adjusted model accounts for a handful of features, but there are still potential confounders influencing mortality of those who received their bundle completion within 3 hours as compared to those who did not. The authors are appropriately reserved with their conclusions, however, only stating these observational data support associations between mortality and antibiotic administration, and do not extend to any causal inferences.

The lack of an association between intravenous fluids and mortality, however, raises significant questions requiring further prospective investigation. Could it be, after these years wandering in the wilderness with such aggressive protocols, the only universally key feature is the initiation of appropriate antibiotics? Do our intravenous fluids, given without regard to individual patient factors, simply harm as many as they help, resulting in no net benefit?

These questions will need to be addressed in randomized controlled trials before the next level of evolution in our approach to sepsis, but the equipoise for such trials may now exist – to complete our journey from Early Goal-Directed to Source Control and Patient-Centered.

“Time to Treatment and Mortality during Mandated Emergency Care for Sepsis”

You’ve Got (Troponin) Mail

It’s tragic, of course, no one in this generation will understand the epiphany of logging on to America Online and being greeted by its almost synonymous greeting “You’ve got mail!” But, we and future generations may bear witness to the advent of something almost as profoundly uplifting: text-message troponin results.

These authors conceived and describe a fairly simple intervention in which test results – in this case, troponin – were pushed to clinicians’ phones as text messages. In a pilot and cluster-randomized trial with 1,105 patients for final analysis, these authors find the median interval from troponin result to disposition decision was 94 minutes in a control group, as compared with 68 minutes in the intervention cohort. However, a smaller difference in median overall length of stay did not reach statistical significance.

Now, I like this idea – even though this is clearly not the study showing generalizable definitive benefit. For many patient encounters, there is some readily identifiable bottleneck result of greatest importance for disposition. If a reasonable, curated list of these results is pushed to a mobile device, there is an obvious time savings compared with manually pulling these results from the electronic health record.

In this study, however, the median LOS for these patients was over five hours – and their median LOS for all patients receiving at least one troponin was nearly 7.5 hours. The relative effect size, then, is really quite small. Next, there are always concerns relating to interruptions and unintended consequences on cognitive burden. Finally, it logically follows if this text message derives some of its beneficial effect by altering task priorities, then some other process in the Emergency Department is having its completion time increased.
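As a rough back-of-the-envelope check on that relative effect size, the sketch below expresses the 26-minute difference in result-to-disposition time as a share of the lengths of stay quoted above. The 300 and 450 minute figures are my own stand-ins for "over five hours" and "nearly 7.5 hours", not numbers from the paper, and the comparison is generous to the intervention, since the observed difference in overall length of stay was smaller still.

```python
# Medians reported in the post, in minutes.
CONTROL_RESULT_TO_DISPO = 94
TEXT_ALERT_RESULT_TO_DISPO = 68

# Assumed stand-ins for "over five hours" and "nearly 7.5 hours".
ASSUMED_LOS_MINUTES = {"study patients": 300, "all troponin patients": 450}

minutes_saved = CONTROL_RESULT_TO_DISPO - TEXT_ALERT_RESULT_TO_DISPO


def share_of_stay(saved, los):
    """Return the time saved as a fraction of the total length of stay."""
    return saved / los


for label, los in ASSUMED_LOS_MINUTES.items():
    share = share_of_stay(minutes_saved, los)
    print(f"{label}: {minutes_saved} min is {share:.1%} of {los} min")
```

Under these assumptions the savings amount to well under a tenth of the visit, which is the sense in which the relative effect is small.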

I expect, if implemented in a typically efficient ED, the net result of any improvement might only be a few minutes saved across all encounter types – but multiplied across thousands of patient visits for chest pain, it’s still worth considering.

“Push-Alert Notification of Troponin Results to Physician Smartphones Reduces the Time to Discharge Emergency Department Patients: A Randomized Controlled Trial”

Correct, Endovascular Therapy Does Not Benefit All Patients

Unfortunately, that headline is the strongest takeaway available from these data.

Currently, endovascular therapy for stroke is recommended for all patients who have a proximal arterial occlusion and can be treated within six hours. The much-ballyhooed “number needed to treat” for benefit is approximately five, and we have authors generating nonsensical literature with titles such as “Endovascular therapy for ischemic stroke: Save a minute—save a week” based on statistical calisthenics from this treatment effect.

But, anyone actually responsible for making decisions for these patients understands this is an average treatment effect. The profound improvements of a handful of patients with the most favorable treatment profiles obfuscate the limited benefit derived by the majority of those potentially eligible.

These authors have endeavored to apply a bit of precision medicine to the decision regarding endovascular intervention. Using ordinal logistic regression modeling, these authors used the MR CLEAN data to create a predictive model for good outcome (mRS score 0-2 at 90 days). These authors subsequently used the IMS-III data as their validation cohort. The final model displayed a C-statistic of 0.69 for the ordinal model and 0.73 for good functional outcome – which is to say, the output is closer to a coin flip than an informative prediction for use in clinical practice.
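For readers who do not live in the world of C-statistics, here is a minimal sketch of what the number measures: across every pairing of a patient with a good outcome and a patient with a poor outcome, how often did the model give the higher predicted probability to the patient who actually did well? A value of 0.5 is the coin flip and 1.0 is perfect discrimination. The predicted probabilities below are invented purely for illustration and have nothing to do with the MR CLEAN or IMS-III data.

```python
from itertools import product


def c_statistic(scores_good, scores_poor):
    """Share of (good, poor) outcome pairs ranked correctly; ties count half."""
    concordant = 0.0
    for good, poor in product(scores_good, scores_poor):
        if good > poor:
            concordant += 1.0
        elif good == poor:
            concordant += 0.5
    return concordant / (len(scores_good) * len(scores_poor))


# Hypothetical predicted probabilities of a good outcome (illustration only).
predicted_for_good_outcomes = [0.8, 0.6, 0.55, 0.4]
predicted_for_poor_outcomes = [0.7, 0.5, 0.3, 0.2]

print(c_statistic(predicted_for_good_outcomes, predicted_for_poor_outcomes))
```

In this toy example a quarter of the pairs are ranked the wrong way round, and the model would still score slightly better than the figures reported for the decision tool.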

More important, however, is whether the substrate for the model is anachronistic, limiting its generalizability to modern practice. Beyond MR CLEAN, subsequent trials have demonstrated the importance of underlying tissue viability using either CT perfusion or MRI-based selection criteria when making treatment decisions. Their model is limited in its inclusion of just a measure of collateral circulation on angiogram, which is only a surrogate for potential tissue viability. Furthermore, the MR CLEAN cohort comprises only 500 patients, and the IMS-III validation only 260. This sample is far too small to properly develop a model for such a heterogeneous set of patients as those presenting with proximal cerebrovascular occlusion. Finally, the choice of logistic regression can be debated, simply from a model standpoint, given its assumptions about underlying linear relationships in the data.

I appreciate the attempt to improve outcomes prediction for individual patients, particularly for a resource-intensive therapy such as endovascular intervention in stroke. Unfortunately, I feel the fundamental limitations of their model invalidate its clinical utility.

“Selection of patients for intra-arterial treatment for acute ischaemic stroke: development and validation of a clinical decision tool in two randomised trials”

An Uninsightful Look at Traumatic ICH in Ground Level Falls

The ground is ubiquitous. There are many ways to injure oneself, but the typical readily available impact surface is the ground. The ground is particularly pernicious, it seems, in the elderly and those in assisted care facilities. Thus, we have a great number of elderly patients who have fallen from, apparently, “ground-level” and for whom imaging decisions must be made.

Many of these same elderly patients have multiple medical comorbidities, including those for whom antiplatelet or anticoagulant therapy is indicated. These patients are, then, at elevated risk for intracranial hemorrhage despite the apparent low mechanism of injury. Wouldn’t it be lovely if we had better descriptive data with which to estimate and determine those at greatest risk?

Unfortunately, this fundamentally flawed observational study design tells us quite little. These authors included every patient whose electronic health record included antiplatelet or anticoagulant medications, and who subsequently had intracranial imaging ordered. The EHR, then, prospectively prompted clinicians to indicate “ground-level fall” as their mechanism of injury. Of 668 patients on antiplatelets, 29 (4.3%) demonstrated ICH on CT. Of 180 patients on anticoagulants, 3 (1.7%) suffered ICH. Another 91 were on some sort of combined treatment, and 1 (1.1%) suffered ICH.

And this tells us nothing, other than the risk of ICH is non-zero. Even from a simple frequentist statistical standpoint, the sample sizes are small enough the confidence intervals around these numbers are quite wide. Then, there is the problem of their screening methods – which start after the decision has been made to perform CT. Unless it is specifically protocolized that all patients with a ground-level fall undergo CT, decisions to initiate imaging would depend on the selection bias of individual clinicians. Individual perceptions of the risk of ICH on antiplatelet and anticoagulant medications dramatically impact the rate of imaging – so this ultimately only tells us the risk for ICH in their uniquely selected population. Additionally, without structured follow-up of those not imaged, neither the numerator nor the denominator is reliable in this estimate.
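To put a number on "quite wide", the sketch below computes approximate 95% intervals for the three reported proportions. The Wilson score interval is my choice here because it behaves sensibly with small counts; it is not necessarily the method the authors used, and it does nothing to repair the selection problems described above.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts as reported in the post: (patients with ICH, patients imaged).
groups = {
    "antiplatelet": (29, 668),
    "anticoagulant": (3, 180),
    "combined": (1, 91),
}

for name, (events, n) in groups.items():
    low, high = wilson_interval(events, n)
    print(f"{name}: {events}/{n} = {events / n:.1%} "
          f"(95% CI {low:.1%} to {high:.1%})")
```

For the anticoagulant group, for example, the interval runs from well under 1% to nearly 5%, so the apparent ordering of risk between the three groups should not be taken seriously.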

These patients fall out of all of our decision support instruments, and it would be lovely to have better information regarding their true risk and specific predisposing factors in order to be better stewards of imaging resources and costs. These data unfortunately do not add much to our decision-making substrate.

“Risk of Intracranial Hemorrhage in Ground Level Fall with Antiplatelet or Anticoagulant Agents”

Use HEART, Or Whatever

The HEART score receives a lot of favorable press these days. It generally has face validity. It is probably superior in terms of discriminatory ability versus our venerable candidates such as TIMI and GRACE. It has been well-evaluated in multiple practice settings with reliable predictive value.

But, the final question for a decision instrument distilling a complex clinical scenario down to a five-question substrate for guiding evaluation and disposition – does it safely improve practice?

The answer is no – if you’re Dutch, in these Dutch hospitals.

In a stepped-wedge, cluster-randomized trial, these authors evaluated the effect of using HEART on patient outcomes and healthcare resource utilization. The three HEART risk categories carry general practice recommendations, in which low-risk (0-3) suggests early discharge, intermediate-risk (4-6) noninvasive testing, and high-risk (7-10) early invasive strategies. The comparator, “usual care,” was, well, as usual.

With two cohorts comprising approximately 1,800 patients each, there were probably no reliable differences in care or outcomes demonstrated. The HEART low-risk cohort had a 2.0% 30-day incidence of MACE, which is similar to the safety profile described in other studies. However, the real goal of this evaluation was to determine acceptability and impact on resource utilization – and those results are decidedly mixed. Similar rates of early discharge from the ED, ED observation, inpatient admission, and downstream outpatient utilization were observed between the HEART cohort and usual care.

But, this answer from above – no impact on practice – is argued by these authors to be mostly related to non-adherence to the protocol recommendations. Most importantly, they note nearly a third of their low-risk patients were kept for prolonged ED or chest pain unit observation, and a handful more were admitted. The authors suggest there may be room for improvement in resource utilization, but they encountered entrenched cultural practice barriers.

This study was conducted between July 2013 and August 2014 – a long time ago, before most had heard of HEART. It is reasonable to suggest clinicians would now be more comfortable using this score for early discharge from the Emergency Department than during the trial period. It is probably also reasonable to suggest a more robust cultural effort backing practice change would improve adherence to recommendations – a collective departmental agreement associated with educational initiatives. Finally, usual care entailed early discharge of nearly 50% of all patients with chest pain, so your local baseline will affect whether a HEART-based protocol demonstrates improvement.

While these results in this trial are generally negative, what we see here is probably the floor for the effect of HEART on practice. At a minimum, it is as safe as advertised, and probably has room to demonstrate more robust beneficial effects on practice.

“Effect of Using the HEART Score in Patients With Chest Pain in the Emergency Department”

Symptoms Over Science

There’s a reason general primary care has evolved to diagnose and treat uncomplicated urinary tract infections over the phone: the patient is the authority, not any test we order.

We’ve tried relying upon some constellation of the urinalysis, the urine microscopic examination, and, finally, the urine culture. Each of these has limitations, although, in many settings, the culture result has been the gold standard. However, this culture result, some quantification of the number of colony-forming units, is also somewhat of an arbitrary diagnostic – an arbitrary numerical cut-off must be used, with its own implications for sensitivity and specificity.

This brief clinical microbiology article evaluates the urine culture as a gold standard for the diagnosis of UTI by comparing it with polymerase chain reaction-based methods for measuring the presence of pathogenic bacteria. Based on 86 asymptomatic women and 220 general practice women complaining of UTI symptoms, these authors compared the number of positive culture results with positive PCR results. Of this sample, 149 had positive cultures for E. coli, while 211 patients had positive PCR for E. coli. Finally, combining the culture results – which identified other pathogens, as well – with the PCR for E. coli, 216 of 220 symptomatic women had pathogenic bacteria identified. In the control cohort, there were similar numbers of positive culture and PCR results – ~10% in each, which these authors feel accurately reflects the general rate of asymptomatic bacteriuria in the general population.

These data correlate nicely with similar findings demonstrating a negative urine culture does not exclude clinical improvement while on antibiotics, and thus support the reasonable conclusion that we ought simply to treat appropriate symptomatic patients without specifically relying on testing.

“Women with symptoms of a urinary tract infection but a negative urine culture: PCR-based quantification of Escherichia coli suggests infection in most cases”

Tranexamic Acid & The WOMAN Trial

Tranexamic acid is popular for the treatment of freckles and nosebleeds – oh, and major bleeding in the setting of trauma. But, originally, the drug was developed for use in controlling hemorrhage in obstetrics and gynecology. Finally, then, we have a trial examining its use for its intended purpose.

Comprising 20,060 patients with clinically significant post-partum hemorrhage across 193 hospitals in 21 countries, the WOMAN trial is – inconveniently – negative as originally designed. The initial study design called for 15,000 patients and a composite endpoint of hysterectomy or death within six weeks of childbirth. However, as the study progressed, it was clear the standard practice in the settings involved indicated the intervention was going to have no effect on hysterectomy rates, and the trial was then expanded to examine the effect on mortality.

So, then, with their expanded sample size, does TXA save lives, as reported profusely throughout the lay media?


Mortality within 6 weeks was 2.3% in the TXA cohort and 2.6% with placebo, a relative risk of 0.88 (0.74-1.05).

There is, however, some layered complexity in these outcomes. Broken down by cause of death, deaths due to bleeding were 1.5% in the TXA cohort compared with 1.9% with placebo, reaching “statistical significance” with a p-value of 0.045. Then, if you further unpack these results, it seems even within the TXA cohort there is probably a time-to-treatment effect similar to CRASH-2.  Mortality was 1.2% in those receiving their TXA within 3 hours compared with 1.7% treated with placebo. In those treated beyond 3 hours, there was no difference in outcomes – and much higher mortality, regardless (2.6% vs. 2.5%).

So, what should we take away from these data? Is TXA more than just a treatment for freckles, or are these authors and the lay media exaggerating secondary outcomes in the setting of an overall negative trial? As usual, the answer is a little bit of both. The magnitude of the treatment effect, considering the size of this trial, is very, very small. That said, death is a quite meaningful clinical outcome, TXA is fairly inexpensive, and no specific harms were detected in this trial. Therefore, in the settings in which this trial was conducted – Nigeria, Pakistan, Sudan, Albania, etc. – this is likely an important treatment for post-partum hemorrhage.
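To see just how small "very, very small" is, the sketch below turns the percentages quoted above into absolute risk reductions and numbers needed to treat. It works from the rounded percentages in this post rather than the trial's raw counts, so the results are approximate. It also assumes the "2.6% vs. 2.5%" pair for treatment beyond 3 hours is given as TXA first and placebo second, and it treats the overall mortality difference as a point estimate even though the confidence interval crosses 1.

```python
def absolute_risk_reduction(control_pct, treated_pct):
    """Absolute risk reduction in percentage points (positive favors treatment)."""
    return control_pct - treated_pct


def number_needed_to_treat(control_pct, treated_pct):
    """NNT from two percentages; None when treatment shows no benefit."""
    arr = absolute_risk_reduction(control_pct, treated_pct)
    if arr <= 0:
        return None
    return 100 / arr


# (placebo %, TXA %) as quoted in the post.
comparisons = {
    "all-cause mortality": (2.6, 2.3),
    "death due to bleeding": (1.9, 1.5),
    "treated within 3 hours": (1.7, 1.2),
    "treated beyond 3 hours": (2.5, 2.6),  # assumed order: TXA 2.6, placebo 2.5
}

for label, (placebo, txa) in comparisons.items():
    nnt = number_needed_to_treat(placebo, txa)
    shown = "no benefit" if nnt is None else f"NNT about {nnt:.0f}"
    print(f"{label}: ARR {absolute_risk_reduction(placebo, txa):.1f} points, {shown}")
```

Even in the most favorable early-treatment comparison, a couple of hundred women would need to be treated to prevent one death, which is tolerable only because the drug is cheap and no specific harms were detected.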

In more robust clinical settings where additional resources are typically available to support the resuscitation of women suffering bleeding complications from childbirth, the effect size on mortality is likely even much smaller. There may be clinically important effects regarding hysterectomy, hemostasis, and reduction in transfusion utilization, but I again suspect they will be very small and difficult to quantify without a similarly large trial. Then, as the NNT increases for clinically important outcomes, even the very rare harms of a treatment become relevant – and failure of this trial to detect harms may simply be a limit of its statistical power.

Ultimately, as the mortality benefit decreases, the range of acceptable practice variation for protocols incorporating TXA increases. This is an important trial – but, as is typical, not quite as breathlessly so as reported.

“Effect of early tranexamic acid administration on mortality, hysterectomy, and other morbidities in women with post-partum haemorrhage (WOMAN): an international, randomised, double-blind, placebo-controlled trial”

No Change in Ordering Despite Cost Information

Everyone hates the nanny state. When the electronic health record alerts and interrupts clinicians incessantly with decision-“support”, it results in all manner of deleterious unintended consequences. Passive, contextual decision-support has the advantage of avoiding this intrusiveness – but is it effective?

It probably depends on the application, but in this trial, it was not. This is the PRICE (Pragmatic Randomized Introduction of Cost data through the Electronic health record) trial, in which 75 inpatient laboratory tests were randomized to display of usual ordering, or ordering with contextual Medicare cost information. The hope and study hypothesis was the availability of this financial information would exert a cultural pressure of sorts on clinicians to order fewer tests, particularly those with high costs.

Across three Philadelphia-area hospitals comprising 142,921 hospital admissions in a two-year study period, there were no meaningful differences in lab tests ordered per patient day between the intervention and control groups. Looking at various subgroups of patients, it is also unlikely there were particularly advantageous effects in any specific population.

Interestingly, one piece of feedback the authors report is the residents suggest most of their routine lab test ordering resulted from admission order sets. “Routine” daily labs are set in motion at the time of admission, not part of a daily assessment of need, and thus a natural impediment to improving low-value testing. However, the authors also note – and this is probably most accurate – because the cost information was displayed ubiquitously, physicians likely became numb to the intervention. It is reasonable to expect substantially more selective cost information could have focused effects on an area of particularly high cost or low value.

“Effect of a Price Transparency Intervention in the Electronic Health Record on Clinician Ordering of Inpatient Laboratory Tests”

Dexamethasone Dilemma

Look! On Twitter! Two highly-respected medical minds taking the same trial publication and producing two, very different responses:

The controversy stems from a small study examining the relatively common practice of treating pharyngitis with an oral steroid – usually dexamethasone – for its anti-inflammatory effect. Most pharyngitis does not require antibiotics, and physicians understandably prefer to try something to provide relief from suffering.

This study enrolled 576 patients in a randomized, placebo-controlled, double-blind trial in an outpatient general practice setting. Patients were provided either 10mg of oral dexamethasone or an identical lactose placebo. Patients could only enter into the trial if immediate antibiotics were not prescribed, but physicians were allowed to give a “delayed” prescription for failure to improve.

The trial is statistically negative for the primary outcome, complete resolution of sore throat at 24 hours. Of those assigned to dexamethasone, 22.6% had complete symptom resolution at 24 hours, compared with 17.7% of placebo, an absolute risk difference of 4.7% (-1.8 to 11.2)[sic]. The effect size is slightly larger at 48 hours, 8.7%, which does reach statistical significance – and thus the NNT noted above by Ian Stiell. Nearly all the other secondary outcomes – resource utilization, subsequent antibiotic use, use of pain relief – favor dexamethasone, but generally range in effect size between 1-4%.
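The arithmetic behind that "[sic]" and the 48-hour NNT is easy to check. The sketch below uses only the percentages quoted above; the function names are mine, and rounding the NNT up to a whole patient is a convention I am assuming rather than something taken from the paper.

```python
import math


def risk_difference(treated_pct, control_pct):
    """Absolute difference in percentage points (positive favors treatment)."""
    return treated_pct - control_pct


def nnt_from_difference(diff_pct):
    """Number needed to treat, rounded up to a whole patient."""
    return math.ceil(100 / diff_pct)


# Primary outcome at 24 hours: complete resolution, dexamethasone vs placebo.
diff_24h = risk_difference(22.6, 17.7)
print(f"24-hour difference from the quoted percentages: {diff_24h:.1f} points")

# The 48-hour difference is quoted directly as 8.7 percentage points.
print(f"48-hour NNT: {nnt_from_difference(8.7)}")
```

Subtracting the two quoted percentages does not reproduce the reported 4.7%, hence the "[sic]". Since the 24-hour interval includes zero, an NNT only makes sense for the 48-hour result.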

Does the failure to meet statistical significance for the primary outcome refute this therapy as effective? Not hardly – but it certainly calls into question whether the difference is reproducible or clinically meaningful. Plug these data into Ioannidis’ framework regarding the reliability of research findings, and we see this is precisely the sort of work where both conclusions are reasonable. Is there a signal for a symptomatic benefit? Absolutely. The strength of the signal, however, is not strong enough to overcome whatever pre-study odds you placed on the treatment being successful. If, like many, you feel this is a treatment likely beneficial, this study appears confirmatory. If, like many, you feel systemic steroids for symptomatic pharyngitis are inane, this study does little to change your view of the inadequate risk/benefit ratio.

Another possible interpretation of these data is the possibility of variable effects within subgroups, where the entire small effect size seen in these data results from a more substantial effect size in some fraction of the cohort. For example, the mean duration of symptoms was ~3.9 days, with a SD of ~1.7 days. Could the recency of symptoms be associated with likelihood of benefit? Any secondary analyses such as these, particularly in a small trial like this, would only serve as fodder for future investigations.

I have seen, however, other folks using this as an opportunity to link to the recent BMJ publication regarding adverse events and corticosteroid exposures. Without delving into that publication in detail, it would be a mistake to generalize those data to this population. That said, systemic corticosteroids are certainly not harmless. These authors rather ludicrously state “Short courses of oral steroids have been shown to be safe, in the absence of contraindications” – justified by a citation from 1982.

The final answer is somewhere in between our two friends above. Dexamethasone will help some patients with symptom relief from pharyngitis, and it will harm some. Teasing out a prediction of the optimal risk/benefit for a patient is substantially challenging – and wide practice variation is justifiable from these data, as long as the uncertainty in the evidence base is acknowledged.

“Effect of Oral Dexamethasone Without Immediate Antibiotics vs Placebo on Acute Sore Throat in Adults”