The Futility of NSAIDs for Back Pain?

This article filled with reproach for non-steroidal anti-inflammatories was highlighted in a New England Journal of Medicine Journal Watch and on Twitter – a wistful treatise remarking on the general ineffectiveness of pharmacologic analgesics. “Nothing works!” accompanied by a general gnashing of teeth and writhing on invisible flames.

But – does this meta-analysis actually reach such a conclusion? Examine the first few words in their conclusion:

NSAIDs are effective for spinal pain …

Off to a good start! But, the catch:

… but the magnitude of the difference in outcomes between the intervention and placebo groups is not clinically important.

These authors pool the results of 35 randomized, placebo-controlled trials for “spinal pain”, which is to say undifferentiated pain relating anatomically to any part of the spine. These trials comprised 6,065 participants – or, if you do the math, an average of 173 patients per trial, nearly all of them performed over a decade ago. The pooled effects of these trials all favored NSAIDs – but, as the authors mention, the absolute magnitude of effect on pain scales was at the edge of their threshold for clinical significance. The authors defined a difference of 10 points on a 100-point scale as clinically important, but most of their pooled results landed between -7 and -16, favoring NSAIDs over placebo. With these small samples, generally moderate GRADE quality, and moderate to high heterogeneity between the pooled results, there is a lot of fuzziness around their ultimate conclusion.

These authors perform many other exploratory analyses, and it is reasonable to suggest the limitations inherent to each render any conclusions unreliable. Adverse events, as reported, were similar between groups – except for increased gastrointestinal adverse events, most of which were non-serious. The authors report this difference as a relative risk of 2.5 for GI side effects in their comparison, but the absolute differences are on the order of an excess of 1 in 100.
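To see how a relative risk of 2.5 squares with an excess of roughly 1 in 100, it helps to back out the event rates those two figures imply. The sketch below is a back-of-envelope calculation using only the two numbers quoted above; the function name and the implied rates are illustrative assumptions, not values reported by the authors.

```python
def implied_risks(relative_risk, absolute_excess):
    """Back out control and treated event rates from a relative risk and
    an absolute risk difference (both taken from the summary above)."""
    if relative_risk <= 1:
        raise ValueError("relative_risk must be greater than 1")
    # treated = control * RR and treated - control = excess,
    # so control = excess / (RR - 1)
    control = absolute_excess / (relative_risk - 1)
    treated = control * relative_risk
    return control, treated


# Figures quoted in the text: RR of 2.5, excess of about 1 in 100.
control, treated = implied_risks(2.5, 0.01)
nnh = 1 / (treated - control)  # number needed to harm

print(f"Implied placebo GI event rate: {control:.2%}")
print(f"Implied NSAID GI event rate:   {treated:.2%}")
print(f"Number needed to harm:         {nnh:.0f}")
```

Under these assumptions, the implied rates are roughly 0.7% on placebo and 1.7% on NSAIDs, or about one additional, mostly non-serious, GI event per 100 patients treated.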

This is probably much ado about nothing. Their perspective is not inaccurate, per se, but these trials do find a consistent benefit to NSAIDs. The value judgment here on clinical effectiveness probably misses the mark, particularly considering these drugs are inexpensive and readily available, with few adverse effects in short-term use. I would argue it is easier to defend the position that they still have utility in multi-modal pain control regimens than to conclude they should be consigned to the rubbish bin.

“Non-steroidal anti-inflammatory drugs for spinal pain: a systematic review and meta-analysis”


Blood Cultures Save Lives and Other Pearls of Wisdom

It’s been sixteen years since the introduction of Early Goal-Directed Therapy in the Emergency Department. For the past decade and a half, our lives have been turned upside-down by quality measures tied to the elements of this bundle. Remember when every patient with sepsis was mandated to receive a central line? How great were the costs – in real, in time, and in actual harms from these well-intentioned yet erroneous directives based off a single trial?

Regardless, thanks to the various follow-ups testing strict protocolization against the spectrum of timely recognition and aggressive intervention, we’ve come a long way. However, there are still mandates incorporating the vestiges of such elements of care – such as those introduced by the New York State Department of Health. Patients diagnosed with severe sepsis or septic shock are required to complete protocols consisting of 3-hour and 6-hour bundles including blood cultures, antibiotics, and intravenous fluids, among others.

This article, from the New England Journal, looks retrospectively at the mortality rates associated with completion of these various elements. Stratified by time-to-completion following initiation of the 3-hour bundle within 6 hours of arrival to the Emergency Department, these authors looked at the mortality associations of the bundle elements.

Winners: obtaining blood cultures, administering antibiotics, and measuring serum lactate
Losers: time to completion of a bolus of intravenous fluids

Of course, since blood cultures are obtained prior to antibiotic administration, these outcomes are co-linear – and they don’t actually save lives, as facetiously suggested in the post heading. But, antibiotic administration was associated with a fraction of a percent of increased mortality per hour delay over the first 12 hours after initiation of the bundle. Intravenous fluid administration, however, showed no apparent association with mortality.

These data are fraught with issues, of course, relating to their retrospective nature and the limitations of the underlying data collection. Their adjusted model accounts for a handful of features, but there are still potential confounders influencing mortality of those who received their bundle completion within 3 hours as compared to those who did not.  The differences in mortality, while a hard and important endpoint, are quite small.  Earlier is probably better, but the individual magnitude of benefit will be unevenly distributed around the average benefit, and while a delay of several hours might matter, minutes probably do not.  The authors are appropriately reserved with their conclusions, however, only stating these observational data support associations between mortality and antibiotic administration, and do not extend to any causal inferences.

The lack of an association between intravenous fluids and mortality, however, raises significant questions requiring further prospective investigation. Could it be, after these years wandering in the wilderness with such aggressive protocols, the only universally key feature is the initiation of appropriate antibiotics? Do our intravenous fluids, given without regard to individual patient factors, simply harm as many as they help, resulting in no net benefit?

These questions will need to be addressed in randomized controlled trials before the next level of evolution in our approach to sepsis, but the equipoise for such trials may now exist – to complete our journey from Early Goal-Directed to Source Control and Patient-Centered.  The difficulty will be, again, in pushing back against well-meaning but ill-conceived quality measures whose net effect on Emergency Department resource utilization may be harm, with only small benefits to a subset of critically ill patients with sepsis.

“Time to Treatment and Mortality during Mandated Emergency Care for Sepsis”

Just the Cost of Doing Business

Good news, everyone!

In the past two decades, for virtually every specialty, the number of paid medical malpractice claims has decreased. Overall, for all specialties, the rate of payment has been halved, compared with the 1992-1996 timeframe. Neurosurgery, unfortunately, is still the “winner”, followed by plastic surgery, thoracic surgery, and obstetrics. The lowest rates were seen in psychiatry and pediatrics. Emergency medicine sits right in the middle, with 18.8 paid claims per 1,000 physician years.

The bad news, unfortunately, was that the claim amounts – including paid claims greater than $1 million – increased. Emergency Medicine paid claim amounts increased 26.1% to a mean of $314,052 in the most recent time period of analysis, an increase in line with the overall mean for all specialties. The largest jump in payout amounts was essentially a tie between dermatology, gastroenterology, pathology, and urology. Neurosurgery actually had one of the lowest payout increases – probably because they started from such lofty heights, already.

Types of malpractice alleged varied by specialty, with the expected variation between diagnostic error, surgical error, and treatment errors between the diagnostic and surgical specialties. Most (63.6%) of malpractice alleged in emergency medicine fell into alleged diagnostic error, while logically 73.3% of alleged error in plastic surgery fell under surgical error.

These data, from the National Practitioner Data Bank, only document payments made for written claims and do not include settlements or monies paid out by institutions. Whether these actually represent a friendlier environment for physicians, more aggressive approaches to settling claims, or a shifting of liability to corporate proxy is not clear. Regardless, even if it is a little of all three, the trend is probably moving in the right direction.

“Rates and Characteristics of Paid Malpractice Claims Among US Physicians by Specialty, 1992-2014”

Discharged and Dropped Dead

The Emergency Department is a land of uncertainty. Generally a time-compressed, zero-continuity environment with limited resources, we frequently need to make relatively rapid decisions based on incomplete information. The goal, in general, is to treat and disposition patients in an advantageous fashion to prevent morbidity and mortality, while minimizing the costs and other harms.

This confluence of factors leads, unfortunately, to a handful of patients who meet their end following discharge. A Kaiser Permanente Emergency Department cohort analysis found 0.05% died within 7 days of discharge, and identified a few interesting risk factors regarding their outcomes. This new article, in the BMJ, describes the outcomes of a Medicare cohort following discharge – and finds both similarities and differences.

One notable difference, and a focus of the authors, is that 0.12% of patients discharged from the Emergency Department died within 7 days. This is a much larger proportion than in the Kaiser cohort; however, the Medicare population is obviously a much older cohort with greater comorbidities. Then, they found similarities regarding the risks for death – most prominently, “altered mental status”. The full accounting of clinical features is described in the figure below:

Then, there were some system-level factors as well. Potentially, rural emergency departments and those with low annual volumes contributed in their multivariate model to increased risk of death. This data set is insufficient to draw any specific conclusions regarding these contributing factors, but it raises questions for future research. In general, however, these are interesting – and not terribly surprising – data, even if it is hard to identify specific operational interventions based on these broad strokes.

“Early death after discharge from emergency departments: analysis of national US insurance claims data”

Insight Is Insufficient

In this depressing trial, we witness a disheartening truth – physicians won’t necessarily do better, even if they know they’re not doing well.

This study tested a mixed educational and peer comparison intervention on primary care physicians in Switzerland, with an end goal of improving antibiotic stewardship for common ambulatory complaints. The “worst-performing” 2,900 physicians with respect to antibiotic prescribing rates were enrolled and randomized to the study intervention or none. The study intervention consisted of materials regarding appropriate prescribing, along with personalized feedback regarding where their prescribing rate ranked compared to the entire national cohort. The core of their hypothesis involved whether just this passive knowledge regarding their peer performance would exert normalizing influence over their practice.

Unfortunately, despite providing these physicians with this insight, as well as tools for improvement, the net effect of their intervention was effectively zero. There were some observations regarding changes in prescribing rates for certain age groups, and for certain types of antibiotics, but dredging through these secondary outcomes leads to only unreliable conclusions.

This is not particularly surprising data. These sorts of passive feedback mechanisms unhitched from material consequences have never previously been shown to be effective. There are other, more effective mechanisms – focused education, decision-support interventions, and shared decision-making – but, for a fragmented, national health system, this represented a relatively inexpensive model to test.

Try again!

“Personalized Prescription Feedback Using Routinely Collected Data to Reduce Antibiotic Use in Primary Care”

The Downside of Antibiotic Stewardship

There are many advantages to curtailing antibiotic prescribing. Costs are reduced, fewer antibiotic-resistant bacteria are induced, and treatment-associated adverse events are eliminated.

This retrospective, population-based study, however, illuminates the potential drawbacks. Using electronic record review spanning 10 years of general practice encounters, these authors compared infectious complication rates between practices with low and high antibiotic prescribing rates. Spanning 45.5 million person-years of follow-up after office visits for respiratory tract infections, there is both reason for reassurance and reason for further concern.

On the “pro” side, cases of mastoiditis, empyema, bacterial meningitis, intracranial abscess and Lemierre’s syndrome were no different between those who prescribed high rates (>58%) and those with low rates (<44%). However, there is a reasonably clear linear relationship with excess follow-up encounters for both pneumonia and peritonsillar abscess. Incidence rate ratios were 0.70 compared with reference for pneumonia and 0.78 compared with reference for peritonsillar abscess. However, the absolute differences can best be described as “large handful” and “small handful” of extra cases per 100,000 encounters.

There are many rough edges and flaws relating to these data, some of which are probably adequately defeated by the massive cohort size. I think it is reasonable to interpret this article as accurately reflecting true harms from antibiotic stewardship. More work should absolutely be pursued in terms of strategies to mitigate these potential downstream complications, but I believe the balance of benefits and harms still falls on the side of continued efforts in stewardship.

“Safety of reduced antibiotic prescribing for self limiting respiratory tract infections in primary care: cohort study using electronic health records”

How Many ED Visits are Truly Inappropriate?

I’ve seen quite a bit of feedback on social media regarding this research letter in JAMA Internal Medicine.

This study evaluated, using National Hospital Ambulatory Medical Care Survey data, the incidence of hospital admission stratified by triage Emergency Severity Index.  They analyzed 59,293 representative visits from the sample and found 7.5% of them, on a weighted basis, were categorized as “non-urgent” – an ESI level 5 or presumed equivalent.  The typical assumption regarding these non-urgent visits is they represent inappropriate Emergency Department utilization.  This study found, however:

“… a nontrivial proportion of ED visits that were deemed nonurgent arrived by ambulance, received diagnostic services, had procedures performed, and were admitted to the hospital, including to critical care units.”

There are always limitations regarding the NHAMCS data, particularly with missing and imputed data.  Based on this, I tend to feel these data lack face validity.  The weighted incidence of admission for non-urgent patients was 4.4% compared with 12.8% of urgent visits, while 0.7% of non-urgent visits were to critical care units compared with 1.3% of urgent visits.  I certainly do not see similar relative proportions of admission, and then to critical care, for level 5 patients in my multiple practice environments.
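The face-validity concern is easier to see when the quoted percentages are turned into ratios. The sketch below uses only the four weighted figures quoted above and assumes, as the wording suggests, that the critical care percentages are expressed as a share of all visits in each triage group; the variable names are illustrative.

```python
# Weighted proportions quoted in the text (share of visits in each group).
admit_nonurgent, admit_urgent = 0.044, 0.128
icu_nonurgent, icu_urgent = 0.007, 0.013

# How often non-urgent visits end in admission or critical care,
# relative to urgent visits.
relative_admission = admit_nonurgent / admit_urgent
relative_icu = icu_nonurgent / icu_urgent

# Assumption: critical care admissions are a subset of admissions, so
# this is the share of admitted patients who went to critical care.
icu_share_nonurgent = icu_nonurgent / admit_nonurgent
icu_share_urgent = icu_urgent / admit_urgent

print(f"Admission, non-urgent vs urgent:     {relative_admission:.2f}x")
print(f"Critical care, non-urgent vs urgent: {relative_icu:.2f}x")
print(f"Critical care share of admissions, non-urgent: {icu_share_nonurgent:.1%}")
print(f"Critical care share of admissions, urgent:     {icu_share_urgent:.1%}")
```

Taken at face value, non-urgent visits would be admitted about a third as often as urgent visits, yet roughly 16% of those admissions would go to critical care compared with roughly 10% for urgent visits.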

Regardless, the general implication made by these authors is probably reasonable, refuting usage of ESI triage level 5 to accurately represent inappropriate Emergency Department visits.  However, left equally unstated is an acknowledgement that ESI also fails to accurately categorize urgent visits – which ties to the rhetoric of trying to conflate “non-urgent” with “inappropriate” and “urgent” with “appropriate”.

ESI, as currently implemented, will not be a reliable tool for directing patients to other sources of care – but, with some fuzziness, probably still gives a reasonable estimate of the overall burden of inappropriate ED visits for some policy applications.

“Urgent Care Needs Among Nonurgent Visits to the Emergency Department”

Too Many Tests! Or, So We Believe ….

Yes, Virginia, we order too many tests.  And, we know it – as evidenced by such conferences on overdiagnosis and costs of care.  And, even more relevant than such academic exercises, as this study indicates, even the general clinician seems to have a fair bit of self-awareness.

In this survey consisting of 435 respondents, 85% of emergency physicians believed excessive testing occurred in their Emergency Department.  Most frequently, such testing was motivated by fear of missing even rare diagnoses, but defensive medicine and malpractice came a close second.  Patient expectations, local practice patterns, and time saving were also substantially cited as motivators for ordering.  Thankfully, administrative and personal motivations to increase reimbursement were rarely reported as reasons.

Despite the protestations of some policy-makers, the clinicians surveyed believed the most helpful change to the system would be malpractice reform.  Interestingly, the next ranked helpful interventions included educating patients and increasing shared decision-making.  While the first item may be logistically (or politically) unachievable, there are no obstacles to integrating improved communication behaviors into routine practice.  It does, however, show a need for increased availability of tools for clinicians to use at the point of care.

There are flaws in these sorts of perception-based surveys with regard to the accuracy of such anecdotal self-assessment.  Physician assessment of their own practice and that of others can certainly be questioned.  It must be admitted, however, a more intensive just-in-time surveying method would likely impact the variables measured.

There are also some highly entertaining outliers in Figure 2 – the perception of self vs. colleague ordering.  There is a handful of physicians who believe they, themselves, order over 80% of their CTs and MRIs unnecessarily – but that no one else in their group does.  Likewise, there is a handful with just the opposite perception – that their colleagues over-order, while they, themselves, rarely do.  I wonder if they work in the same department?

Regardless, the first step is admitting you have a problem.  We have many steps yet to go.

“Emergency Physician Perceptions of Medically Unnecessary Advanced Diagnostic Imaging”

Sometimes, The Stick Doesn’t Work

Pressure ulcers, catheter-associated UTIs, central-line infections, and injuries from falls are all iatrogenic injuries associated with healthcare and hospitalization.  Fewer of all these events would be ideal.

Of course, since asking nicely isn’t much of a motivation for healthcare delivery systems to improve practice, Medicare had a different solution – non-payment.  In 2008, Medicare ceased allowing hospitals to claim higher severity diagnosis related group codes to account for costs incurred by eight “never event” complications.  Money, on the other hand, is a strong motivator for change.  This study tries to evaluate just how successful such a heavy stick is at influencing care delivery.

These authors looked at the National Database of Nursing Quality Indicators, counting reported ulcers, falls, CLABSI, and CAUTI occurring between 2006 and 2010.  The trends reported for each differ starkly.  For CLABSI and CAUTI, in the quarters leading up to CMS policy change, the prevalence of each was gradually increasing.  After 2008, however, both trends show abrupt and consistent reversal and downward movement.  For pressure ulcers and injurious falls, however, the prevalence was gradually decreasing at the time of CMS policy implementation, and the slope of the line after 2008 is consistent with that same gradual decline.

The authors go into the limitations of each data source, but the general takeaway is likely still valid – some “never events” just aren’t consistently, systematically preventable.  There are concerted, teachable best-practices involved with decreasing CLABSI and CAUTI.  Fall prevention and pressure ulcer prevention, on the other hand, are less amenable to care bundles, and seem to depend on gradual cultural changes and vigilance.  Thus, outcomes-focused quality improvement using a financial motivator, while a reasonable method to try, will probably have the greatest impact and yield where a validated, evidence-based strategy can be implemented.

“Effect of Medicare’s Nonpayment for Hospital-Acquired Conditions: Lessons for Future Policy”

The Wholesale Revision of ACEP’s tPA Clinical Policy

ACEP has published a draft version of their new Clinical Policy statement regarding the use of IV tPA in acute ischemic stroke.  As before, the policy statement aims to answer the questions:

(1) Is IV tPA safe and effective for acute ischemic stroke patients if given within 3 hours of symptom onset?
(2) Is IV tPA safe and effective for acute ischemic stroke patients treated between 3 to 4.5 hours after symptom onset?

Most readers of this blog are familiar with the mild uproar the previous version caused, and this revision opens by stating “changes to the ACEP clinical policies development process have been implemented, the grading forms used to rate published research have continued to evolve, and newer research articles have been published.”  Left unsaid, in presumably a bit of diplomacy, were the conflicts of interest befouling the prior work.  Notably absent from this work is any involvement from the American Academy of Neurology.

What’s new, with a new methodology-focused rather than conflicted-expert-opinion approach?  Most obviously, there’s a new Level A recommendation – focused on the only consistent finding across all tPA trials: clinicians must consider a 7% incidence of symptomatic intracranial hemorrhage, compared with 1% in the placebo cohorts.  The previously Level A recommendation to treat within 3 hours has been downgraded to Level B.  Treatment up to 4.5 hours remains Level B.  Finally, a new Level C recommendation includes a consensus statement recommending shared decision-making between the patient and a member of the healthcare team regarding the potential benefits and harms.
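The hemorrhage figures in the Level A recommendation translate directly into an absolute risk increase and a number needed to harm. The sketch below is simple arithmetic on the 7% and 1% figures quoted above; the function name is illustrative, and the result is a pooled, group-level estimate rather than anything applicable to an individual patient.

```python
def number_needed_to_harm(risk_treated, risk_control):
    """Reciprocal of the absolute risk increase between two groups."""
    absolute_risk_increase = risk_treated - risk_control
    if absolute_risk_increase <= 0:
        raise ValueError("treated risk must exceed control risk")
    return 1 / absolute_risk_increase


# Figures quoted in the text: 7% sICH with tPA versus 1% with placebo.
sich_tpa, sich_placebo = 0.07, 0.01
ari = sich_tpa - sich_placebo
nnh = number_needed_to_harm(sich_tpa, sich_placebo)

print(f"Absolute risk increase: {ari:.1%}")
print(f"Number needed to harm:  {nnh:.1f}")
```

On these pooled figures, roughly one additional symptomatic intracranial hemorrhage would be expected for every 17 patients treated.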

Most of the reaction on Twitter has been, essentially, a declaration of victory.  And, in a sense, it is certainly a powerful statement regarding the ability for like-minded patient advocates and evidence purists to coalesce through alternative media and initiate a major change in policy.  To critique this new effort is a bit of punishing the good for lack of manifesting perfect, but there are a number of oddities worth providing feedback to the writing committee:

  • The authors provide a curious statement:  “The 2012 IV tPA clinical policy recommendation to ‘offer’ tPA to patients presenting with acute ischemic stroke within 3 hours of symptom onset was consistent with other national guidelines. Unfortunately, the essence of the term ‘offer’ may have been lost to readers and has therefore been avoided in this revision.”  I rather find “offer” a lovely term, in the sense it expresses a cooperative process for proceeding forward with a mutually agreed upon treatment strategy.  Rather than discard the term, clarification might have been reasonable.
  • They mention ATLANTIS as Class III evidence with regard to the 3-4.5 hour question.  I can see how its classification may be downgraded given the multiple protocol revisions.  That said, its inability to find a treatment benefit in spite of extensive sponsor involvement ought to be a more powerful negative weighting than currently acknowledged.  Given the biases favoring the treatment group in ECASS III (given a Class II evidence label), the cumulative evidence probably does not support a Level B recommendation for the 3-4.5 hour window.
  • One of my Australian colleagues in private communication brings up a small letter from Bradley Shy, previously covered on this blog, mentioning a statistical change to ECASS III.  This statement could acknowledge this post-publication correction and its implications regarding the aforementioned imbalance between groups.
  • The authors fail to acknowledge the heterogeneity of acute ischemic stroke syndromes and patient substrates, and the utter paucity of individualized risk or benefit assessment tools – in no small consequence of the small sample sizes of the few trials rated as Class I or Class II evidence.  This is a powerful platform with which to state clinical equipoise exists for continued placebo-controlled randomization.  As we see from the endovascular trials, the acute recanalization rate of IV tPA is as low as 40% – with many patients re-occluding following completion of the infusion.  Patients need to be selected less broadly with respect to likelihood of benefit compared with supportive care.  I believe tPA helps some patients, but it should be a goal to dramatically reduce the costs and collateral damage associated with rushing to treat mimics and patients without a favorable balance of risks and benefits.  For these authors to recommend treatment in “carefully selected patients” and “shared decision-making”, more guidance should be provided – and absent the evidence to support such guidance, they should be calling for more trials!

The comment period is open until March 13, 2015.

“Clinical Policy: Use of Intravenous tPA for the Management of Acute Ischemic Stroke in the Emergency Department DRAFT”

Addendum 01/18/2015:
The SAEM EBM interest group is compiling comments on the evidence for feedback to the SAEM board of directors.  These are my additional comments after having had additional time to digest:

  • I agree with sICH as a Level A recommendation.  Both RCTs and observational registries tend to support such a recommendation.  Whether the pooled risk estimates are usable in knowledge translation to individual patients is less clear.  The risk of sICH is highly variable depending on individual patient substrate.  There are several risk stratification instruments described in the literature, but none are specifically recommended/endorsed/prospectively validated in large populations.
  • It is uncertain regarding the NINDS data whether their intention is to present pooled Part 1 and Part 2.  The prior clinical policy used only Part 2 for their NNT calculation, giving rise to an NNT of 8 instead of 6.  It appears they are pooling the data from both parts here.  Either is fine as long as it’s explicitly stated – the primary outcome differed, but the enrollment and eligibility should have been the same.
  • ECASS seems to be missing from their evidentiary table.  The ECASS 3-hour cohort data is available as a secondary analysis.  However, such would probably be Class III data of no real consequence for the recommendation.
  • Level B is probably an acceptable level of recommendation for tPA within the 0-3 hour window.  “Moderate clinical certainty” is reasonable, mostly on the strength of the Class III data.  However, the “systems in place to safely administer the medication” is not clearly addressed in the text.  Most of the published clinical trial and observational evidence involves acute evaluation by stroke neurology.  Does the primary stroke center certification practically replicate the conditions in which patients were enrolled in these trials/registries?  Perhaps this should be split out into a separate recommendation regarding the required setting for safe/timely/accurate administration.
  • Level B is difficult to justify for the 3 to 4.5 hour time window.  There is Class II evidence from ECASS III (downgraded due to potential for bias) demonstrating a small benefit.  The authors then cite Class III trial evidence from IST-3 and ATLANTIS in which no benefit was demonstrated.  Then, they cite the individual patient meta-analysis having similar effect size to ECASS III – because many of the patients in that subgroup come from ECASS III.  Basically, there’s only a single piece of Class II evidence and then inconsistent Class III evidence, which doesn’t meet the criteria stated for a Level B recommendation (1 or more Class of Evidence II studies or strong consensus of Class of Evidence III studies).
  • With both Level B recommendations, the authors also reference “carefully selected” patients, but do not cite evidentiary basis regarding how to select said patients other than listing the enrollment criteria of trials.  If the “careful selection” is strict NINDS or ECASS III criteria, this should be explicitly stated in the recommendation.
  • The Level C recommendations to have shared decision-making with patients and surrogates ought to be obvious standard medical practice, but I suppose it bears repeating given the publications regarding implied consent for tPA.  They mention two publications regarding review and development of such tools, but there is no evidence supporting their efficacy or effectiveness in use.  Frankly, calling them a starting point in such a heterogeneous population is along the lines of the broken clock that’s right twice a day.  I would rather say their dependence on group-level data minimizes their practical utility, and clinician expertise will be the best tool for individual patient risk assessment.

Feel free to add your comments and I will incorporate them into my feedback to SAEM.