Dexamethasone Dilemma

Look! On Twitter! Two highly respected medical minds taking the same trial publication and producing two very different responses:

The controversy stems from a small study examining the relatively common practice of treating pharyngitis with an oral steroid – usually dexamethasone – for its anti-inflammatory effect. Most pharyngitis does not require antibiotics, and physicians understandably prefer to try something to provide relief from suffering.

This study enrolled 576 patients in a randomized, placebo-controlled, double-blind trial in an outpatient general practice setting. Patients were provided either 10mg of oral dexamethasone or an identical lactose placebo. Patients could only enter into the trial if immediate antibiotics were not prescribed, but physicians were allowed to give a “delayed” prescription for failure to improve.

The trial is statistically negative for the primary outcome, complete resolution of sore throat at 24 hours. Of those assigned to dexamethasone, 22.6% had complete symptom resolution at 24 hours, compared with 17.7% of placebo, an absolute risk difference of 4.7% (-1.8 to 11.2)[sic]. The effect size is slightly larger at 48 hours, 8.7%, which does reach statistical significance – and thus the NNT noted above by Ian Stiell. Nearly all the other secondary outcomes – resource utilization, subsequent antibiotic use, use of pain relief – favor dexamethasone, but generally range in effect size between 1-4%.
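For anyone wanting to check the arithmetic behind an NNT, here is a minimal sketch. It assumes the usual convention of taking the reciprocal of the absolute risk difference and rounding up to a whole patient; the function name is purely illustrative, and the two inputs are the risk differences quoted above.

```python
import math


def number_needed_to_treat(risk_difference_percent: float) -> int:
    """Reciprocal of the absolute risk difference, rounded up to a whole patient."""
    if risk_difference_percent <= 0:
        raise ValueError("NNT is only defined for a positive absolute benefit")
    return math.ceil(100.0 / risk_difference_percent)


# 48-hour difference reported in the trial (statistically significant)
nnt_48h = number_needed_to_treat(8.7)
# 24-hour point estimate (confidence interval crosses zero, so treat with caution)
nnt_24h = number_needed_to_treat(4.7)

print(f"NNT at 48 hours: {nnt_48h}")
print(f"NNT at 24 hours (point estimate only): {nnt_24h}")
```

Since the 24-hour interval includes zero, an NNT built from that point estimate is not a well-defined single number, which is part of why the two readings of this trial diverge.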

Does the failure to meet statistical significance for the primary outcome refute this therapy as effective? Not hardly – but it certainly calls into question whether the difference is reproducible or clinically meaningful. Plug these data into Ioannidis’ framework regarding the reliability of research findings, and we see this is precisely the sort of work where both conclusions are reasonable. Is there a signal for a symptomatic benefit? Absolutely. The strength of the signal, however, is not strong enough to overcome whatever pre-study odds you placed on the treatment being successful. If, like many, you feel this is a treatment likely to be beneficial, this study appears confirmatory. If, like many, you feel systemic steroids for symptomatic pharyngitis are inane, this study does little to change your view of the inadequate risk/benefit ratio.
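To make the pre-study odds argument concrete, here is a rough sketch of the simplest form of Ioannidis' calculation: the post-study probability that a nominally significant finding is true, ignoring bias. The pre-study odds for the "optimist" and the "skeptic", and the 80% power figure, are assumptions chosen purely for illustration and are not taken from the trial.

```python
def post_study_probability(pre_study_odds: float,
                           alpha: float = 0.05,
                           power: float = 0.80) -> float:
    """PPV = (1 - beta) * R / (R - beta * R + alpha), with R the pre-study odds.

    This is the bias-free version of the formula; power = 1 - beta.
    """
    true_positive_weight = power * pre_study_odds
    false_positive_weight = alpha
    return true_positive_weight / (true_positive_weight + false_positive_weight)


# Hypothetical readers: one thinks benefit is a coin flip (odds 1:1),
# the other thinks it is a long shot (odds 1:10). Both values are assumptions.
optimist = post_study_probability(pre_study_odds=1.0)
skeptic = post_study_probability(pre_study_odds=0.1)

print(f"Optimist's post-study probability: {optimist:.2f}")
print(f"Skeptic's post-study probability:  {skeptic:.2f}")
```

The same significant secondary result leaves the two readers in rather different places, and a lower assumed power or any allowance for bias would pull both numbers down further.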

Another possible interpretation of these data is the possibility of variable effects within subgroups, where the entire small effect size seen in these data results from a more substantial effect size in some fraction of the cohort. For example, the mean duration of symptoms was ~3.9 days, with a SD of ~1.7 days. Could the recency of symptoms be associated with likelihood of benefit? Any secondary analyses such as these, particularly in a small trial like this, would only serve as fodder for future investigations.

I have seen, however, other folks using this as an opportunity to link to the recent BMJ publication regarding adverse events and corticosteroid exposures. Without delving into that publication in detail, it would be a mistake to generalize those data to this population. That said, systemic corticosteroids are certainly not harmless. These authors rather ludicrously state “Short courses of oral steroids have been shown to be safe, in the absence of contraindications” – justified by a citation from 1982.

The final answer is somewhere in between our two friends above. Dexamethasone will help some patients with symptom relief from pharyngitis, and it will harm some. Teasing out a prediction of the optimal risk/benefit for a patient is substantially challenging – and wide practice variation is justifiable from these data, as long as the uncertainty in the evidence base is acknowledged.

“Effect of Oral Dexamethasone Without Immediate Antibiotics vs Placebo on Acute Sore Throat in Adults”
http://jamanetwork.com/journals/jama/fullarticle/2618622

PECARN, CATCH, CHALICE … or None of the Above?

The decision instrument used to determine the need for neuroimaging in minor head trauma is essentially a question of location. If you’re in the U.S., the guidelines feature PECARN. In Canada, CATCH. In the U.K., CHALICE. But, there’s a whole big world out there – what ought they use?

This is a prospective observational study from two countries out in that big remainder of the world – Australia and New Zealand. Over approximately 3.5 years, these authors enrolled patients with non-trivial mild head injuries (GCS 13-15) and tabulated various rule criteria and outcomes. Each rule has slightly different entry criteria and purpose, but over the course of the study, 20,317 patients were gathered for their comparative analysis.

And, the winner … is Australian and New Zealand general practice. Of these 20,000 patients included, only 2,106 (10%) underwent CT. It is hard to read between the lines and determine how many of the injuries included in this analysis were missed on the initial presentation, but if rate of neuroimaging is the simplest criterion for winning, there’s no competition. Applying CHALICE to their analysis cohort would have increased their CT rate to approximately 22%, and CATCH would raise the rate to 30.2%. Application of PECARN would place 46% of the cohort into CT vs. observation – an uncertain range, but certainly higher than 10%.

Regardless, in their stated comparison, the true winner depends on the value-weighting of sensitivity and resource utilization. PECARN approached 99 to 100% sensitivity, missing only 1 patient with clinically important traumatic brain injury out of ~10,000. Contrariwise, CATCH and CHALICE missed 13 and 12 out of ~13,000 and ~14,000, respectively. Most of these did not undergo neurosurgical intervention, but a couple missed by CHALICE and CATCH did. However, as noted above, PECARN is probably substantially less specific than both CATCH and CHALICE, which has a relatively profound effect on utilization for a low-frequency outcome.

Ultimately, however, any of these decision instruments is usable – as a supplement to your clinical reasoning. Each of these rules simplifies a complex decision into one less so, with all its inherent weaknesses. Fewer than 1% of children with mild head injury need neurosurgical intervention and these are certainly rarely missed by any typical practice. In settings with high CT utilization rates, any one of these instruments will likely prove beneficial. In Australia and New Zealand – as well as many other places around the world – potentially not so much.  This is probably a fine example of the need to compare decision instruments to clinician gestalt.

“Accuracy of PECARN, CATCH, and CHALICE head injury decision rules in children: a prospective cohort study”

http://www.thelancet.com/journals/lancet/article/PIIS0140-6736(17)30555-X/abstract

Leave the Blood at Home?

In severely injured multi-system trauma patients, the gold standard for volume replacement is blood – in a relatively balanced ratio between PRBCs, plasma, and platelets. Match this need for blood with the conceptual “golden hour” for acute resuscitation, and it is reasonable to hypothesize there might be added benefit to providing blood products as early as feasible – including during emergency transport. Many of the most critically injured patients with time delays to a trauma center require aeromedical evacuation, so blood products on the helicopter may be ideal.

Sounds good, but the outcomes here are unfortunately not.

This is an observational report from nine trauma systems utilizing aeromedical transport, five of whose helicopters carried blood products and four of whose carried only crystalloid. There were 25,118 patients during the study period, 2,341 of whom were transported by helicopter, and 1,058 of whom met “high risk” criteria. Approximately half of these were transported with blood products available, and 142 (24%) of those received transfusion.

Unfortunately, there were vast differences and great heterogeneity between the groups with and without blood products available, including GCS, ISS, and “prehospital lifesaving interventions”. There were similarly profound differences between those receiving blood and those not. The unadjusted mortality outcomes generally followed lower GCS and worse ISS, as one would expect. The authors then attempted a propensity-match analysis to dredge some signal from their data, but only 10% of their cohort could be parsed by their matching algorithm. Owing to only this small sample and the statistical techniques, no reliable difference in outcomes can be demonstrated.

The authors ultimately suggest a multicenter randomized trial will be required to adequately test whether the availability of blood has any mortality benefit. This is clearly the best strategy to improve our answer to this question, although it is prudent to recall non-obvious effect sizes in observational data potentially suggest only a very small magnitude of beneficial effect, if any. This must then be weighed against the important wastage of limited transfusion resources, which would require a non-trivial improvement in outcomes.

“Multicenter Observational Prehospital Resuscitation on Helicopter Study (PROHS)”

https://www.ncbi.nlm.nih.gov/pubmed/28383476

Just the Cost of Doing Business

Good news, everyone!

In the past two decades, for virtually every specialty, the number of paid medical malpractice claims has decreased. Overall, for all specialties, the rate of payment has been halved, compared with the 1992-1996 timeframe. Neurosurgery, unfortunately, is still the “winner”, followed by plastic surgery, thoracic surgery, and obstetrics. The lowest rates were seen in psychiatry and pediatrics. Emergency medicine sits right in the middle, with 18.8 paid claims per 1,000 physician years.

The bad news, unfortunately, was that the claim amounts – including paid claims greater than $1 million – increased. Emergency Medicine paid claim amounts increased 26.1% to a mean of $314,052 in the most recent time period of analysis, an increase in line with the overall mean for all specialties. The largest jump in payout amounts was essentially a tie between dermatology, gastroenterology, pathology, and urology. Neurosurgery actually had one of the lowest payout increases – probably because they started from such lofty heights, already.

Types of malpractice alleged varied by specialty, with the expected split among diagnostic, surgical, and treatment errors differing between the diagnostic and surgical specialties. Most (63.6%) of the malpractice alleged in emergency medicine fell into alleged diagnostic error, while, logically, 73.3% of alleged error in plastic surgery fell under surgical error.

These data, from the National Practitioner Data Bank, only document payments made for written claims and do not include settlements or monies paid out by institutions. Whether these actually represent a friendlier environment for physicians, more aggressive approaches to settling claims, or a shifting of liability to corporate proxy is not clear. Regardless, even if it is a little of all three, the trend is probably moving in the right direction.

“Rates and Characteristics of Paid Malpractice Claims Among US Physicians by Specialty, 1992-2014”

http://jamanetwork.com/journals/jamainternalmedicine/article-abstract/2612118

Stem Cells for Stroke Redux

A few months ago, folks at Stanford were claiming miraculous recoveries after implanting stem cells directly into patients’ brains at the site of injury. An interesting concept, to be certain.

Now we have “stem cells lite”, or, at least, the slightly-fewer-holes-in-the-skull version – and it’s apparently just as miraculous.

This is a Phase 2 double-blinded dose-escalation study evaluating treatment with intravenous multipotent adult progenitor cells, with treatment initiated between 24 and 48 hours. Their trial design reflects the nature of a Phase 2 trial, with three cohorts, unbalanced allocation, and dosing differences between groups, but is otherwise fairly straightforward. Until you get to the primary outcome:

“The primary efficacy outcome was the multivariate global stroke recovery at day 90, which assesses global disability, neurological deficit, and activities of daily living and consists of mRS 2 or less; NIHSS total score improvement of 75% or more from baseline; and Barthel index of 95 or more in the multipotent adult progenitor cells treatment group, compared with the placebo treatment.”

Which is to say, they’ve conjured up their own unique black-box composite primary outcome – an outcome they changed midway through the trial.

Why would you need to change the primary efficacy outcome in 2014 for a study that started in 2011? The obvious implication is the results were unfavorable – and, the cursory review of their results table suggests this is a reasonable stance to take.

These authors screened 160 patients at several different sites for eligibility and ultimately randomized 129. Of these, three did not receive the allocated intervention – leaving the remainder for analysis. Patients in each group were generally similar based on NIHSS, time until infusion, and stroke interventions. Sticking to traditional outcomes measured by stroke trials, there was no difference between groups: mRS ≤2 in 37% of the intervention group and 36% of the placebo.  However:

“exploratory analyses suggested an increase in excellent outcome in the multipotent adult progenitor cells arms in the ITT population, and a beneficial clinical effect on long-term 1 year disability.”

This “excellent” outcome is the product of the midstream outcome change combined with their post-hoc data dredging for a feasible positive finding – a combination of patients with mRS ≤1, a NIHSS ≤1, and a Barthel Index ≥95. Then, the bulk of their analysis is further restricted to one year outcomes of those who received their stem cells within 36 hours from stroke onset. With such an obvious “beneficial clinical effect”, is there any question regarding the role of the funding source?

“The funder of the study was involved in study design and in data interpretation. All data collection and analysis were overseen by Medpace. One employee of the funder (RWM) was represented on the writing committee.”

and:

“DCH received grants from Athersys, payments to his university from Medpace for patient enrolment, has a patent on the MultiStem cells through his university and has received licensing revenue through his university. LRW received grants from SanBio and Athersys, and personal fees from SanBio. GAF is a consultant for Athersys; received personal fees from Medpace; and payment from Medpace to his institution for study costs. SS received grants from Athersys. SIS received grants from Athersys, and consulting fees that were paid to the institution from Mesoblast, Aldagen, and Celgene. CAS received grants from Athersys. DC received grants from Athersys.”

The likelihood these results are valid, reproducible, and have a clinically meaningful effect size is nearly zero – but that certainly won’t stop them from throwing good money after bad.

“Safety and efficacy of multipotent adult progenitor cells in acute ischaemic stroke (MASTERS): a randomised, double-blind, placebo-controlled, phase 2 trial”
https://www.ncbi.nlm.nih.gov/pubmed/28320635

D-Dimer, It’s Not Just a Cut-Off

It’s certainly simpler to have a world where everything is black or white, right or wrong, positive or negative. Once upon a time, positive cardiac biomarkers meant acute coronary syndrome – now we have more information and shades of grey in between. The D-dimer, bless its heart, is probably like that, too.

This is a simple study that pooled patients from five pulmonary embolism studies to evaluate the diagnostic performance characteristics of the D-dimer assay. Conventional usage is simply to deploy the test as a dichotomous rule-out – a value below our set sensitivity threshold obviates further testing, while above consigns us to the bitter radiologic conclusion. These authors, perhaps anticipating a more sophisticated diagnostic strategy, go about trying to calculate interval likelihood ratios for the test.

Using over 6,000 patients as their substrate for analysis, these authors determine the various likelihood ratios for D-dimer levels between 250 ng/mL and greater than 5,000 ng/mL, and identify intervals of gradually increasing width, starting at 250 and building up to 2,500. Based on logistic regression modeling, the fitted and approximate iLRs range from 0.0625 for those with D-dimer less than 250 ng/mL to 8 for levels greater than 5,000. Interestingly, a D-dimer between 1,000 and 1,499 had an iLR of roughly 1 – meaning those values basically have no effect on the post-test likelihood of PE.
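As a rough illustration of how an interval likelihood ratio would actually be used, the sketch below converts a pretest probability to odds, multiplies by the iLR, and converts back. The three iLR values are the ones quoted above; the 15% pretest probability is a hypothetical figure chosen only for the example, not something reported in the paper.

```python
def post_test_probability(pretest_probability: float,
                          likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: post-test odds = pretest odds * likelihood ratio."""
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical pretest probability of PE (an assumption for illustration)
pretest = 0.15

# iLR values quoted in the text for three D-dimer intervals
interval_lrs = {
    "< 250 ng/mL": 0.0625,
    "1,000-1,499 ng/mL": 1.0,
    "> 5,000 ng/mL": 8.0,
}

for interval, ilr in interval_lrs.items():
    probability = post_test_probability(pretest, ilr)
    print(f"D-dimer {interval}: post-test probability {probability:.1%}")
```

The middle interval returns the pretest probability unchanged, which is the "basically no effect" point in numerical form.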

The general implication of these data would be to inform more precise accounting of the risk for PE involving the decision to proceed to CTPA. That said, with our generally inexact tools for otherwise estimating pretest likelihood of disease (Wells, Geneva, gestalt), these data are probably not quite ready for clinical use. I expect further research to develop more sophisticated individual risk prediction models, for which these likelihood ratios may be of value.

“D-Dimer Interval Likelihood Ratios for Pulmonary Embolism”
https://www.ncbi.nlm.nih.gov/pubmed/28370759

February EM Lit of Note Audio Digest

The audio digest for the EMLoN posts from February.

Better late than never!  This is what happens when you make new babies – delays and more delays.

Update April 2nd: Link to current episode fixed!  Sorry!

All Aboard the tPA Hype Bus

Indiscriminate use of tPA in those with undifferentiated stroke is a low-value proposition – even if you find the evidence reliable. The utility of tPA for stroke depends on anatomy, time, and tissue status – information the traditional non-contrast head CT does not usually provide. Unfortunately, one of the latest “innovations” in stroke care is simply to do this useless test faster – in a bus, down by the river.

This is the PHAST project out of Cleveland, which, like similar efforts in Berlin, Chattanooga, and Houston, puts a CT scan machine in an oversized ambulance. Many of the initial phases of these projects included a stroke physician physically in the vehicle – but this, as you would expect, takes advantage of telemedicine technology to provide consultation from afar.

The stated hypothesis of this project is “that the MSTU will allow significant reductions in time to evaluation and treatment of patients when compared to a traditional ambulance model in an American urban environment”, which is just mind-numbingly infantile. Of course, pre-hospital administration will be faster than in-house thrombolysis – the interesting data would be with regard to safety and misdiagnosis.

This report is of the first 100 patients evaluated – generated by 317 system alerts. Of these, 33 were given a preliminary diagnosis of probable stroke, 30 possible stroke, 4 transient ischemic attacks, 5 intracerebral hemorrhages, and 28 non-cerebrovascular. Of the 33 probable strokes, 16 received thrombolysis – and, by most of their various metrics, care was accelerated by 20-40 minutes. And, then, no outcomes, safety, or follow-up data are presented – apparently we are simply supposed to operate under the assumption this resource outlay and rush to provide the substrate for potential tPA administration is obviously prudent and effective care.

Probably the only interesting tidbit from this paper was with regard to one of the cases of ICH diagnosed by CT in the prehospital setting. One patient was identified as taking anticoagulation, and prothrombin complex concentrate was initiated in the pre-hospital setting. The timeliness of anticoagulation reversal is almost certainly beneficial, although the magnitude of effect for the few minutes saved is uncertain.

“Reduction in time to treatment in prehospital telemedicine evaluation and thrombolysis”

http://www.neurology.org/content/early/2017/03/08/WNL.0000000000003786.abstract

Vitamin C for Sepsis

This is just a quick post in response to a tweet – and hype-machine press-release – making the rounds today.

This covers a before-and-after study regarding a single-center practice change in an intensive care unit where their approach to severe sepsis was altered to a protocol including intravenous high-dose vitamin C (1.5g q6), intravenous thiamine (200mg q12), and hydrocortisone (50mg q6). Essentially, this institution hypothesized this combination might have beneficial physiologic effects and, after witnessing initial anecdotal improvement, switched to this aforementioned protocol. This report describes their outcomes in the context of comparing the treatment group to similar patients treated in the seven months prior.

In-hospital mortality for patients treated on the new protocol was 8.5%, whereas previously treated patients were subject to 40.4% mortality. Vasopressor use and acute kidney injury were similarly curtailed in the treatment group. That said, these miraculous findings – as they are extolled in the EVMS press release – can only be considered worthy of further study at this point. With a mere 47 patients in each treatment group, a non-randomized, before-and-after design, and other susceptibilities to bias, these findings must be prospectively confirmed before adoption. When considered in the context of Ioannidis’ “Why Most Published Research Findings Are False”, caution is certainly advised.

I sincerely hope prospective, external validation will yield similar findings – but will likewise not be surprised if they do not.

“Hydrocortisone, Vitamin C and Thiamine for the Treatment of Severe Sepsis and Septic Shock: A Retrospective Before-After Study”
https://www.ncbi.nlm.nih.gov/pubmed/27940189

Making Urine Cultures Great Again

As this blog covered earlier this month, the diagnosis of urinary tract infection – as common and pervasive as it might be – is still fraught with diagnostic uncertainty and inconclusive likelihood ratios. In practice, clinicians combine pretest likelihood, subjective symptoms, and the urinalysis to make a decision regarding treatment – and invariably err on the side of over-treatment.

This is an interesting study taking place in the Nationwide Children’s Hospital network regarding their use of urine cultures. In retrospect, these authors noted only half of patients initially diagnosed with UTI had the diagnosis ultimately confirmed by contemporaneous urine culture. Their intervention, then, in order to reduce harm from adverse effects of antibiotics, was to contact patients following a negative urine culture result and request antibiotics be stopped.

This tied into an entire quality-improvement procedure simply to use the electronic health record to accurately follow up the urine cultures, but over the course of the intervention, 910 patients met inclusion criteria. These patients were prescribed a total of 8,648 days of antibiotics, and the intervention obviated 3,429 (40%) of those days. Owing to increasing uptake of the study intervention by clinicians, the rate of antibiotic obviation had reached 61% by the end of the study period.
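For a sense of scale per patient, the reported totals can be broken down with some back-of-the-envelope arithmetic. The sketch below uses only the three counts quoted above; the per-patient averages are simple divisions across the whole cohort and are not figures reported by the authors.

```python
# Totals as quoted from the study
patients = 910
antibiotic_days_prescribed = 8648
antibiotic_days_avoided = 3429

# Share of prescribed antibiotic days that the intervention obviated
fraction_avoided = antibiotic_days_avoided / antibiotic_days_prescribed

# Crude cohort-wide averages (an illustration, not a reported result)
prescribed_per_patient = antibiotic_days_prescribed / patients
avoided_per_patient = antibiotic_days_avoided / patients

print(f"Antibiotic days avoided: {fraction_avoided:.1%}")
print(f"Mean days prescribed per patient: {prescribed_per_patient:.1f}")
print(f"Mean days avoided per patient: {avoided_per_patient:.1f}")
```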

There are some obvious flaws in this sort of retrospective reporting on a QI intervention, as there was no reliable follow-up of patients included. The authors report no patients were subsequently diagnosed with a UTI within 14 days of being contacted, but this is based on only 46 patients who subsequently sought care within their healthcare system within 14 days, and not any comprehensive follow-up contact. There is no verification of antibiotics actually being discontinued following contact. Then, finally, antibiotic-free days are only a surrogate for a reduction in the suspected adverse events associated with their administration.

All that said, this probably represents reasonable practice. Considering the immense frequency with which urine cultures are sent and antibiotics prescribed for dysuria, the magnitude of effect witnessed here suggests a potentially huge decrease in exposure to unnecessary antibiotics.

“Urine Culture Follow-up and Antimicrobial Stewardship in a Pediatric Urgent Care Network”
http://pediatrics.aappublications.org/content/early/2017/03/14/peds.2016-2103