Questioning the Benefit of Non-Invasive Testing for Chest Pain

Welcome to the fascinating world of instrumental variable analysis!

This is a retrospective cohort analysis of a large insurance claims database attempting to glean insight into the value of non-invasive testing for patients presenting to the Emergency Department with chest pain. Previous versions of the American Heart Association guidelines for the evaluation of so-called “low risk” chest pain have encouraged patients to undergo some sort of objective testing within 72 hours of initial evaluation. These recommendations have waned in more recent iterations of the guideline, but many settings still routinely recommend admission and observation following an episode of chest pain.

These authors used a cohort of 926,633 unique admissions for chest pain and analyzed them to evaluate any downstream effects on subsequent morbidity and resource utilization. As part of this analysis, they also split the cohort into two groups for comparison based on the day of the week of presentation – hence the “instrumental variable” for the instrumental variable analysis performed alongside their multivariate analysis. The authors assumed individual patient characteristics would be unrelated to the day of presentation, but that downstream test frequency would differ. The authors then used this difference in test frequency to thread the eye of the needle as a pseudo-randomization component to aid in comparison.

There were 571,988 patients presenting on a weekday, 18.1% and 26.1% of whom underwent some non-invasive testing within 2 and 30 days of an ED visit, respectively. Then, there were 354,645 patients presenting on a weekend, with rates of testing of 12.3% and 21.3%. There were obvious baseline differences between those undergoing testing and those who did not, and those were controlled for using multivariate techniques as well as the aforementioned instrumental variable analysis.
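For readers new to the technique, the simplest form of an instrumental variable estimate (the Wald, or ratio, estimator) can be sketched in a few lines. This is only an illustration of the idea and not necessarily the model the authors fit. The 2-day testing rates come from the figures above; the two outcome rates are hypothetical numbers invented purely to make the arithmetic concrete.

```python
def wald_estimate(outcome_weekday, outcome_weekend, tested_weekday, tested_weekend):
    """Wald (ratio) instrumental variable estimate for a binary instrument.

    Divides the difference in outcome rates between the two instrument groups
    by the difference in testing rates between the same groups.
    """
    difference_in_testing = tested_weekday - tested_weekend
    if difference_in_testing == 0:
        raise ValueError("The instrument does not change testing rates.")
    return (outcome_weekday - outcome_weekend) / difference_in_testing


# Testing rates within 2 days of the ED visit, as quoted in the post.
TESTED_WEEKDAY = 0.181
TESTED_WEEKEND = 0.123

# HYPOTHETICAL one-year revascularization rates, invented for illustration only.
HYPOTHETICAL_REVASC_WEEKDAY = 0.0300
HYPOTHETICAL_REVASC_WEEKEND = 0.0271

effect = wald_estimate(
    HYPOTHETICAL_REVASC_WEEKDAY,
    HYPOTHETICAL_REVASC_WEEKEND,
    TESTED_WEEKDAY,
    TESTED_WEEKEND,
)
print(f"Estimated extra revascularizations per additional patient tested: {effect:.3f}")
```

The estimate is only as good as the assumption that day of presentation affects outcomes solely through the chance of being tested.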

Looking at clinical outcomes – coronary revascularization and acute MI at one year – there were mixed results: definitely more revascularization procedures associated with exposure to non-invasive testing, but no decrease in downstream diagnosis of AMI. The trend, if any, is actually towards increased diagnoses of AMI. The absolute numbers are quite small, on the order of a handful of extra AMIs per 1,000 patients per year, and may reflect either the complications resulting from stenting or a propensity to receive different clinical diagnoses for similar presentations after receiving a coronary stent. Or, owing to the nature of the analysis, the trend may simply be noise.

The level of evidence here is not high, considering its retrospective nature and dependence on statistical adjustments.  It also cannot determine whether there are longer-term consequences or benefits beyond its one-year follow-up time-frame. Its primary value is in the context of the larger body of evidence.  At the least, it suggests we have equipoise to examine which, if any, patients ought to be referred for routine follow-up – or whether the role of the ED should be limited to ruling out an acute coronary syndrome, and the downstream medical ecosystem is the most appropriate venue for determining further testing when indicated.

“Cardiovascular Testing and Clinical Outcomes in Emergency Department Patients With Chest Pain”

http://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2633257

Ottawa, the Land of Rules

I’ve been to Canada, but I’ve never been to Ottawa. I suppose, as the capital of Canada, it makes sense they’d be enamored with rules and rule-making. Regardless, it still seems they have a disproportionate burden of rules, for better or worse.

This latest publication describes the “Ottawa Chest Pain Cardiac Monitoring Rule”, which aims to diminish resource utilization in the setting of chest pain in the Emergency Department. These authors posit the majority of chest pain patients presenting to the ED are placed on cardiac monitoring in the interests of detecting a life-threatening malignant arrhythmia, despite such being a rare occurrence. Furthermore, the literature regarding alert fatigue demonstrates greater than 99% of monitor alarms are erroneous and typically ignored.

Using a sample of 796 chest pain patients receiving cardiac monitoring, these authors validate their previously described rule for avoiding cardiac monitoring: chest pain free and normal or non-specific ECG changes. In this sample, 284 patients met these criteria, and none of them suffered an arrhythmia requiring intervention.

While this represents 100% sensitivity for their rule, as a resource utilization intervention, there is obviously room for improvement. Of patients not meeting their rule, only 2.9% of this remainder suffered an arrhythmia – mostly just atrial fibrillation requiring pharmacologic rate or rhythm control. These criteria probably ought be considered just a minimum standard, and there is plenty of room for additional exclusion.
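To put the "room for improvement" in numbers, here is a rough back-of-the-envelope sketch using only the figures quoted above. The arrhythmia count is back-calculated from the 2.9% figure and rounded, so treat it as an approximation rather than a number taken from the paper.

```python
TOTAL_PATIENTS = 796
MET_RULE = 284                       # could come off the monitor; no arrhythmias
ARRHYTHMIA_RATE_IN_REMAINDER = 0.029 # quoted rate among patients not meeting the rule

still_monitored = TOTAL_PATIENTS - MET_RULE
fraction_spared = MET_RULE / TOTAL_PATIENTS

# Back-calculated, rounded estimate of arrhythmias among the remainder.
approx_arrhythmias = round(still_monitored * ARRHYTHMIA_RATE_IN_REMAINDER)
monitored_without_event = still_monitored - approx_arrhythmias

print(f"Spared monitoring by the rule: {fraction_spared:.1%}")
print(f"Still monitored: {still_monitored}, of whom roughly "
      f"{monitored_without_event} had no arrhythmia")
```

In other words, the rule as written removes only about a third of patients from monitoring, while the large majority of those left on the monitor still have no event.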

Anecdotally, not only do most of the chest pain patients in my practice not receive monitoring – many receive their entire work-up in the waiting room!

“Prospective validation of a clinical decision rule to identify patients presenting to the emergency department with chest pain who can safely be removed from cardiac monitoring”
http://www.cmaj.ca/content/189/4/E139.full

A qSOFA Trifecta

There’s a new sepsis in town – although it’s not very “new” anymore. We’re supposedly all-in on Sepsis-3, which in theory is superior to the old sepsis.

One of the most prominent and controversial aspects of the sepsis reimagining is the discarding of the flawed Systemic Inflammatory Response Syndrome criteria and its replacement with the Quick Sequential Organ Failure Assessment. In theory, qSOFA replaces the non-specific items from SIRS with physiologic variables more closely related to organ failure. However, qSOFA was never prospectively validated or compared prior to its introduction.

These three articles give us a little more insight – and, as many have voiced concern already, it appears we’ve just replaced one flawed agent with another.

The first article, from JAMA, describes the performance of qSOFA against SIRS and a 2-point increase in the full SOFA score in an ICU population. This retrospective analysis of 184,875 patients across 15 years of registry data from 182 ICUs in Australia and New Zealand showed very little difference between SIRS and qSOFA with regard to predicting in-hospital mortality. Both screening tools were also far inferior to the full SOFA score – although, in practical terms, the differences in adjusted AUC were only between ~0.69 for SIRS and qSOFA and 0.76 for SOFA. As prognostic tools, then, none of these are fantastic – and, unfortunately, qSOFA did not seem to offer any value over SIRS.

The second article, also from JAMA, is some of the first prospective data regarding qSOFA in the Emergency Department. This sample is 879 patients with suspected infection, followed for in-hospital mortality or ICU admission. The big news from this article is the AUC for qSOFA of 0.80 compared with the 0.65 for SIRS or “severe sepsis”, as defined by SIRS plus a lactate greater than 2mmol/L. However, at a cut-off of 2 or more for qSOFA, the advertised cut-off for “high risk”, the sensitivity and specificity were 70% and 79% respectively.

Finally, a third article, from Annals of Emergency Medicine, also evaluates the performance characteristics of qSOFA in an Emergency Department population. This retrospective evaluation describes the performance of qSOFA at predicting admission and mortality, but differs from the JAMA article by applying qSOFA to a cross-section of mostly high-acuity visits, both with and without suspected infection. Based on a sample of 22,350 ED visits, they found similar sensitivity and specificity of a qSOFA score of 2 or greater for predicting mortality, 71% and 74%, respectively. Performance was not meaningfully different between those with and without infection.

It seems pretty clear, then, this score doesn’t hold a lot of value. SIRS, obviously, has its well-documented flaws. qSOFA seems to have better discriminatory value with regard to the AUC, but its performance at the cut-off level of 2 puts it right in a no-man’s land of clinical utility. It is not sensitive enough to rely upon to capture all patients at high-risk for deterioration – but, then, its specificity is also poor enough that using it to screen the general ED population will still result in a flood of false positives.
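The "flood of false positives" point is easy to see with a little arithmetic. The sketch below plugs the prospective ED study's sensitivity and specificity (70% and 79%) into the standard predictive value formulas. The 5% prevalence of the outcome is a hypothetical figure chosen purely for illustration, not a number from any of these papers.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


QSOFA_SENSITIVITY = 0.70       # from the prospective ED study quoted above
QSOFA_SPECIFICITY = 0.79       # from the prospective ED study quoted above
HYPOTHETICAL_PREVALENCE = 0.05 # assumption for illustration only

ppv, npv = predictive_values(QSOFA_SENSITIVITY, QSOFA_SPECIFICITY, HYPOTHETICAL_PREVALENCE)

# Expected counts per 1,000 patients screened at the assumed prevalence.
per_1000_false_pos = 1000 * (1 - QSOFA_SPECIFICITY) * (1 - HYPOTHETICAL_PREVALENCE)
per_1000_missed = 1000 * (1 - QSOFA_SENSITIVITY) * HYPOTHETICAL_PREVALENCE

print(f"PPV {ppv:.1%}, NPV {npv:.1%}")
print(f"Per 1,000 screened: about {per_1000_false_pos:.0f} false alarms "
      f"and {per_1000_missed:.0f} missed cases")
```

Under that assumed prevalence, most positive screens are false alarms while a meaningful share of true cases is still missed, which is the no-man's land described above.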

So, unfortunately, these criteria are probably a failed paradigm perpetuating all the same administrative headaches as the previous approach to sepsis – better than SIRS, but still not good enough. We should be pursuing more robust decision-support built into the EHR, not attempting to reinvent overly-simplified instruments without usable discriminatory value.

“Prognostic Accuracy of the SOFA Score, SIRS Criteria, and qSOFA Score for In-Hospital Mortality Among Adults With Suspected Infection Admitted to the Intensive Care Unit”

http://jamanetwork.com/journals/jama/article-abstract/2598267

“Prognostic Accuracy of Sepsis-3 Criteria for In-Hospital Mortality Among Patients With Suspected Infection Presenting to the Emergency Department”

http://jamanetwork.com/journals/jama/fullarticle/2598268

“Quick SOFA Scores Predict Mortality in Adult Emergency Department Patients With and Without Suspected Infection”

http://www.annemergmed.com/article/S0196-0644(16)31219-7/fulltext

Chest X-Ray Utility in Syncope Lost in Translation

Again, straight out of the ACEP Daily News briefing: “Patients Presenting To ED With Complaints Of Syncope Should Still Undergo Routine Chest X-Rays, Research Suggests.”

This accurately reports the lead of the linked lay medical press article: “ED Patients With Syncope Should Undergo Chest X-Rays.”

But, it does not accurately reflect the authors’ discussion or conclusions regarding the utility of chest x-ray in syncope.

This is a retrospective evaluation of patients presenting with syncope and having a chest x-ray between 2003 and 2006 – a secondary analysis of the “Boston Syncope Criteria” study. There were 575 patients included in their analysis, 116 of whom had a defined adverse event within 30 days. Of the patients with positive findings on CXR, 15 of those 18 went on to have an adverse event – and I presume this association led to the perpetuation of this headline.

However, in the greater context: only 18 patients out of 575 had abnormal CXR findings, and even the vast majority of patients with adverse events had normal CXR findings. Then, an obvious selection bias should be clear with regard to obtaining CXR in those patients with the appropriate clinical indications – such as a suspicion for CHF or pneumonia. Patients go on to have adverse events because of the morbidity associated with concomitant clinical syndromes, of which the findings on CXR are only one small part of their evaluation.
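The counts quoted above are enough to rebuild the full 2x2 table, which makes the low yield concrete. This sketch assumes all 575 included patients had a CXR (as the inclusion criteria imply) and treats "positive findings" as the test and a 30-day adverse event as the outcome.

```python
TOTAL_PATIENTS = 575
ADVERSE_EVENTS = 116
ABNORMAL_CXR = 18
ABNORMAL_CXR_WITH_EVENT = 15

# Fill in the remaining cells of the 2x2 table from the quoted counts.
true_pos = ABNORMAL_CXR_WITH_EVENT
false_pos = ABNORMAL_CXR - ABNORMAL_CXR_WITH_EVENT
false_neg = ADVERSE_EVENTS - ABNORMAL_CXR_WITH_EVENT
true_neg = TOTAL_PATIENTS - ADVERSE_EVENTS - false_pos

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
ppv = true_pos / (true_pos + false_pos)
events_with_normal_cxr = false_neg / ADVERSE_EVENTS

print(f"Sensitivity {sensitivity:.1%}, specificity {specificity:.1%}, PPV {ppv:.1%}")
print(f"Adverse events with a normal CXR: {events_with_normal_cxr:.1%}")
```

An abnormal film is strongly associated with an adverse event, but it identifies only a small fraction of the patients who go on to have one.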

In short, no, CXR is so low-yield it need not be performed anywhere remotely near routinely in syncope. It may be performed to evaluate a specific presenting symptom related to a syncopal event, but, if anything, these data should indicate it ought be performed less frequently.

“Utility of Chest Radiography in Emergency Department Patients Presenting with Syncope”
http://westjem.com/original-research/utility-of-chest-radiography-in-emergency-department-patients-presenting-with-syncope.html

Taking First-Time Seizures Seriously

Last week, I covered a disastrous prevalence study that almost certainly over-estimates the frequency of pulmonary embolism in syncope. Today, something similar – the frequency of epilepsy in patients presenting to the Emergency Department with first-time seizure.

The most recent American College of Emergency Physicians position statement regarding first-time seizures is fairly clear: patients with first-time seizures need not be started on anti-epileptic therapy. The thinking goes, of course, that few patients would be ultimately diagnosed with epilepsy, and most of those initiated on anti-epileptics would be exposed only to their adverse effects without any potential for benefit.

This small study tries to better clarify the frequency of an epilepsy diagnosis. At a single center, during convenience business hours Monday through Friday, consecutive patients with first-time seizure of uncertain etiology were screened for enrollment. During their enrollment period, they were able to capture 71 patients for whom they were able to complete an EEG in the Emergency Department. Of these, 15 (21%) patients were diagnosed with epilepsy based on their ED EEG. All of these patients were then initiated on an anti-epileptic, most commonly levetiracetam. Anti-epileptic therapy was additionally started on two patients with abnormal EEGs and structural brain disease on imaging, one of whom was able to be contacted in follow-up with a repeat EEG showing epilepsy. These authors use these data to suggest potential benefit for EEG performed in the ED.

This is a fairly reasonable conclusion, although the level of evidence from this single study is weak. This is probably another example of the ED filling a gap in outpatient follow-up; it would almost certainly be perfectly safe to discharge these patients without investigation or initiation of therapy if an ambulatory EEG could be arranged within a few days. Further, larger-scale evaluation of the value of an ED EEG would be needed to modify our current approach.

“The First-Time Seizure Emergency Department Electroencephalogram Study.”
https://www.ncbi.nlm.nih.gov/pubmed/27745763

The Impending Pulmonary Embolism Apocalypse

After many years of intense effort, our work in recognizing overdiagnosis and over-treatment of pulmonary embolism has been paying off. With the PERC, with adherence to evidence-based guidelines, and with a responsible approach to resource utilization, it is reasonable to suggest we’re making headway against over-investigation of this diagnosis.

Prepare for all that hard work to be obliterated.

This is a prospective study of patients admitted to the hospital for syncope, evaluating each in a systematic fashion for the diagnosis of PE. Consecutive admissions with first-time syncope, who were not currently anticoagulated, underwent risk-stratification using Wells score, D-dimer testing if indicated, and ultimately either CT pulmonary angiograms or V/Q scanning. The top-line result, the big scary number you’re likely seeing circulating the medical and lay news: “among 560 patients hospitalized for a first-time fainting episode, one in six had a pulmonary embolism.”

Prepare for perpetual arguments with the admitting hospitalist for the next several eternities: “Could you go ahead and get a CTPA? You know, 17% of patients with syncope have PE.”

I’d like to tell you they’re wrong, and this study is somehow flawed, and you’ll be able to easily refute their assertions. Unfortunately, yes, they are wrong, and this study is flawed – but it won’t make it any easier to prevent the inevitable downstream overuse of CT.

The primary issue here is the almost certain inappropriate generalization of these results to dissimilar clinical settings. During the study period, there were 2,584 patients presenting to the Emergency Department with a final diagnosis of syncope. Of these, 1,867 were deemed to have an obvious or non-serious alternative cause of syncope and were discharged home. Thus, less than a third of ED visits for syncope were admitted, and the admission cohort is quite old – with a median age for admitted patients of 80 (IQR 72-85). There is incomplete descriptive data given regarding their comorbidities, but the authors state admission criteria included “severe coexisting conditions” and “a high probability of cardiac syncope on the basis of the Evaluation of Guidelines in Syncope Study score.” In short, their admission cohort is almost certainly older and more chronically ill than many practice settings.

Then, there are some befuddling features presented that would serve to inflate their overall prevalence estimate. A full 40.2% of those diagnosed with pulmonary embolism had “Clinical signs of deep-vein thrombosis” in their lower extremities, while 45.4% were tachypneic and 33.0% were tachycardic. These clinical features raise important questions regarding the adequacy of the Emergency Department evaluation; if many of these patients with syncope had symptoms suggestive of PE, why wasn’t the diagnosis made in ED? If even only the patients with clinical signs of DVT were evaluated prior to admission, those imaging studies would have had a yield for PE of 65%, and the prevalence number seen in this study would drop from 17.3% to 10.3%. Further evaluation of either patients with tachypnea or tachycardia might have been similarly high-yield, and further reduced the prevalence of PE in admitted patients.
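The arithmetic behind that drop in prevalence can be sketched from the percentages quoted above. The patient counts here are back-calculated and rounded, so they are approximations rather than figures taken from the paper. The second estimate, which also removes the scanned patients from the denominator, is an alternative assumption included only to show how sensitive the result is to that choice.

```python
ADMITTED = 560
REPORTED_PREVALENCE = 0.173
FRACTION_OF_PE_WITH_DVT_SIGNS = 0.402
IMAGING_YIELD_IF_DVT_SIGNS = 0.65

# Back-calculated counts (rounded approximations, not taken from the paper).
pe_total = round(ADMITTED * REPORTED_PREVALENCE)
pe_with_dvt_signs = round(pe_total * FRACTION_OF_PE_WITH_DVT_SIGNS)
scanned_in_ed = round(pe_with_dvt_signs / IMAGING_YIELD_IF_DVT_SIGNS)
pe_remaining = pe_total - pe_with_dvt_signs

# Prevalence if the ED had already caught every PE with signs of DVT,
# keeping the original denominator of admitted patients.
prevalence_same_denominator = pe_remaining / ADMITTED

# Alternative assumption: everyone scanned in the ED leaves the study cohort.
prevalence_reduced_denominator = pe_remaining / (ADMITTED - scanned_in_ed)

print(f"PE diagnosed overall: about {pe_total}")
print(f"PE remaining after ED work-up of DVT signs: about {pe_remaining}")
print(f"Prevalence, same denominator: {prevalence_same_denominator:.1%}")
print(f"Prevalence, reduced denominator: {prevalence_reduced_denominator:.1%}")
```

Either way, the headline figure falls substantially once the patients with obvious clinical signs are accounted for.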

Lastly, any discussion regarding a prevalence study requires mention of the gold-standard for diagnosis. CTPA confirmed the diagnosis of PE in 72 patients in this study. Of these, 24 involved a segmental or sub-segmental pulmonary artery – vessels in which false-positive results typically represent between one-quarter and one-half of positive findings. Then, V/Q scanning was used to confirm the diagnosis of PE in 24 patients. Of these, the perfusion defect represented between 1% and 25% of the area of both lungs in 12 patients. I am not familiar with the rate of false-positives in the context of small perfusion defects on V/Q, but undoubtedly a handful of these would be false positives as well. Add this to the inadequate ED evaluation of these patients, and suddenly we’re looking at only a handful of true-positive occult PE in this elderly, chronically ill cohort with syncope.

My view of this study is that its purported take-home point regarding the prevalence of PE in syncope is grossly misleading, yet this “one in six” statistic is almost guaranteed to go viral among those on the other side of the admission fence.  This study should not change practice – but I fear it almost certainly will.

“Prevalence of Pulmonary Embolism among Patients Hospitalized for Syncope”

http://www.nejm.org/doi/full/10.1056/NEJMoa1602172

Stumbling Around Risks and Benefits

Practicing clinicians contain multitudes: the vastness of critical medical knowledge applicable to the nearly infinite permutations of individual patients. However, lost in the shuffle is apparently a grasp of the fundamentals necessary for shared decision-making: the risks, benefits, and harms of many common treatments.

This simple research letter describes a survey distributed to a convenience sample of residents and attending physicians at two academic medical centers. Physicians were asked to estimate the incidence of a variety of effects from common treatments, both positive and negative. A sample question and result:

[Figure: treatment effect estimates]
The green responses are those which fell into the correct range for the question. As you can see, in these two questions, hardly any physician surveyed guessed correctly.  This same pattern is repeated for the remaining questions – involving peptic ulcer prevention, cancer screening, and bleeding complications on aspirin and anticoagulants.

Admittedly, only a quarter of participants were attending physicians – though no gross differences in performance were observed between various levels of experience. Then, some of the ranges are narrow with small magnitudes of effect between the “correct” and “incorrect” answers. Regardless, however, the general conclusion of this survey – that we’re not well-equipped to communicate many of the most common treatment effects – is probably valid.

“Physician Understanding and Ability to Communicate Harms and Benefits of Common Medical Treatments”
http://www.ncbi.nlm.nih.gov/pubmed/27571226

High Blood Pressure is Not a Crime

And you don’t need to be sent to “time out” – i.e., referred to the Emergency Department – solely because of it.

This is a retrospective, single-center report regarding the incidence of adverse events in patients found to have “hypertensive urgency” in the outpatient setting. This was defined formally as any systolic blood pressure measurement ≥180 mmHg or diastolic measurement ≥110 mmHg. Their question of interest was, specifically, whether patients referred to the ED received a clinically-important diagnosis (“major adverse cardiovascular events”), with a secondary interest in whether their blood pressure was under better control at future outpatient visits.

Over their five-year study period, there were 58,535 patient encounters meeting their criteria for “hypertensive urgency”. Astoundingly, only 426 were referred to the Emergency Department. Of those referred to the ED, 2 (0.5%) received a MACE diagnosis within 7 days, compared with 61 (0.1%) of the remaining 58,109. By 6 months, MACE had equalized between the two populations – now 4 (0.9%) in the ED referral cohort compared with 492 (0.8%) in those sent home. Hospital admission, obviously, was higher in those referred to the ED, but apparently conferred a small difference in blood pressure control in follow-up.
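As a quick sanity check, the quoted percentages can be reproduced from the raw counts. The sketch uses only the two cohort sizes and event counts given above; the `percent` helper is just a convenience for this example.

```python
def percent(events, total):
    """Event rate expressed as a percentage."""
    return 100 * events / total


REFERRED_TO_ED = 426
SENT_HOME = 58_109

MACE_7_DAYS = {"referred": 2, "sent_home": 61}
MACE_6_MONTHS = {"referred": 4, "sent_home": 492}

rates = {
    "7 days, referred": percent(MACE_7_DAYS["referred"], REFERRED_TO_ED),
    "7 days, sent home": percent(MACE_7_DAYS["sent_home"], SENT_HOME),
    "6 months, referred": percent(MACE_6_MONTHS["referred"], REFERRED_TO_ED),
    "6 months, sent home": percent(MACE_6_MONTHS["sent_home"], SENT_HOME),
}

# Share of all qualifying encounters that ended in an ED referral.
referral_share = percent(REFERRED_TO_ED, REFERRED_TO_ED + SENT_HOME)

for label, rate in rates.items():
    print(f"MACE at {label}: {rate:.1f}%")
print(f"Referred to the ED: {referral_share:.2f}% of encounters")
```

With only 2 and 4 events in the referral cohort, the absolute rates are tiny and the apparent early difference rests on a handful of patients.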

The authors go on to perform a propensity-matched comparison of the ED referrals to the sent home cohort, but this is largely uninsightful.  The more interesting observation is simply that these patients largely do quite well – and any adverse events probably happen at actuarial levels rather than having any specific relationship to the index event.

I appreciate how few patients were ultimately referred to the Emergency Department in this study; fewer than 1% is an inoffensive number.  That said, zero percent would be better.

“Characteristics and Outcomes of Patients Presenting With Hypertensive Urgency in the Office Setting”
http://archinte.jamanetwork.com/article.aspx?articleid=2527389

Triaging Large Artery Occlusions

Endovascular intervention for acute stroke can be quite useful – in appropriately selected patients. However, few centers are capable of such interventions, and the technology to properly evaluate angiographically for large-artery occlusions is not available in all settings. Thus, it is just as critical for patients to be clinically screened in some fashion to prevent over-utilization of scarce resources.

These authors retrospectively reviewed 1,004 acute stroke patients admitted to their facility since 2008, 328 of whom had large-vessel occlusions: ICA, M1, or basilar artery. They calculated the accuracy, sensitivity, and specificity of multiple different potential clinical scoring systems and cut-offs. Unfortunately, every score made some trade-off – either in the rate of false-negative results excluding patients from potential intervention, or in the rate of false-positive results serving to simply subject every patient to advanced imaging. The maximum accuracy of all their various scores topped out at around 78%.

The authors’ conclusions are reasonable, if a little limited. They feel every patient presenting with an acute stroke within 6 hours of symptom onset should undergo vascular imaging. This is reasonable, but ignores one of the major uses for simple clinical scoring systems: prehospital triage. Admitting none of these are perfect, _something_ must be put to use – and, probably, given the current bandwidth for endovascular intervention, something with the highest specificity.

For what it’s worth, we use RACE to triage for CT perfusion, but CPSSS, ROSIER, or just NIHSS cut-offs around 10 would all be fair choices.

“Clinical Scales Do Not Reliably Identify Acute Ischemic Stroke Patients With Large-Artery Occlusion”
https://www.ncbi.nlm.nih.gov/pubmed/27125526

The Biomarker for Burnout

I’m tired.  You’re tired.  We’re all tired.  Importantly – performance suffers with exhaustion, unhealthy behaviors at work increase, and cognitive errors at work rise.  Burnout.

And now there might be a test for it.

This is a small study of resident trainees in Turkey, correlating the levels of neurotrophic factor S100 calcium binding protein B with symptoms of Burnout Syndrome – emotional exhaustion, depersonalization, and personal accomplishment.  S100B is a marker of glial activation and brain injury, and seems to fluctuate with stress and depression, although the associations have not been shown to be reliable.

Each resident trainee was asked to complete a questionnaire regarding burnout prior to, and following, a night shift, along with a concomitant blood draw. Unfortunately, the results are primarily grim, and not on account of the primary outcome: 37 of 48 participating residents scored in the severe depression category on the burnout questionnaire. The remaining 11 scored in the moderate range.

Looking at the actual purpose of the study, however, they did find S100B levels were significantly different between severe and moderate depression, even accounting for the small sample.  The pre- and post-night shift levels were not appreciably different.  Overall, S100B seemed to correlate best with the overall burnout score, in particular the subscore for emotional exhaustion.

It’s a little hard to interpret these data, or envision how they might be applied in a real-world situation.  It does seem a reasonable biomarker to pursue as an objective measure of the stresses of training, and, frankly, it may be the on-shift changes were not detected as a result of most residents already exhibiting features of high stress and burnout even before starting their night.  Then, even assuming S100B were proven valid, the “gold standard” in this case – the burnout inventory – is probably less expensive and certainly less invasive to deploy.

I am not certain of the way forward for this line of burnout biomarker research, but it is rather interesting.

“Serum S100B as a surrogate biomarker in the diagnoses of burnout and depression in emergency medicine residents.”
https://www.ncbi.nlm.nih.gov/pubmed/27018399