Another Taste of the Future

Putting my Emergency Informatics hat back on for a day, I’d like to highlight another piece of work that brings us, yet again, another step closer to being replaced by computers.

Or, at the minimum, being highly augmented by computers.

There are multitudinous clinical decision instruments available to supplement physician decision-making.  However, the unifying element of most instruments is the requirement for manual physician input.  This interruption of clinical flow reduces acceptability of use, and impedes the knowledge translation these tools are meant to provide.

However, since most clinicians are utilizing Electronic Health Records, we’re already entering the information required for most decision instruments into the patient record.  Usually, this is a combination of structured (click click click) and unstructured (type type type) data.  Structured data is easy for clinical calculators to work with, but has none of the richness communicated by freely typed narrative.  Therefore, clinicians much prefer to utilize typed narrative, at the expense of EHR data quality.

This small experiment out of Cincinnati implemented a natural-language processing and machine-learning automated method to collect information from the EHR.  Structured and unstructured data from 2,100 pediatric patients with abdominal pain were analyzed to extract the elements needed to calculate the Pediatric Appendicitis Score.  Appropriateness of the Pediatric Appendicitis Score aside, their method performed reasonably well.  It picked up about 87% of the elements of the Score from the record and, when it did, was correct about 86% of the time.  However, this was performed retrospectively – and the authors state this processing would still be substantially delayed by hours following the initial encounter.
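The underlying approach – mapping structured fields and free-text mentions onto score elements – can be sketched in miniature.  Everything below (field names, keyword lists) is invented for illustration, and is far cruder than the actual NLP/ML pipeline the authors built:

```python
# Toy illustration of pulling Pediatric Appendicitis Score elements
# out of a chart: structured fields map directly, while free text is
# scanned with naive keyword matching. Field names and keywords are
# hypothetical; the study used a real NLP/machine-learning pipeline.

# Free-text cues for three history/exam elements (hypothetical)
TEXT_CUES = {
    "anorexia": ["anorexia", "decreased appetite", "not eating"],
    "migration_of_pain": ["pain migrated", "periumbilical then rlq"],
    "rlq_tenderness": ["rlq tenderness", "right lower quadrant tenderness"],
}

def extract_pas_elements(structured, note_text):
    """Return a dict of score elements found in the record."""
    note = note_text.lower()
    elements = {}
    # Structured vitals/labs map directly onto score elements
    elements["fever"] = structured.get("temp_c", 0.0) >= 38.0
    elements["leukocytosis"] = structured.get("wbc", 0.0) >= 10.0
    # Free text: crude keyword search stands in for real NLP
    for element, cues in TEXT_CUES.items():
        elements[element] = any(cue in note for cue in cues)
    return elements

found = extract_pas_elements(
    {"temp_c": 38.4, "wbc": 14.2},
    "Pain migrated from periumbilical region to RLQ. RLQ tenderness on exam.",
)
print(found)
```

The hard part, of course, is everything this sketch elides – negation, misspellings, abbreviations, and narrative context – which is precisely where the study's method lost its ~13% of elements.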

So, we’re not quite yet at the point where a parallel process monitors system input and provides real-time diagnostic guidance – but, clearly, this is a window into the future.  The theory:  if an automated process could extract the data required to calculate the score, physicians might be more likely to integrate the score into their practice – thusly leading to higher quality care through more accurate risk-stratification.

I, for one, welcome our new computer overlords.

“Developing and evaluating an automated appendicitis risk stratification algorithm for pediatric patients in the emergency department”

Foam, Actually

I chastise JAMA on occasion, but, any article that starts like this is the mark of a truly great academic publisher:

“The lights are low and the music volume is high.  As arms and legs sway on a packed dance floor, streams of soapy suds blow down from the ceiling….”

No, it’s not a ‘tween reviewing an illegal high during a rave – it’s actually CDC surveillance of a spike in eye injuries resulting from “foam parties”.  This write-up details an investigation in Collier County, Florida, in which more than 40 patients sought care for eye irritation and pain in a single night.  These patients all received ocular inoculation with foam during the course of revelry, and over half were ultimately diagnosed with corneal abrasions.  The cause: highly concentrated chemicals such as sodium lauryl sulfate and other proprietary mixtures similar to those found in soaps and shampoos.

So, beware the foam! (but not the FOAM).

“Party Alert: Here’s Foam in Your Eye”
http://www.ncbi.nlm.nih.gov/pubmed/24129456

“Notes from the Field: Eye Injuries Sustained at a Foam Party — Collier County, Florida 2012”
http://1.usa.gov/1d69xek

Watch & Wait For Stab Wounds

Thankfully, very few of us actually deal with these sorts of injuries on a regular basis – and even fewer of us are actually responsible for managing these injuries.

However, this is an important article out of USC pushing back against the trend towards utilizing CT for every traumatic injury possible.  There certainly seems, universally in medicine, to be a retreat from reliance on the clinical examination, along with a corresponding increase in the use of technology.  There are many reasons this occurs – convenience, patient satisfaction, and a “zero-miss” mentality – and we’re only now fully accounting for the tremendous costs associated with this flawed evolution in practice.

In this study, all diagnostically equivocal abdominal stab wounds underwent a structured protocol including CT and observation.  Over a two-year period, 177 stable patients qualified for this protocol.  Overall, 87% were managed non-operatively – but, most importantly, clinical deterioration directed all necessary operative interventions, rather than CT findings.  Of the 23 patients who underwent operative intervention, 4 did so based solely on CT findings – and all four explorations were negative for injury.  The final test characteristics for CT were a sensitivity of 31.3% and a specificity of 84.2%.
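Plugging the reported test characteristics into likelihood ratios makes the case even plainer – a quick sketch:

```python
# Likelihood ratios from the reported CT test characteristics
# (sensitivity 31.3%, specificity 84.2%). An LR+ near 2 and an
# LR- near 0.8 barely shift the post-test probability in either
# direction.
sensitivity = 0.313
specificity = 0.842

lr_positive = sensitivity / (1 - specificity)   # ~1.98
lr_negative = (1 - sensitivity) / specificity   # ~0.82

print(f"LR+ = {lr_positive:.2f}, LR- = {lr_negative:.2f}")
```

A test with likelihood ratios this close to 1.0 is nearly uninformative – which is exactly why clinical observation carried the day.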

I think these authors are entirely appropriate in describing the use of CT in abdominal stab wounds as inferior to clinical observation.  They don’t specifically emphasize the false positives from CT in their discussion, but these findings lead to real patient harms – even just in their small cohort.  One of the four patients taken to the operating room on CT findings alone underwent a negative pericardial window for suspected hemopericardium – and suffered a peri-operative cardiac arrest due to complications from anesthesia.

Let’s try to avoid that.

“Prospective Evaluation of the Role of Computed Tomography in the Assessment of Abdominal Stab Wounds”
http://www.ncbi.nlm.nih.gov/pubmed/23824102

Ocorrafoo Cobange & Grace Groovy

There are many challenges in the world of scientific publishing.  I’ve spent time focusing on conflict-of-interest in peer review – but this is a fabulous exposé on a rapidly-growing segment of for-profit, predatory journals with no substantial peer-review process whatsoever.

This is more a journalistic project than a scientific study, but it details the results of an experiment performed to determine the rigor of peer review at a number of open-access journals.  Open-access journals, in contrast to most journals, derive their operating income by charging authors a publication fee – rather than charging for print subscription, reprints, or individual article access.  Proponents of this model point to the beneficial effect open-access has on medical knowledge in countries without the means and wealth required to join the first-world scientific community.  However, this model has essentially been hijacked by editors who use a predatory process of fraudulent representation to run these journals solely for profit.

This journalist created a fake, flawed, and horribly written (he ran it through Google Translate into French, and back) molecular cancer therapeutics article and submitted it to over 300 open-access journals – half of these journals were on a “blacklist” and the remainder were not.  Without completely reiterating the entire story (or the mystery of Grace Groovy), the crux is this – nearly every journal on the “blacklist” accepted the fictional article, while journals on the more reliable list rejected it about 2/3rds of the time.

Lovely piece of investigative journalism and an entertaining read – and an excellent description of a growing problem in the realm of scientific integrity.

“Who’s Afraid of Peer Review?”
http://www.sciencemag.org/content/342/6154/60.summary

Yet More Unnecessary Antibiotics

One would think educational efforts regarding the inefficacy of antibiotics for viruses would at some point take root.  I’m pretty sure it’s explicitly covered in our medical school curriculum that antibiotics are indicated specifically to treat infection caused by bacteria.  Despite this, however, the overwhelming evidence is that clinicians have somehow forgotten these basic fundamentals of medicine.

This is a research letter reviewing the National Ambulatory Medical Care Survey and National Hospital Ambulatory Medical Care Survey (NAMCS, NHAMCS) databases for visits to primary care or Emergency Departments for code 1455: “sore throat”.  Considering the prevalence of group A strep infection is about 10%, and a small number of additional cases are pathogenic bacterial infections requiring acute treatment, I suppose you would expect the rate of antibiotic prescribing to be quite low?

Is 60% the number you had in mind?

U.S. physicians have been prescribing antibiotics at a mostly steady rate of 60% of visits for sore throat for over a decade.  Not only are they prescribing antibiotics, almost half of prescriptions are for non-ß-lactam antibiotics – including a huge proportion of azithromycin, as if there weren’t enough macrolide-resistant streptococci out there.  Beyond that, a full 15% were not even close to options on the recommended list for acute sore throat, such as fluoroquinolones.

When smart folks like David Newman are calling for the end of routine treatment of strep throat, certainly we are way off base with a 60% rate of treatment.  Forget about OP-15 – we should have a quality measure based on rates of antibiotic prescription for sore throat and ear pain.

“Antibiotic Prescribing to Adults With Sore Throat in the United States, 1997-2010”

Which Pulmonary Emboli Are Missed?

Apparently, as many as one-third of them!

This is a retrospective study from a Spanish hospital evaluating all patients presenting through the Emergency Department who subsequently received a chest CTA revealing pulmonary embolism.  These diagnoses were further classified as having received the diagnosis of PE on initial presentation, during hospitalization, or on a return visit to the Emergency Department.  66% of patients diagnosed with PE were diagnosed on the initial visit, while 22% were diagnosed only after hospital admission, and 12% on Emergency Department revisit.  This leads to the authors’ conclusion that delayed diagnosis of acute PE is frequent despite current diagnostic strategies.

While it’s only a single-center study, and the frequency of missed diagnoses may not be generalizable, it’s still a reasonable investigation.  The characteristics of patients with missed PE fit the typical spectrum from prior studies: confounding comorbidities and diagnostic findings.  Patients with delayed diagnosis had fewer typical features, were more likely to have COPD or asthma, more likely to have fever, and more likely to have pulmonary infiltrates.  The authors state there were no mortality differences between early and delayed PE diagnosis, but the study is too small and heterogeneous to truly put much faith in this observation.  Of note, 41% of patients who were initially discharged from the ED had unilateral subsegmental clot, a far greater proportion than either other diagnostic group.

It certainly makes sense that patients with dyspnea and other potential causes will have their diagnosis delayed until their lack of response to therapy results in reassessment.  These authors suggest we ought to be more aggressive in our evaluation for PE in the Emergency Department; I tend to feel the delayed diagnosis in confounding situations is appropriate, and suspect some of these represent subclinical disease.  “Zero-miss” is only appropriate if the harms from the disease outweigh the harms of testing and treatment – and follow-up re-evaluation or additional testing during acute hospitalization are reasonable pathways to diagnosis in a subset of patients.

“Clinical features of patients inappropriately undiagnosed of pulmonary embolism”
http://www.ncbi.nlm.nih.gov/pubmed/24060320

Replace Us With Computers!

In a preview to the future – who performs better at predicting outcomes, a physician, or a computer?

Unsurprisingly, it’s the computer – and the unfortunate bit is we’re not exactly going up against Watson or the hologram doctor from the U.S.S. Voyager here.

This is Jeff Kline, showing off his rather old, not terribly sophisticated “attribute matching” software.  This software, created back in 2005-ish, is based on a database he compiled of acute coronary syndrome and pulmonary embolism patients.  He determined a handful of most-predictive variables from this set, and then built a tool that allows physicians to input those specific variables from a newly evaluated patient.  The tool then finds the exact matches in the database and spits back a probability estimate based on the historical reference set.

He sells software based on the algorithm and probably would like to see it perform well.  Sadly, it only performs “okay”.  But, it beats physician gestalt, which is probably better ranked as “poor”.  In their prospective evaluation of 840 cases of acute dyspnea or chest pain of uncertain immediate etiology, physicians (mostly attendings, then residents and midlevels) grossly over-estimated the prevalence of ACS and PE.  Physicians had a mean and median pretest estimate for ACS of 17% and 9%, respectively, and the software guessed 4% and 2%.  Actual retail price:  2.7%.  For PE, physicians were at mean 12% and median 6%, with the software at 6% and 5%.  True prevalence: 1.8%.
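The “attribute matching” idea itself is simple enough to sketch: look up prior patients whose predictive variables exactly match the new patient, and report the outcome rate among those matches.  The toy database, attribute names, and values below are all invented for illustration – Kline’s actual variable set and reference data are his own:

```python
# Minimal sketch of "attribute matching": the pretest probability
# estimate is just the outcome rate among historical patients whose
# predictive attributes exactly match the new patient's. The toy
# database and attribute names are invented for illustration.

REFERENCE_DB = [
    # (age_band, sex, chest_pain_typical, outcome_ACS)
    ("50-59", "M", True, True),
    ("50-59", "M", True, False),
    ("50-59", "M", True, False),
    ("50-59", "M", True, False),
    ("30-39", "F", False, False),
]

def attribute_match_estimate(patient, db):
    """Probability estimate from exact matches; None if no match."""
    matches = [rec[-1] for rec in db if rec[:-1] == patient]
    if not matches:
        return None
    return sum(matches) / len(matches)

p = attribute_match_estimate(("50-59", "M", True), REFERENCE_DB)
print(f"Estimated pretest probability: {p:.0%}")  # 1 of 4 matches -> 25%
```

The obvious limitation is sparsity – with more than a handful of variables, exact matches become rare, which is one reason the variable set has to stay small.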

I don’t choose this article to highlight Kline’s algorithm, nor the comparison between the two.  Mostly, it’s a fascinating observational study of how poor physician estimates are – far over-stating risk.  Certainly, with this foundation, it’s no wonder we’re over-testing folks in nearly every situation.  The future of medicine involves the next generation of similar decision-support instruments – and we will all benefit.

“Clinician Gestalt Estimate of Pretest Probability for Acute Coronary Syndrome and Pulmonary Embolism in Patients With Chest Pain and Dyspnea.”
http://www.ncbi.nlm.nih.gov/pubmed/24070658

The Grim Outcomes Brigade

This study aims to answer the critically important question we want to answer for every single patient – if we send them home, will they come back dead?

This is a retrospective medical record review of Kaiser health plan patients presenting to Kaiser Permanente hospitals in California who were discharged from the Emergency Department – and then died within 7 days.  They excluded hospice, DNR, and DNI patients from this analysis, and identified 446,120 discharges over the course of two years resulting in 203 deaths (0.05%).  These authors performed a qualitative chart review of 61 patients evaluating for common features.

A few highlights from their common themes:

  • Unexplained persistent acute mental status change – probably a difficult discharge to justify.
  • Documentation of “ill-appearing” or “moderate distress” – certainly some folks improve, but not the 89-year-old reviewed by these authors.
  • Abnormal vital signs – the classic cautionary tale, which ought to have been easier in a hypoxic 69-year-old.
  • Misdiagnoses due to inadequate differential diagnosis – the other classic, unavoidable, human error.
  • Admission plan changed – a potentially risky play to discharge when another physician feels a patient warrants admission.

This is simply a basic descriptive study.  The elements described herein should not be implied to be specific causal factors, but hypothesis generating for future work.  It is, however, a fascinating topic and part of the ultimate question at the root of nearly all our patient encounters.

“Qualitative Factors in Patients Who Die Shortly After Emergency Department Discharge”

http://www.ncbi.nlm.nih.gov/pubmed/24033620

The “Ottawa SAH Rule”

This is a rather dangerous article for many reasons.  Firstly, it’s published in a high-impact journal and received a fair bit of coverage in the news media.  Secondly, it concludes its discussion by suggesting this ought to be adopted as a standardized rule for the evaluation of acute headache – this isn’t just a descriptive study on features of subarachnoid hemorrhage, it’s been given an official-sounding title, the “Ottawa SAH Rule”.  Because of this, there’s significant potential for the rule described here to be adopted as widespread practice.

Therefore – it better be nearly perfect.

This is a prospective cohort from 10 university-affiliated Canadian hospitals.  They looked at non-traumatic headaches reaching maximal intensity within 1 hour, not part of a recurrent headache syndrome, and found 132 patients with SAH out of 2,131 assessed.  They specifically gathered information on three previously-derived prediction rules and found none of them were 100% sensitive – so they chose the required elements from each to reach 100% (95% CI 97.2-100) sensitivity.  The cost of this 100% sensitivity?  Degeneration of specificity from the 28-35% of the three individual rules down to 15% (95% CI 13.8-16.9) in the final rule.  The authors observed that application of the derived rule would have decreased investigations for SAH from 84% of the enrolled cohort down to 74%, and thusly conclude their rule is superior to routine clinical practice by maintaining 100% sensitivity while decreasing resource utilization.

I think their inclusion criteria are fine – a rapid-onset, severe, atraumatic headache is the classical population of interest.  Patients without this feature have such a low incidence of SAH that it’s unreasonable to evaluate for it.  Their outcome measures, unfortunately, were a little softer.  The positive diagnoses are reasonable – CT proven SAH or positive lumbar puncture with a source feature on cerebral angiography.  However, only 82% underwent CT and 39% underwent LP, with a six month telephone follow-up – and a small number were lost to follow-up.  Many would argue that CT alone is not sufficient for ruling out SAH, and 6-month survival is a limited proxy.  This weakens its claim for 100% sensitivity.

Then, of course, a 15% specificity is awful.  This isn’t necessarily a criticism of the authors, but more a recognition of the limitation of distilling diverse clinical data into concise decision instruments.  19 different patient features were significantly different between the SAH and no-SAH groups; reducing this to just 6 features discards so much information that an instrument designed for a complex clinical prediction is bound to fail.  There were 1,694 false positives by the rule compared with 132 true positives.  If this rule is applied without the strict exclusion criteria specified in the publication, there may be a huge number of inappropriate investigations.
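Those counts translate directly into the rule’s positive predictive value – a quick calculation from the numbers reported:

```python
# Positive predictive value implied by the Ottawa SAH Rule counts:
# 132 true positives vs 1,694 false positives among 2,131 patients.
true_pos = 132
false_pos = 1694
non_sah = 2131 - 132                     # 1,999 patients without SAH

specificity = 1 - false_pos / non_sah    # ~15%
ppv = true_pos / (true_pos + false_pos)  # ~7%

print(f"Specificity: {specificity:.1%}, PPV: {ppv:.1%}")
```

In other words, roughly 13 of every 14 patients flagged by the rule do not have a subarachnoid hemorrhage.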

Then, the rate of investigation comparison is also probably invalid.  These institutions underwent a specific 1-hour orientation to the study being performed and were actively involved in gathering clinical data for it.  I’m certain the 84% rate of investigation observed was inflated by the ongoing research at hand.  The previous Perry study from 2011 had an evaluation rate of 57%, so it’s hard for me to believe the statistics from the current publication.

Finally, the kappa values for inter-observer agreement were rather mixed.  Based on only sixty cases where two physicians evaluated the same patient, four of the six final rule elements had kappas between 0.44 and 0.59, representing only moderate agreement.  This is a significant threat to the internal validity of the underlying data in support of their rule.

Overall, yes – the elements they identify through their observational cohort are likely to capture most cases of SAH.  However, the limitations of this study and the poor specificity make me reluctant to buy in completely – and certainly not adopt it as a standardized “rule”.

“Clinical Decision Rules to Rule Out Subarachnoid Hemorrhage for Acute Headache”
http://www.ncbi.nlm.nih.gov/pubmed/24065011

Death From a Thousand Clicks

The modern physician – one of the most highly-skilled, highly-compensated data-entry technicians in history.

This is a prospective, observational evaluation of physician activity in the Emergency Department, focusing mostly on the time spent in interaction with the electronic health record.  Specifically, they counted mouse clicks during various documentation, order-entry, and other patient care activities.  The observations were conducted for 60-minute time periods, and then extrapolated out to an entire shift, based on multiple observations.

The observations were taken from a mix of residents, attendings, and physician extenders, and offer a lovely glimpse into the burdensome overhead of modern medicine: 28% of time was spent in patient contact, while 44% was spent performing data-entry tasks.  It requires 6 clicks to order an aspirin, 47 clicks to document a physical examination of back pain, and 187 clicks to complete an entire patient encounter for an admitted patient with chest pain.  This extrapolates out, at a pace of 2.5 patients per hour, to ~4000 clicks for a 10-hour shift.
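The extrapolation arithmetic is simple; note the average clicks per encounter below is a hypothetical figure chosen to land in the paper’s ballpark, since a full admitted chest-pain workup (187 clicks) sits at the high end of the case mix:

```python
# Back-of-envelope shift extrapolation. The 160-click per-encounter
# average is a hypothetical blend of simple and complex encounters
# (an admitted chest pain runs 187 clicks); the study's own ~4000
# figure came from directly observed click rates.
patients_per_hour = 2.5
shift_hours = 10
avg_clicks_per_encounter = 160   # hypothetical average

clicks_per_shift = patients_per_hour * shift_hours * avg_clicks_per_encounter
print(f"~{clicks_per_shift:.0f} clicks per shift")  # ~4000
```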

The authors propose a more efficient documentation system would result in increased time available for patient care, increased patients per hour, and increased RVUs per hour.  While the numbers they generate from this sensitivity analysis for productivity increase are essentially fantastical, the underlying concept is valid: the value proposition for these expensive, inefficient electronic health records is based on maximizing reimbursement and charge capture, not by empowering providers to become more productive.

The EHR in use in this study is McKesson Horizon – but, I’m sure these results are generalizable to most EHRs in use today.

“4000 Clicks: a productivity analysis of electronic medical records in a community hospital ED”
http://www.ncbi.nlm.nih.gov/pubmed/24060331