It’s Sepsis-Harassment!

The computer knows all in modern medicine. The electronic health record is the new Big Brother, all-seeing, never un-seeing. And it sees “sepsis” – a lot.

This is a report on the downstream effects of an electronic sepsis alert system at an academic medical center. Their sepsis alert system was based loosely on systemic inflammatory response syndrome (SIRS) criteria for the initial warning to nursing staff, followed by additional alerts triggered by hypotension or elevated lactate. These alerts prompted use of sepsis order sets or triggering of internal “sepsis alert” protocols. The outcomes of interest were length-of-stay and in-hospital mortality.

At first glance, the alert appears to be a success – length of stay dropped from 10.1 days to 8.6, and in-hospital mortality from 8.5% to 7.0%. It would have been quite simple to stop there and trumpet these results as favoring the alerts, but the additional analyses performed by these authors demonstrate otherwise. Both length-of-stay and mortality were already trending downward independently of the intervention, and in the adjusted analyses, none of the improvements could be conclusively tied to the sepsis alerts – with some of the apparent improvement likely relating to diagnoses of less-severe cases of sepsis prompted by the alert itself.

What is not debatable, however, is the burden on clinicians and staff. During their ~2.5 year study period, the sepsis alerts were triggered 97,216 times – 14,207 of which occurred in the 2,144 patients subsequently receiving a final diagnosis of sepsis. The SIRS-based alerts comprised most (83,385) of these alerts, but captured only 73% of those with an ultimate diagnosis of sepsis, while having only a 13% true positive rate. The authors’ conclusion gets it right:

Our results suggest that more sophisticated approaches to early identification of sepsis patients are needed to consistently improve patient outcomes.
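For a sense of scale, the alert burden can be tallied directly from the counts reported above – a back-of-the-envelope sketch using only the figures quoted in this post:

```python
# Back-of-the-envelope alert burden, using only the counts reported above.
total_alerts = 97_216          # sepsis alerts fired over the ~2.5 year period
alerts_in_sepsis_pts = 14_207  # alerts in patients ultimately diagnosed with sepsis
sepsis_diagnoses = 2_144       # final diagnoses of sepsis

ppv_per_alert = alerts_in_sepsis_pts / total_alerts  # chance any one alert is "true"
alerts_per_case = total_alerts / sepsis_diagnoses    # total alerts per confirmed case

print(f"Per-alert positive predictive value: {ppv_per_alert:.1%}")   # ~14.6%
print(f"Alerts fired per confirmed sepsis case: {alerts_per_case:.0f}")  # ~45
```

Roughly forty-five alerts for every confirmed case of sepsis – a useful single number for the interruption cost.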

“Impact of an emergency department electronic sepsis surveillance system on patient mortality and length of stay”

No Change in Ordering Despite Cost Information

Everyone hates the nanny state. When the electronic health record alerts and interrupts clinicians incessantly with decision-“support”, it results in all manner of deleterious unintended consequences. Passive, contextual decision-support has the advantage of avoiding this intrusiveness – but is it effective?

It probably depends on the application, but in this trial, it was not. This is the PRICE (Pragmatic Randomized Introduction of Cost data through the Electronic health record) trial, in which 75 inpatient laboratory tests were randomized to display of usual ordering, or ordering with contextual Medicare cost information. The hope, and study hypothesis, was that the availability of this cost information would exert a cultural pressure of sorts on clinicians to order fewer tests, particularly those with high costs.

Across three Philadelphia-area hospitals comprising 142,921 hospital admissions in a two-year study period, there were no meaningful differences in lab tests ordered per patient-day between the intervention and control arms. Looking at various subgroups of patients, it is also unlikely there were particularly advantageous effects in any specific population.

Interestingly, one piece of feedback the authors report is that residents suggested most of their routine lab test ordering resulted from admission order sets. “Routine” daily labs are set in motion at the time of admission, not as part of a daily assessment of need, and are thus a natural impediment to improving low-value testing. However, the authors also note – and this is probably most accurate – that because the cost information was displayed ubiquitously, physicians likely became numb to the intervention. It is reasonable to expect substantially more selective cost information could have focused effects on an area of particularly high-cost or low-value testing.

“Effect of a Price Transparency Intervention in the Electronic Health Record on Clinician Ordering of Inpatient Laboratory Tests”

Oh, The Things We Can Predict!

Philip K. Dick presented us with a short story about the “precogs”, three mutants that foresaw all crime before it could occur. “The Minority Report” was written in 1956 – and, now, 60 years later we do indeed have all manner of digital tools to predict outcomes. However, I doubt Steven Spielberg will be adapting a predictive model for hospitalization for cinema.

This is a rather simple article looking at a single-center experience using multivariate logistic regression to predict hospitalization. This differs, somewhat, from the existing art in that it uses data available at 10, 60, and 120 minutes from arrival to the Emergency Department as the basis for its “progressive” modeling.
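The general shape of “progressive” modeling is straightforward: fit one model per time window, each restricted to the features available by that point in the visit. A minimal sketch, using entirely synthetic data and hypothetical feature groups (triage vitals, then early orders) rather than anything from the paper:

```python
# A minimal sketch of "progressive" prediction: one logistic regression per
# time window, each fit only on the features available by that point in the
# visit. All features and data here are synthetic, not from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5000
X_triage = rng.normal(size=(n, 4))                        # vitals known by ~10 min
X_orders60 = rng.binomial(1, 0.3, (n, 3)).astype(float)   # orders placed by 60 min
X_orders120 = rng.binomial(1, 0.3, (n, 3)).astype(float)  # orders placed by 120 min
logit = X_triage @ np.array([1.2, -0.8, 0.5, 0.3]) + X_orders60 @ np.array([1.5, 0.9, 0.4])
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))             # 1 = hospitalized

windows = {
    "10 min": X_triage,
    "60 min": np.hstack([X_triage, X_orders60]),
    "120 min": np.hstack([X_triage, X_orders60, X_orders120]),
}
aucs = {}
for label, X in windows.items():
    model = LogisticRegression(max_iter=1000).fit(X[: n // 2], y[: n // 2])
    aucs[label] = roc_auc_score(y[n // 2:], model.predict_proba(X[n // 2:])[:, 1])
    print(f"{label}: AUC {aucs[label]:.2f}")
```

As more of the encounter accrues, discrimination improves – the same intuition behind the authors’ improving performance across their three time points.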

Based on 58,179 visits ending in discharge and 22,683 resulting in hospitalization, the specificity of their prediction method was 90%, with a sensitivity of 96%, for an AUC of 0.97. Their work exceeds prior studies mostly on account of improved specificity, compared with the AUCs of a sample of other predictive models generally between 0.85 and 0.89.

Of course, their model is of zero value to other institutions, as it is overfit not only to this subset of data, but also to the specific practice patterns of physicians in their hospital. Their results also conceivably could be improved, as they do not actually take into account any test results – only the presence of the order for such. That said, I think it is reasonable to suggest similar performance from temporal models for predicting admission including these earliest orders and entries in the electronic health record.

For hospitals interested in improving patient flow and anticipating disposition, there may be efficiencies to be developed from this sort of informatics solution.

“Progressive prediction of hospitalisation in the emergency department: uncovering hidden patterns to improve patient flow”

Excitement and Ennui in the ED

It goes without saying some patient encounters are more energizing and rewarding than others.  As a corollary, some chief complaints similarly suck the joy out of the shift even before beginning the patient encounter.

This entertaining study simply looks for any time differential in physician self-assignment on the electronic trackboard between presenting chief complaints.  The general gist is that time-to-self-assignment serves as a surrogate for a composite of prioritization and/or desirability.

These authors looked at 30,382 presentations unrelated to trauma activations, and there were clear winners and losers.  This figure of the shortest and longest 10 complaints is a fairly concise summary of findings:

[Figure: door-to-evaluation times for the 10 shortest and 10 longest chief complaints]

Despite consistently longer self-assignment times for certain complaints, the absolute difference in minutes is still quite small.  Furthermore, there are always issues with relying on these time stamps, particularly for higher-acuity patients; the priority of “being at the patient’s bedside” always trumps such housekeeping measures.  I highly doubt ankle sprains and finger injuries are truly seen more quickly than overdoses and stroke symptoms.

Vaginal bleeding, on the other hand … is deservedly pulling up the rear.

“Cherry Picking Patients: Examining the Interval Between Patient Rooming and Resident Self-assignment”

Informatics Trek III: The Search For Sepsis

Big data!  It’s all the rage with tweens these days.  Hoverboards, Yik Yak, and predictive analytics are all kids talk about now.

This “big data” application, more specifically, involves the use of an institutional database to derive predictors for mortality in sepsis.  Many decision instruments for various sepsis syndromes already exist – CART, MEDS, mREMS, CURB-65, to name a few – but all suffer from the same flaw: how reliable can a rule with just a handful of predictors be when applied to the complex heterogeneity of humanity?

Machine-learning applications of predictive analytics attempt to create, essentially, Decision Instruments 2.0.  Rather than using linear statistical methods to simply weight a small handful of different predictors, most of these applications utilize the entire data set and some form of clustering.  Most generally, these models replace typical variable weighted scoring with, essentially, a weighted neighborhood scheme, in which similarity to other points helps predict outcomes.

Long story short, this study out of Yale utilized 5,278 visits for acute sepsis and a random forest model to create a training set and a validation set.  The random forest model included all available data points from the electronic health record, while other models used up to 20 predictors based on expert input and prior literature.  For their primary outcome of predicting in-hospital death, the AUC for the random forest model was 0.86 (CI 0.82-0.90), while none of the rest of the models exceeded an AUC of 0.76.
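The structure of that comparison – an all-features random forest against a regression limited to a small, expert-chosen predictor set – is easy to sketch. This uses entirely synthetic data with signal spread thinly across many features; nothing here reflects the study’s actual data or variables:

```python
# A hedged, synthetic-data sketch of the comparison described above: a random
# forest given every available feature vs. a regression restricted to a small
# expert-chosen subset. Nothing here uses the study's actual data or features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n, p = 4000, 100                      # many weak EHR-derived features
X = rng.normal(size=(n, p))
beta = rng.normal(scale=0.3, size=p)  # signal spread thinly across all features
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta))))  # 1 = in-hospital death

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
lr = LogisticRegression(max_iter=1000).fit(X_tr[:, :20], y_tr)  # "top 20" only

auc_rf = roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1])
auc_lr = roc_auc_score(y_te, lr.predict_proba(X_te[:, :20])[:, 1])
print(f"Random forest, all {p} features: AUC {auc_rf:.2f}")
print(f"Logistic regression, 20 features: AUC {auc_lr:.2f}")
```

When outcome-relevant signal is scattered across many variables, the handful-of-predictors model leaves most of it on the table – the core argument for these “Decision Instruments 2.0.”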

This is still simply at the technology demonstration phase, and requires further development to become actionable clinical information.  However, I believe models and techniques like this are our next best paradigm in guiding diagnostic and treatment decisions for our heterogeneous patient population.  Many challenges yet remain, particularly in the realm of data quality, but I am excited to see more teams engaged in development of similar tools.

“Prediction of In-hospital Mortality in Emergency Department Patients with Sepsis: A Local Big Data Driven, Machine Learning Approach”

Hi Ur Pt Has AKI For Totes

Do you enjoy receiving pop-up alerts from your electronic health record?  Have you instinctively memorized the fastest series of clicks to “Ignore”?  “Not Clinically Significant”?  “Benefit Outweighs Risk”?  “Not Sepsis”?

How would you like your EHR to call you at home with more of the same?

Acute kidney injury, to be certain, is associated with poorer outcomes in the hospital – mortality, dialysis-dependence, and other morbidities.  Therefore, it makes sense – if an automated monitoring system can easily detect changes and trends, why not alert clinicians to those changes so nephrotoxic therapies can be avoided?

Interestingly – for both good and bad – the outcomes measured were patient-oriented, randomizing 2393 patients to either “usual care” or text message alerts for changes in serum creatinine.  The goal, overall, was detection of reductions in death, dialysis, or progressive AKI.  While patient-oriented outcomes are, after all, the most important outcomes in medicine – it is only possible to improve outcomes if clinicians improve care.  Therefore, measuring the most direct consequence of the intervention might be a better outcome – renal-protective changes in clinician behavior.

Because, unfortunately, despite sending text messages and e-mails directly to responsible clinicians and pharmacists – the only notable change in behavior between the “alert” group and “usual care” group was increased monitoring of serum creatinine.  Chart documentation of AKI, avoidance of intravenous contrast, avoidance of NSAIDs, and other renal-protective behaviors were unchanged, excepting a non-significant trend towards decreased aminoglycoside use.

No change in behavior, no change in outcomes.  Text messages and e-mails alerts!  Can shock collars be far behind?

“Automated, electronic alerts for acute kidney injury: a single-blind, parallel-group, randomised controlled trial”

Replace Us With Computers!

In a preview to the future – who performs better at predicting outcomes, a physician, or a computer?

Unsurprisingly, it’s the computer – and the unfortunate bit is we’re not exactly going up against Watson or the hologram doctor from the U.S.S. Voyager here.

This is Jeff Kline, showing off his rather old, not terribly sophisticated “attribute matching” software.  This software, created back in 2005-ish, is based off a database he created of acute coronary syndrome and pulmonary embolism patients.  He determined a handful of most-predictive variables from this set, and then created a tool that allows physicians to input those specific variables from a newly evaluated patient.  The tool then finds the exact matches in the database and spits back a probability estimate based on the historical reference set.
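The lookup itself is conceptually simple: discretize a handful of predictors, find exact matches in the historical reference set, and report the outcome rate among those matches. A toy sketch with entirely invented variables and data – not Kline’s actual software or variables:

```python
# A toy sketch of "attribute matching" as described: discretize a handful of
# predictors, find exact matches in a historical reference set, and report the
# outcome rate among those matches. All variables and data here are invented.
from collections import defaultdict

# Hypothetical reference set: (age_band, typical_pain, abnormal_ecg) -> outcome
reference = [
    (("<50", False, False), 0),
    (("<50", False, False), 0),
    (("<50", True, False), 0),
    (("50+", True, True), 1),
    (("50+", True, True), 0),
    (("50+", False, False), 0),
]

index = defaultdict(list)
for attributes, outcome in reference:
    index[attributes].append(outcome)

def pretest_probability(attributes):
    """Outcome rate among exact attribute matches in the reference set."""
    matches = index.get(attributes)
    if not matches:
        return None  # no historical patients share these attributes
    return sum(matches) / len(matches)

print(pretest_probability(("50+", True, True)))    # 0.5 in this toy data
print(pretest_probability(("<50", False, False)))  # 0.0
```

Unlike a weighted score, the estimate is anchored entirely to observed outcomes in similar historical patients – which is also its limitation when exact matches are scarce.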

He sells software based on the algorithm and probably would like to see it perform well.  Sadly, it only performs “okay”.  But, it beats physician gestalt, which is probably better ranked as “poor”.  In their prospective evaluation of 840 cases of acute dyspnea or chest pain of uncertain immediate etiology, physicians (mostly attendings, then residents and midlevels) grossly over-estimated the prevalence of ACS and PE.  Physicians had a mean and median pretest estimate for ACS of 17% and 9%, respectively, and the software guessed 4% and 2%.  Actual retail price:  2.7%.  For PE, physicians were at mean 12% and median 6%, with the software at 6% and 5%.  True prevalence: 1.8%.

I don’t choose this article to highlight Kline’s algorithm, nor the comparison between the two.  Mostly, it’s a fascinating observational study of how poor physician estimates are – far over-stating risk.  Certainly, with this foundation, it’s no wonder we’re over-testing folks in nearly every situation.  The future of medicine involves the next generation of similar decision-support instruments – and we will all benefit.

“Clinician Gestalt Estimate of Pretest Probability for Acute Coronary Syndrome and Pulmonary Embolism in Patients With Chest Pain and Dyspnea.”

Death From a Thousand Clicks

The modern physician – one of the most highly-skilled, highly-compensated data-entry technicians in history.

This is a prospective, observational evaluation of physician activity in the Emergency Department, focusing mostly on the time spent in interaction with the electronic health record.  Specifically, they counted mouse clicks during various documentation, order-entry, and other patient care activities.  The observations were conducted for 60-minute time periods, and then extrapolated out to an entire shift, based on multiple observations.

The observations were taken from a mix of residents, attendings, and physician extenders, and offer a lovely glimpse into the burdensome overhead of modern medicine: 28% of time was spent in patient contact, while 44% was spent performing data-entry tasks.  It requires 6 clicks to order an aspirin, 47 clicks to document a physical examination of back pain, and 187 clicks to complete an entire patient encounter for an admitted patient with chest pain.  This extrapolates out, at a pace of 2.5 patients per hour, to ~4000 clicks for a 10-hour shift.

The authors propose a more efficient documentation system would result in increased time available for patient care, increased patients per hour, and increased RVUs per hour.  While the numbers they generate from this sensitivity analysis for productivity increase are essentially fantastical, the underlying concept is valid: the value proposition for these expensive, inefficient electronic health records is based on maximizing reimbursement and charge capture, not by empowering providers to become more productive.

The EHR in use in this study is McKesson Horizon – but, I’m sure these results are generalizable to most EHRs in use today.

“4000 Clicks: a productivity analysis of electronic medical records in a community hospital ED”

New South Wales Dislikes Cerner

The grass is clearly greener on the other side for these folks at Nepean Hospital in New South Wales, AUS.  This study details the before-and-after Emergency Department core measures as they transitioned from the EDIS system to Cerner’s FirstNet.  As they state in their introduction, “Despite limited literature indicating that FirstNet has decreased performance” and “reports of problems with Cerner programs overseas”, FirstNet was foisted upon them – so it’s clear they have an agenda with this publication.

And, a retrospective, observational study is the perfect vehicle for an agenda.  You pick the criteria you want to measure, the most favorable time period, and voilà!  These authors picked a six month pre-intervention period and a six-month post-intervention period.  Triage categories were similar for that six month period.  And then…they present data on a three-month subset.  Indeed, all their descriptive statistics are of only a three-month subset excepting ambulance offload waiting time – for which they have full six month data.  Why choose a study period fraught with missing data?

Then, yes, by every measure they are less efficient at seeing patients with the Cerner product.  The FirstNet system had been in place for six months by the time they report data – but, it’s still not unreasonable to suggest they’re somewhat suffering the growing pains of inexperience.  Then, they also understaffed the ED by 3.2 resident shifts and 3.5 attending shifts per week.  An under-staffed ED for a relatively new implementation of a product with low physician acceptance?

For as little love as I have for Cerner FirstNet, I’m not sure this study gives it a fair shot.

“Effect of an electronic medical record information system on emergency department performance”