Ryan Radecki

When ChatGPT Writes a Research Paper

It is safe to say the honeymoon phase of large language models has started to fade a bit. Yes, they can absolutely pass a medical licensing examination when given carefully constructed prompts. The focus now turns to practical applications – like, in this example, using ChatGPT to write an entire scientific paper for you!

There is no reason to go through the details of the paper, the content, the findings, or any aspect of fruit and vegetable consumption. It is linked only to prove that it exists, and was written in its entirety by an LLM. To create the article, the authors used prompts containing the actual data set, prompts for an introduction, summary tables, and a discussion – impressively, as part of an automated prompting engine written by the authors, not just a laborious manual process. The initial output was not, as you might expect, entirely appropriate, requiring substantial re-prompting and revision – but, in the end, as you may see, the output is basically indistinguishable from a paper written at an undergraduate or graduate student level.

There were, of course, hallucinations, banal unfounded declarations, and the expected simply fabricated references. But, considering that a year or two ago no one would have suggested an LLM could write any semblance of a robust research paper, this is still fairly amazing. Considering this sort of writing is close to peak intellectual accomplishment, it’s fair to say similar automated techniques may replace a great deal of lesser content generation.

“The Impact of Fruit and Vegetable Consumption and Physical Activity on Diabetes Risk among Adults”
https://www.nature.com/articles/d41586-023-02218-z

Anchoring on Bias

The results of this paper are hardly surprising, since the witnessed phenomenon – “anchoring bias” – exists as defined. However, it’s always fun to see it demonstrated objectively.

In this little piece of research, authors collated four years of encounters at Veterans Affairs emergency departments in the U.S. and parsed out the triage reason between “congestive heart failure” versus all others. These two groups were then compared regarding the rates of objective testing for pulmonary embolism, frequency of ordering B-type natriuretic peptide, and both initial and 30-day diagnoses of pulmonary embolism.

As the title suggests, the authors identify differences in testing associated with the recorded reason for visit – with less frequent testing for PE, increased confirmatory testing for CHF, and fewer diagnoses of PE at the initial visit. However, the 30-day rate of diagnosis for PE was the same between the two groups – 1.2% in those initially presenting for reason of CHF, and 1.1% for all others.

The implication suggested by these authors is that the similar frequency of PE at 30 days represents a delayed or missed initial diagnosis, with the culprit being an element of cueing from the patient triage reason or other elements of medical history. This is obviously not a study design with the ability to conclusively demonstrate such a causative effect; a prospective design randomizing patients with an initial “CHF” reason for visit to an alternative such as “shortness of breath” would tease out this effect. That said, this likely still represents an undercurrent of anchoring bias.

“Evidence for Anchoring Bias During Physician Decision-Making”
https://jamanetwork.com/journals/jamainternalmedicine/article-abstract/2806464

Fall Recap

It is the long, cold dark here in Christchurch – improved dramatically by leaving for the U.S. for four weeks!

Firstly, the blog may be making a bit of a comeback – the ugly demise of Twitter seems to necessitate a better method of knowledge translation, such as blog posts that can be replicated across whichever platform is progressing towards dominance.

Next, of course, the Annals of Emergency Medicine Podcast continues apace. We’ve had two excellent co-hosts these past months whose backgrounds are far more diverse than our own, and we will be continuing to feature additional guests in coming months.

What have I been putting into ACEPNow?

Lastly, the Annals of Emergency Medicine Journal Club features important articles from outside the Emergency Medicine literature:

Summer Recap

Down here, summer has ended – although, you wouldn’t know it from the 26C weather we’re having outside today.

But, this means it’s been a few months since I’ve linked to my various #FOAMed resources around the web.

First, and not least, the Annals of Emergency Medicine Podcast, the Ryan and Rory Show, recapping the articles from each month’s issue, available for free on your choice of streaming platforms:

Then, there’s always something to learn from ACEP Now!

Finally, not every article relevant to Emergency Medicine lands in an EM journal – hence the Annals of Emergency Medicine Journal Club. Here are a few of the highlights from around the remaining published literature we’ve looked at recently:

Finally, a Twitter thread with slides illustrating some of the top articles of 2022:

20 Papers in 20 Minutes from ACEM Spring Symposium

Enjoy!

Winter Recap

Spring is here down in this nuclear-free hemisphere. This blog is still effectively in stasis – but the productivity continues elsewhere!

Don’t forget the Annals of Emergency Medicine Podcast, a lighthearted feel-good romantic comedy with Rory Spiegel, available for free on your choice of streaming platforms:

Bimonthly #FOAMed in ACEPNow:

And, lastly, everyone’s favorite part of their residency curriculum, the Annals of Emergency Medicine Journal Club – in which we update folks from the wider world of medical literature in concise summaries for emergency medicine practice:

Enjoy!

It’s Not OK To Let 25% of tPA Cases Be Stroke Mimics

With all the various competing interests for time, it’s rare to find an article of sufficient note to warrant its own blog post. A notable publication might get a short tweet thread. Collections of other literature find their way into ACEPNow articles or the odd Annals of Emergency Medicine Journal Club. But, every once in a while, there’s something … else.

This article pertains to the practice of telestroke administration of thrombolysis for acute ischemic stroke. In major hospital centers, there may be in-house neurology hospitalists or stroke and vascular specialists, and the expertise for management of stroke is readily at the bedside. In many community, regional, and rural hospitals, these resources are unavailable – except by telestroke evaluation. These common arrangements allow access to neurology expertise, followed potentially by interhospital transfer.

In this article, the authors review a series of 270 patients receiving intravenous thrombolysis following evaluation via telestroke. Most patients underwent MRI with DWI following transfer to the hub stroke center, while a handful did not – probably those with serious complications arising from stroke, and those with obvious stroke mimic etiologies. Patients otherwise were categorized as a stroke if a lesion was found on MRI with DWI, but could be deemed a TIA or a stroke mimic if no lesion was seen.

Not so astonishingly, they report 23.7% of their series are stroke mimics. Another ~5% are TIA, another diagnosis for which there is no indication for thrombolysis. While this much collateral damage might horrify some, this sort of blanket use of thrombolytics is routine in the United States, if not encouraged. The proof of such encouragement is evident in these authors’ Discussion section, with this interpretation of recent guidelines:

In fact, the most recent AHA guidelines in 2019 recognise this and specifically recommend thrombolysis to SM given the low rate of sICH and state that starting IVtPA is preferred over delaying treatment to pursue additional diagnostic studies.

Naturally, the authors go on to propose a threshold of reasonable practice for which their performance fits comfortably within:

In our academic tertiary referral telestroke programme, 23.7% of patients administered thrombolysis had a final diagnosis of SM. We suggest that a reasonable SM thrombolysis rate for telestroke programme should be one in four, similar to the accepted negative appendectomy rate, as that the risk of overtreatment should be accepted over the risk of undertreatment.

This is, of course, nonsensical. Leaving aside their entirely specious comparison to an acceptable negative appendectomy rate, let us ruminate seriously on the response to a poorly performing process being to normalize the poor performance. The authors rightfully cite Jeff Saver’s general musings that, given the advancing state of the specialty, the acceptable stroke mimic rate ought to be around 3%. They then justify their absurdly higher total by noting a small portion – about 7-10% – of eligible strokes are missed for treatment, and it is rather the better practice to simply treat any potential stroke in order not to miss a single one.

Again, this perspective hinges primarily on the concept that treating stroke mimics with thrombolysis is “harmless”, owing to a rate of sICH of merely ~0.5-1%. While this is still an unacceptable perspective towards inducing sICH in an otherwise unsuspecting patient, the other harms of thrombolysis in stroke mimics include:

Diagnostic inertia, in which evaluation and treatment for the true cause of neurologic dysfunction is delayed.
Permanent misdiagnosis, in which a patient is treated with thrombolysis, improves, and is labelled an “aborted stroke”. They now carry the diagnosis of prior stroke, making it potentially more difficult to obtain health insurance, not to mention likely being prescribed unnecessary medications for secondary prevention of stroke.
Financial harms from being treated with thrombolysis, which typically requires extended monitoring in a critical care or stroke unit, far exceeding the costs associated with a non-stroke hospitalization.

In short, this is a grossly unacceptable perspective endorsing, frankly, reckless use of thrombolysis. These authors should reconsider the primary literature they are citing as justification and the framing of their argument, and retract their call to normalize these poorly performing clinical systems.

“Thrombolysis of stroke mimics via telestroke”
https://svn.bmj.com/content/7/3/267

April Update

Just a quick update to the blog to collate various items from around the web.

The Annals of Emergency Medicine monthly podcast is updated through February 2022, freely available from your choice of services:

Likewise, the Annals of Emergency Medicine Journal Club is freely available:

Finally, a couple more pieces from ACEPNow, highlighting recent scientific developments and my experience in a universal healthcare system:

2021 Wrap-Up

A few items to collate from the last several months’ efforts.

The Annals of Emergency Medicine Podcast continues apace, with free monthly updates from the original research published in the journal:

Likewise, the Annals of Emergency Medicine Journal Club has published several monthly installments:

Two more pieces in ACEPNow:

And, finally, from a talk I gave our ACEM trainees – the list of included articles, highlighting some of the most interesting articles published in 2021:

The Use of Tranexamic Acid to Reduce the Need for Nasal Packing in Epistaxis (NoPAC): Randomized Controlled Trial
No advantage to routine use of topical TXA for epistaxis.
https://doi.org/10.1016/j.annemergmed.2020.12.013

Ultra-early tranexamic acid after subarachnoid haemorrhage (ULTRA): a randomised controlled trial
No advantage to routine use of IV TXA for aneurysmal SAH.
https://doi.org/10.1016/S0140-6736(20)32518-6

Effect of Endovascular Treatment Alone vs Intravenous Alteplase Plus Endovascular Treatment on Functional Independence in Patients With Acute Ischemic Stroke
Stopped early due to poor outcomes in patients receiving alteplase prior to endovascular therapy.
https://jamanetwork.com/journals/jama/fullarticle/10.1001/jama.2020.23523

A Randomized Trial of Intravenous Alteplase before Endovascular Treatment for Stroke
Heterogeneous outcomes showing a small advantage, primarily recanalization, in patients receiving alteplase prior to endovascular therapy.
https://doi.org/10.1056/NEJMoa2107727

Effect of Mechanical Thrombectomy Without vs With Intravenous Thrombolysis on Functional Outcome Among Patients With Acute Ischemic Stroke
No reliable differences between patients regardless of therapy.
https://doi.org/10.1001/jama.2020.23522

Prospective, Multicenter, Controlled Trial of Mobile Stroke Units
A “mobile stroke unit” administered tPA more rapidly, demonstrating an association with improved outcomes – the entire effect size made up of “Stroke reversed by tPA”.
https://doi.org/10.1056/NEJMoa2103879

Effect of Intravenous Fluid Treatment With a Balanced Solution vs 0.9% Saline Solution on Mortality in Critically Ill Patients
No patient-oriented difference in outcomes regardless of fluid choice, although resuscitation volumes were not excessive.
https://doi.org/10.1001/jama.2021.11684

Short-Course Antimicrobial Therapy for Pediatric Community-Acquired Pneumonia
5 days of high-dose amoxicillin was no different than 10 days of high-dose amoxicillin.
https://doi.org/10.1001/jamapediatrics.2020.6735

Effect of Amoxicillin Dose and Treatment Duration on the Need for Antibiotic Re-treatment in Children With Community-Acquired Pneumonia
No difference between 3 days vs. 7 days, nor between high-dose or low-dose amoxicillin.
https://doi.org/10.1001/jama.2021.17843

Delayed Antibiotic Prescription for Children With Respiratory Infections: A Randomized Trial
“Delayed” antibiotic prescribing was a safe strategy for reducing inappropriate antibiotic treatment – but so was “no” antibiotics.
https://doi.org/10.1542/peds.2020-1323

Effect of Oral Moxifloxacin vs Intravenous Ertapenem Plus Oral Levofloxacin for Treatment of Uncomplicated Acute Appendicitis
Outcomes in patients with appendicitis managed with antibiotics were similar regardless of whether patients began with oral antibiotics or started with intravenous and then transitioned to oral.
https://doi.org/10.1001/jama.2020.23525

Antibiotics versus Appendectomy for Acute Appendicitis — Longer-Term Outcomes
Within 90 days, 29% of patients managed with antibiotics underwent appendectomy. At 1 year, 46%; at 2 years, 46%; at 3 and 4 years, 49%.
https://doi.org/10.1056/NEJMc2116018

Effect of Use of a Bougie vs Endotracheal Tube With Stylet on Successful Intubation on the First Attempt Among Critically Ill Patients Undergoing Tracheal Intubation
First-pass intubation success was ~83% for trainees using video laryngoscopy, regardless of using bougie or stylet for ET tube.
https://doi.org/10.1001/jama.2021.22002

Effect of Moderate vs Mild Therapeutic Hypothermia on Mortality and Neurologic Outcomes in Comatose Survivors of Out-of-Hospital Cardiac Arrest
31°C was no better than 34°C for improving neurologic outcomes following OHCA.
https://doi.org/10.1001/jama.2021.15703

Hypothermia versus Normothermia after Out-of-Hospital Cardiac Arrest
Hypothermia, under the conditions typically implemented in major centers, did not improve neurologic outcomes following OHCA.
https://doi.org/10.1056/NEJMoa2100591

Angiography after Out-of-Hospital Cardiac Arrest without ST-Segment Elevation
An RCT showing no advantage to routine immediate angiography in non-STEMI OHCA.
https://doi.org/10.1056/NEJMoa2101909

Pathway with single-dose long-acting intravenous antibiotic reduces emergency department hospitalizations of patients with skin infections
A sponsor encouraging discharge of patients with SSTI results in discharge of patients with SSTI.
https://doi.org/10.1111/acem.14258

Self-obtained vaginal swabs are not inferior to provider-performed endocervical sampling for emergency department diagnosis of Neisseria gonorrhoeae and Chlamydia trachomatis
A woman can self-swab for STI every bit as effectively as a clinician performing a pelvic examination.
https://doi.org/10.1111/acem.14213

Invasive Bacterial Infections in Afebrile Infants Diagnosed With Acute Otitis Media
Afebrile infants ≤ 90 days diagnosed with AOM do not seem to be at risk for IBI.
https://doi.org/10.1542/peds.2020-1571

Effect of Vasopressin and Methylprednisolone vs Placebo on Return of Spontaneous Circulation in Patients With In-Hospital Cardiac Arrest
An IHCA protocol incorporating vasopressin and methylprednisolone improved immediate outcomes, but not hospital discharge.
https://doi.org/10.1001/jama.2021.16628

Risk for Recurrent Venous Thromboembolism in Patients With Subsegmental Pulmonary Embolism Managed Without Anticoagulation
Non-trivial rates of recurrent VTE, particularly in the elderly and those with multiple SSPE, mean anticoagulation is likely indicated.
https://doi.org/10.7326/M21-2981

Effect of a Diagnostic Strategy Using an Elevated and Age-Adjusted D-Dimer Threshold on Thromboembolic Events in Emergency Department Patients With Suspected Pulmonary Embolism
Another successful example of adjusting D-dimer thresholds, this time combining pretest likelihood and age.
https://doi.org/10.1001/jama.2021.20750

Outpatient Management of Patients Following Diagnosis of Acute Pulmonary Embolism
Of the few low-risk patients with PE managed as outpatients in the U.S., the subsequent hospitalization rate was around 10%.
https://doi.org/10.1111/acem.14181

Rapid Administration of Methoxyflurane to Patients in the Emergency Department (RAMPED) Study: A Randomized Controlled Trial of Methoxyflurane Versus Standard Care
More patients treated with methoxyflurane had reductions in pain, but more patients in the methoxyflurane arm received oral and/or parenteral opioids.
https://doi.org/10.1111/acem.14144

Repeat head computed tomography for anticoagulated patients with an initial negative scan is not cost-effective
Only 1% of patients on anticoagulation with an initial negative head CT developed subsequent ICH, none of whom developed symptoms or required intervention.
https://doi.org/10.1016/j.surg.2021.02.024

Risk of Traumatic Brain Injuries in Infants Younger than 3 Months With Minor Blunt Head Trauma
2+% of infants aged less than 3 months meeting PECARN low-risk criteria still had ICH, although only 1 – 0.2% – was clinically important.
https://doi.org/10.1016/j.annemergmed.2021.04.015

Impact of oral corticosteroids on respiratory outcomes in acute preschool wheeze: a randomised clinical trial
Prednisolone hastened improvement in wheezing and reduced hospital admission, while symptoms were equivalent by 24 hours, regardless.
https://doi.org/10.1136/archdischild-2020-318971

Association of Intravenous Radiocontrast With Kidney Function
An interesting analysis centered around the dichotomous D-dimer cut-off for CTPA found no association of contrast exposure with follow-up eGFR.
https://doi.org/10.1001/jamainternmed.2021.0916

Maximizing the Morning Commute: A Randomized Trial Assessing the Effect of Driving on Podcast Knowledge Acquisition and Retention
Similar knowledge retention resulted from podcast listening whether attention was focused or during driving.
https://doi.org/10.1016/j.annemergmed.2021.02.030

Intermediate-Value CTCA?

Pervasive use of CT coronary angiography has been an unnecessary feature of the evaluation of patients with low-risk chest pain for the better part of a decade now. The argument behind its use – a normal examination confers a durable protective effect – is obviously nonsensical, as this bestows agency upon the test itself. Obviously, in a low-risk population with rare adverse outcomes, there can be no reasonable expectation of value in testing.

The sensible idea, then, is to use CTCA in those patients at intermediate risk. In this trial, the stratification used was the GRACE score; the 1,748 participants had a mean age of 62 years and a mean GRACE score of 115 (SD ± 35). Patients were eligible by symptoms of an acute coronary syndrome, supported by ECG changes, an elevated troponin, or a history of ischemic heart disease. Patients were then randomized to receive CTCA in the ED or “standard of care only”. The primary outcome was, naturally, the glorious typical cardiology trial outcome of death or non-fatal myocardial infarction at one year.

Over half of patients included demonstrated troponin levels exceeding the 99th percentile, nearly two-thirds had an abnormal ECG, and a third had known coronary artery disease. Approximately a quarter had previously undergone angiography, with a number also receiving PCI. The vast majority presented with chest pain as their initial complaint.

Most patients randomized to CTCA underwent CTCA; a small number of those randomized to standard care also underwent CTCA within 30 days. About a quarter of patients in this cohort demonstrated normal coronary arteries – a fairly surprising development considering the combination of age, risk factors, elevated troponin, and abnormal electrocardiogram necessary for inclusion. Most patients with normal coronary arteries were predictably managed by medical means alone. The remaining patients demonstrated either non-obstructive coronary disease or obstructive coronary artery disease, with concordant trends towards subsequent invasive coronary angiography.

However, after all of that, even with the added information provided by CTCA, there was no difference in mortality or non-fatal myocardial infarction at one year. Delving into the complexities of subsequent resource utilization, it was noted patients undergoing CTCA were less likely to ultimately undergo invasive coronary angiography, 54.0% vs. 60.8%. Similarly, patients with the initial CTCA were less likely to undergo subsequent non-invasive testing, 19.4% vs. 26.2%. Medical and preventive management otherwise did not differ by study arm.

So, a small decrease in invasive testing counterbalanced by the large baseline investment in non-invasive testing – without any clear patient-oriented benefit on health outcomes. CTCA certainly has a role in the evaluation of patients with chest pain and possible CAD, but certainly not as a routine investigation in the ED.
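For a rough sense of scale, the trade-off can be put into numbers. The sketch below uses only the two invasive angiography percentages quoted above; the function names are made up for illustration, and treating the reciprocal of the absolute difference as a "number needed to scan" is a simplifying assumption rather than an analysis reported by the trial.

```python
# Back-of-envelope arithmetic on the invasive angiography rates quoted above.
# Function names are hypothetical; the inputs are the percentages from the text.

def absolute_difference(control_pct: float, intervention_pct: float) -> float:
    """Absolute difference between two percentages, returned as a proportion."""
    return (control_pct - intervention_pct) / 100.0


def number_needed(abs_diff: float) -> float:
    """Reciprocal of an absolute difference: patients tested per event avoided."""
    return 1.0 / abs_diff


# Invasive coronary angiography: 60.8% with standard care vs. 54.0% with CTCA
ard_invasive = absolute_difference(60.8, 54.0)
scans_per_angiogram_avoided = number_needed(ard_invasive)

print(f"Absolute reduction in invasive angiography: {ard_invasive:.1%}")
print(f"Early CTCAs per invasive angiogram avoided: {scans_per_angiogram_avoided:.1f}")
```

On these figures, roughly fifteen early CTCAs are performed for each invasive angiogram avoided, with no accompanying difference in death or myocardial infarction.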

“Early computed tomography coronary angiography in patients with suspected acute coronary syndrome: randomised controlled trial”
https://www.bmj.com/content/374/bmj.n2106

Why Isn’t tPA in Minor Stroke Questioned?

A couple months back, this little report – MaRISS – was published with minimal fanfare in Stroke. Considering the effort necessary to fund and conduct a prospective study, it’s rather remarkable these data are so uninformative.

The stated purpose of this study:

“The objective of this study is to describe multidimensional outcomes, identify predictors of worse outcomes, and explore the effect of thrombolysis in this population.”

Reading between the lines – and considering the study and virtually every author here are sponsored by Genentech – the hoped-for outcome was likely some observational support for the pervasive practice of treating mild stroke with alteplase. Considering all the bias of their study design, it’s actually rather surprising they were unable to do so.

To be included in MaRISS, patients with mild stroke were approached after initial treatment, within 24 hours of hospital admission. However, it is grossly obvious the vast majority of patients meeting eligibility criteria were not even approached. Their “CONSORT diagram” doesn’t actually describe their study population prior to the “consented” step of the process – meaning it only describes those patients dropping out or excluded subsequent to consent. How many patients with mild stroke were admitted to participating hospitals during the study period? How many patients were approached, but declined participation? This information is conspicuously and irresponsibly absent.

The resulting convenience sample, then, ultimately reflects the selection biases of those enrolling. For example, out of 1,765 patients included, only 3 (0.3%) developed symptomatic intracranial hemorrhage. This clearly indicates these data are flawed, as the PRISMS trial demonstrated a 3.3% rate of sICH, and even the Get With the Guidelines-Stroke registry of minor stroke shows a 1.8% rate of sICH. The authors provide the understated: “it is possible that individuals with early complication from thrombolytic treatment were not enrolled.”

Sometimes, possibilities are near certainties – and this is one of those cases.
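To illustrate just how near a certainty, here is a minimal sketch of the binomial arithmetic. It assumes, and this is my assumption rather than something reported in the paper, that the three sICH events occurred among the treated patients, estimated as 57% of the 1,765 enrolled, and it asks how likely three or fewer events would be if the true rate matched the registry or PRISMS figures quoted above.

```python
# How plausible are 3 or fewer sICH events if the true rate matched prior data?
# ASSUMPTION: the denominator is the treated subset, estimated as 57% of 1,765.
from math import comb


def binomial_cdf(k_max: int, n: int, p: float) -> float:
    """P(X <= k_max) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1))


n_enrolled = 1765
treated_fraction = 0.57  # treated share of the final study population
n_treated = round(n_enrolled * treated_fraction)  # hypothetical denominator
observed_sich = 3

# Comparison rates quoted in the text
p_if_registry_rate = binomial_cdf(observed_sich, n_treated, 0.018)  # GWTG-Stroke
p_if_prisms_rate = binomial_cdf(observed_sich, n_treated, 0.033)    # PRISMS

print(f"Assumed treated patients: {n_treated}")
print(f"P(<=3 sICH | true rate 1.8%): {p_if_registry_rate:.2e}")
print(f"P(<=3 sICH | true rate 3.3%): {p_if_prisms_rate:.2e}")
```

Under either comparison rate, the chance of seeing so few events in an unselected sample is far below one in a thousand, which is the point: the sample was almost certainly not unselected.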

Regardless, the authors then attempt to discern a beneficial effect of alteplase by comparing their treated (57%) and untreated (43%) final study population. Again, the bias of these authors is quite clear because they create eight different adjustment models and use mRS, Barthel Index, European Quality of Life 5 Dimensions, a Visual Analogue Scale version of stroke assessment, and the Stroke Impact Scale to create an 8 x 5 grid of tests for alteplase to display its superiority. In only one of these boxes was their model able to shake out a benefit for alteplase – and, of course, this chance finding gets escalated into the abstract with “a suggestion of efficacy was noted in the NIHSS 3–5 subgroup.” Nor was any effect on outcomes from time-to-treatment with alteplase identified.
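The arithmetic behind calling this a chance finding is worth a quick sketch. The example below assumes a conventional significance threshold of 0.05 and treats the 40 cells of the grid as independent tests; both are simplifying assumptions of mine, since the models and scales overlap heavily and the paper's own thresholds are not restated here.

```python
# Chance of at least one "significant" cell in an 8 x 5 grid of comparisons.
# ASSUMPTIONS: alpha = 0.05 per test, and all tests independent (they are not,
# so this is an illustration of the problem rather than an exact figure).

def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of one or more false positives across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_models = 8   # adjustment models
n_scales = 5   # outcome scales
n_tests = n_models * n_scales
alpha = 0.05

fwer = familywise_error_rate(n_tests, alpha)
expected_false_positives = n_tests * alpha

print(f"Tests performed: {n_tests}")
print(f"Chance of at least one false positive: {fwer:.1%}")
print(f"Expected false positives under the null: {expected_false_positives:.1f}")
```

With no true effect at all, a grid this size would be expected to throw up a couple of nominally positive boxes; finding just one is, if anything, underwhelming.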

So, an observational trial unable either to obtain a representative sample or to describe a hoped-for treatment effect. What little remains is a page and a half of mostly previously-described associations of clinical features with poor functional outcomes, fractionally moving the science forward. If anything, these data ought to enhance calls for better prospective clinical trials versus placebo in minor stroke – if everyone weren’t already entrenched in their clinical opinions.

“Predictors of Outcomes in Patients With Mild Ischemic Stroke Symptoms”
https://www.ahajournals.org/doi/10.1161/STROKEAHA.120.032809