Can AI Reduce Malpractice Risk?

Findings from an Emergency Medicine Case Study

Emergency medicine is one of the most demanding specialties within healthcare, offering immense rewards but also carrying significant risk. Over the past five years, malpractice claims against emergency medicine physicians have risen by more than one-third, while other specialties have experienced a reduction in claims. According to a study available through the National Institutes of Health library, approximately 75% of emergency physicians will face a malpractice suit during their careers.

Allegations against emergency medicine physicians typically relate to diagnosis. The complexity and uncertainty inherent in this field can lead to errors in clinical judgment, premature closure of the differential diagnosis and inappropriate anchoring on incorrect diagnoses.

To address these diagnostic challenges, MagMutual conducted an informal study to explore whether ChatGPT — an artificial intelligence (AI) tool — could enhance diagnostic accuracy and reduce malpractice risk for emergency physicians. Despite the study's limitations, the findings suggest that AI could be an effective tool for improving diagnostic outcomes and decreasing malpractice claims.

Evaluating AI’s Effectiveness

ChatGPT is a natural-language processing system designed to understand and generate human-like text and respond to a wide variety of prompts and questions. While ChatGPT represents a significant improvement over older AI applications, it has certain drawbacks, including:

  • Inability to learn from interactions
  • Inability to understand and respond to emotions
  • Potentially biased or inappropriate outputs

Despite these limitations, the widespread availability and use of ChatGPT prompted the MagMutual team to test its effectiveness in the context of emergency medicine claims. Using MagMutual's national claims database, the team applied a de-identified retrospective cohort design to analyze closed malpractice claims. The criteria for selecting claims were:

  • Closed between 2020 and 2023
  • Allegation of “delay” or “failure to diagnose”

Claims involving non-emergency physicians or other allegations were excluded. For each selected claim, two key data points were entered into ChatGPT:

  • Emergency Medical Services (EMS) notes, nursing triage notes, vital signs, and the history and physical exam of the defendant provider
  • Diagnostic studies recorded in the encounter

For the first data point, ChatGPT was prompted to provide a preliminary differential diagnosis, a most likely diagnosis and recommended diagnostic studies. After receiving the second data point, ChatGPT was asked to determine the remaining differential considerations, a most likely diagnosis and a recommended disposition (admission or discharge).
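
The study itself used ChatGPT interactively, so the following is only a minimal sketch of how this two-stage prompting could be reproduced programmatically with the OpenAI Python SDK. The model name, prompt wording, placeholder case text and the ask() helper are illustrative assumptions, not details taken from the study.

    # Minimal sketch of the two-stage prompting described above.
    # Assumptions: OpenAI Python SDK; model name, prompt wording and
    # placeholder inputs are illustrative only.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Keep one running conversation so the second prompt builds on the first.
    history = [{"role": "system",
                "content": "You are assisting with an emergency medicine case review."}]

    def ask(prompt: str) -> str:
        """Send a prompt, append the reply to the conversation and return it."""
        history.append({"role": "user", "content": prompt})
        response = client.chat.completions.create(model="gpt-4o", messages=history)
        reply = response.choices[0].message.content
        history.append({"role": "assistant", "content": reply})
        return reply

    # Hypothetical, de-identified case text stands in for the real records.
    first_data_point = "<EMS notes, nursing triage notes, vital signs, history and physical exam>"
    second_data_point = "<diagnostic studies recorded in the encounter>"

    # Stage 1: preliminary differential, most likely diagnosis, recommended studies.
    stage_one = ask(
        "Based on the following information, provide a preliminary differential "
        "diagnosis, the most likely diagnosis and recommended diagnostic studies:\n"
        + first_data_point
    )

    # Stage 2: remaining differential, most likely diagnosis, recommended disposition.
    stage_two = ask(
        "Given these diagnostic study results, list the remaining differential "
        "considerations, the most likely diagnosis and a recommended disposition "
        "(admission or discharge):\n" + second_data_point
    )

Passing both exchanges through a single running conversation reflects the sequencing described above, in which the disposition question was asked only after ChatGPT had already reviewed the first data point.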

ChatGPT’s responses were compared with the actions of the emergency physicians and the plaintiff outcomes to assess whether AI could have helped avoid diagnostic errors. The standard of care (SOC) demonstrated by the emergency physicians was also evaluated as an independent variable in judging ChatGPT’s effectiveness in preventing claims.*

Findings & Results

  • The findings of this informal study revealed that ChatGPT may have prevented slightly more than half (52%) of the claims examined.
  • In cases where it was effective, ChatGPT suggested appropriate diagnostic studies and dispositions.
  • In the remaining 48% of cases, ChatGPT's suggestions likely would not have prevented the claims due to factors such as rare or complex diagnoses, incomplete information or critical misses during physical exams.
  • AI shows promise in augmenting diagnostic skills and potentially reducing the risk of misdiagnosis in emergency medicine, particularly in uncertain or challenging cases.
  • AI could aid physicians in documenting their clinical reasoning and decision-making processes, potentially improving communication and the defensibility of any claim that could arise.

Limitations

AI cannot and should not replace the clinical judgment or responsibilities of physicians. The study noted instances in which AI’s suggestions would likely have led to errors, such as recommending unnecessary diagnostic studies for conditions atypical for the patient's age group. Even AI systems designed to enhance diagnostics can suffer from limitations such as low specificity and high false-positive rates.

This study has clear limitations: a small sample size, retrospective design, single-observer interpretation, and the variability of AI responses. ChatGPT was not designed for medical diagnosis and comes with its own inherent flaws and biases. Further research and validation of ChatGPT recommendations are indicated before implementation in clinical settings.

Ultimately, emergency physicians and other medical professionals should use AI tools with great caution and discretion. AI can be a valuable support tool, but it should complement, not replace, the expertise and judgment of skilled medical professionals.

MagMutual’s Learning Center offers many additional resources concerning the business, practice and regulation of medicine.

Disclaimer: The information provided by ChatGPT is intended for general informational purposes only. While efforts are made to ensure the accuracy and reliability of the information presented, neither ChatGPT nor MagMutual can guarantee its completeness, suitability or validity for any particular purpose. Users are advised to verify the information obtained from ChatGPT with other credible sources and to exercise their own judgment when applicable.

*Standard of Care rankings are determined by an independent analysis from peer physicians on the MagMutual Medical Faculty panel.

