The dangers of healthcare generative AI 'drift'

IT leaders are embracing generative AI in healthcare but also expressing concerns that the technology can "drift."

How well GPT-4, the large language model that powers ChatGPT, answers healthcare questions can change over time, a phenomenon known as "drift," according to a study by researchers at Somerville, Mass.-based Mass General Brigham. Their work was published Aug. 8 in NEJM AI.

"Generative AI performed relatively well, but more improvement is needed for most use cases," said corresponding author Sandy Aronson, executive director of IT and AI solutions at Mass General Brigham Personalized Medicine, in an Aug. 13 statement. "However, as we ran our tests repeatedly, we observed a phenomenon we deemed important: running the same test dataset repeatedly produced different results."

Mr. Aronson and his fellow researchers were analyzing whether the technology could scan scientific articles to help geneticists assess genetic variants. The amount of variability in the results could itself differ from day to day, so the authors say the AI's performance needs to be continuously monitored.

Copyright © 2024 Becker's Healthcare. All Rights Reserved.

 
