While ChatGPT has shown promise in selecting imaging tests and identifying diagnoses, it has proved decidedly less successful at recommending cancer treatments.
That's according to a new study from Boston-based Brigham and Women's Hospital researchers, who found that in a third of cases GPT-3.5's recommendations went against National Comprehensive Cancer Network treatment guidelines. The research was published Aug. 24 in JAMA Oncology.
"ChatGPT responses can sound a lot like a human and can be quite convincing. But when it comes to clinical decision-making, there are so many subtleties for every patient's unique situation," said corresponding author Danielle Bitterman, MD, a radiation oncologist at Brigham and Women's and the Artificial Intelligence in Medicine Program at Somerville, Mass.-based Mass General Brigham, in an Aug. 24 news release. "A right answer can be very nuanced, and not necessarily something ChatGPT or another large language model can provide."
The researchers asked ChatGPT to make treatment recommendations for breast, lung and prostate cancer, according to the study. While the AI chatbot provided at least one NCCN-backed recommendation in 98 percent of the cases, 34 percent of the responses did not align with the guidelines. ChatGPT "hallucinated," or made up treatments, 12.5 percent of the time.
Mass General Brigham has been at the forefront of ChatGPT research, having also conducted the earlier studies on imaging and diagnosis. GPT-3.5 powers the free version of ChatGPT; the newer GPT-4 model is available through a paid subscription.