An artificial intelligence chatbot that generates humanlike responses passed all three parts of the U.S. Medical Licensing Exam, according to findings published on the preprint server medRxiv.
Researchers evaluated the performance of ChatGPT, a model launched by OpenAI in November 2022, on the exam. Second-year medical students typically spend 300 to 400 hours preparing for Part 1 of the comprehensive exam, which covers didactic and problem-based learning, including basic science, pharmacology and pathophysiology. The final part is completed by post-graduate students.
Researchers found ChatGPT "performed at or near the passing threshold for all three exams without any specialized training or reinforcement," according to the findings published Dec. 21. The USMLE pass threshold varies by year but is approximately 60 percent, study authors noted. ChatGPT scored above 50 percent accuracy across all examinations and exceeded 60 percent in most analyses.
"These results suggest that large language models may have the potential to assist with medical education, and potentially, clinical decision-making," the researchers said. The study included a number of limitations and is awaiting peer review.