American and Chinese datasets overrepresented in clinical AI, study says

Most datasets used in clinical artificial intelligence papers come from the U.S. and China, raising concerns about the representativeness and diversity of the data and, in turn, the accuracy of AI, according to a study published March 31 in PLOS Digital Health.

Because AI works best when trained on large datasets relevant to the clinical patient population, the international group of researchers wanted to understand which regions have the richest and poorest datasets. The group completed a literature review of hundreds of papers on clinical uses of artificial intelligence, recording each paper's data sources and its authors' nationality, gender and specialty.

They found that the U.S. and China together account for the majority of data sources for such papers, representing 40.8 percent and 13.7 percent of papers, respectively. The U.K. and Germany followed at 6.7 percent and 5.7 percent. The authors of such papers are predominantly male: men represent 74.1 percent of authors overall, and more than half were data experts.

The skew toward developed nations, male authors and non-clinical specialists worried the researchers, who wrote that "this bias could worsen minority marginalization and widen the chasm of healthcare inequality."

Copyright © 2024 Becker's Healthcare. All Rights Reserved.

 
