Volume IV, Number 1 | Spring 2025

Artificial Intelligence vs. Physician Expertise in Appendicular Skeleton Fracture Detection: A Scoping Review

Christoforides E, Mulroy D, Oswald S, Rust B, Jacobs R, Muralidhar R
Nova Southeastern University, Fort Lauderdale, FL, USA

Introduction
Fractures, arising from diverse medical conditions, occur as closed or open types, with limb fractures mainly due to trauma. Artificial Intelligence (AI), employing machine learning and convolutional neural networks, significantly aids fracture detection in radiology, improving accuracy compared to assessments by non-experts or resident physicians. This scoping review evaluated AI’s effectiveness in fracture detection versus physician diagnosis, focusing on appendicular skeletal regions, including its role in enhancing clinical care and minimizing diagnostic disparities in trauma cases.

Methods
The researchers systematically and rigorously selected original, peer-reviewed English articles from U.S.-based research between January 1, 2018, and December 31, 2022. Inclusion criteria encompassed AI modalities, radiographic images of pediatric or adult fractures, and involvement of orthopedic surgeons and radiologists for diagnostic comparison. Exclusion criteria removed background articles, studies solely predicting AI efficacy, and those not comparing physician and AI fracture detection. The analysis focused on the appendicular skeleton, refining samples to ensure direct statistical comparison. Search efforts yielded 754 studies, of which 36 were selected for the final review using the PRISMA method. The research question, formulated through the Population, Concept, and Context strategy, explored AI’s diagnostic accuracy compared to physicians. Critical appraisal utilizing Joanna Briggs Institute tools ensured the inclusion of articles with low risk of bias.

Results
Seven articles highlighted superior fracture detection by AI, ten reported performance comparable to physicians, and one suggested physicians might outperform AI. Additionally, eleven studies found that using a deep learning program as an assistant improved physicians’ fracture detection abilities, and eight employed a unique comparison method. Expertise and diagnostic capabilities varied among physician specialties. Expertise, years of training, and exposure significantly influenced diagnostic outcomes, with experienced orthopedic specialists consistently achieving the highest area under the curve (AUC) values. Radiologists, regardless of experience, exhibited consistent proficiency. Multidisciplinary collaboration offers the best diagnostic outcomes, optimizing fracture detection and enhancing patient care.

Conclusions
The articles demonstrate AI’s superiority over inexperienced physicians and its comparable performance to specialists in fracture diagnosis. Implementing AI in medical workflows could enhance diagnostic accuracy and efficiency across all levels of physician experience, addressing litigation concerns and improving patient outcomes related to missed or delayed fracture diagnosis. Well-trained deep learning models offer advantages such as reduced reading times, fewer missed diagnoses, decreased over-ordering of advanced imaging, and lower rates of false positives, particularly beneficial in facilities with limited specialist staffing and imaging resources. While current AI models have limitations, ongoing research suggests a promising future for widespread adoption in medical diagnostics.

The Journal of the American Osteopathic Academy of Orthopedics

Steven J. Heithoff, DO, MBA, FAOAO
Editor-in-Chief


© AOAO. All copyrights of published material within the JAOAO are reserved.   No part of this publication can be reproduced or transmitted in any way without the permission in writing from the JAOAO and AOAO.  Permission can be requested by contacting Joye Stewart at [email protected].