Abstract: To assist in detecting multiple types of trauma in X-ray images, including fractures, soft tissue swelling, and bone lesions, an object detection model based on deep convolutional neural networks, WristXNet, is proposed. First, a multi-scale attention feature aggregation module, C2f_MSAF, is designed to strengthen the model's representation of multi-scale targets. Second, a hybrid pooling spatial pyramid module, HPSP, is constructed to improve the extraction of correlated features among different target categories. Third, a dynamic upsampling module, DySample, is introduced to further enhance the capture of fine-grained features. Finally, a lightweight detection head with a decoupled structure, LDDHead, is developed to improve computational efficiency. Experimental results on the publicly available pediatric wrist trauma X-ray dataset GRAZPEDWRI-DX demonstrate that the proposed algorithm achieves the highest mean average precision (mAP), 68.5% across seven common target categories in X-ray images, surpassing the current state-of-the-art algorithm by 1.6%. In addition, the model size is only 3.3 M, and it processes 156.9 images per second, demonstrating excellent overall performance.
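To make the two headline metrics concrete, the following minimal sketch shows how they are conventionally read. It assumes that mAP is the unweighted mean of the per-class average precision values and that throughput converts to per-image latency as its reciprocal. The function names `mean_average_precision` and `latency_ms` are hypothetical helpers introduced here for illustration; they are not part of WristXNet or of any evaluation toolkit, and the paper's exact evaluation protocol may differ.

```python
def mean_average_precision(ap_per_class):
    """Unweighted mean of per-class AP values (assumed definition of mAP)."""
    if not ap_per_class:
        raise ValueError("at least one per-class AP value is required")
    return sum(ap_per_class) / len(ap_per_class)


def latency_ms(images_per_second):
    """Convert throughput (images/s) to average per-image latency in ms."""
    if images_per_second <= 0:
        raise ValueError("throughput must be positive")
    return 1000.0 / images_per_second


if __name__ == "__main__":
    # 156.9 images per second is the throughput reported in the abstract.
    print(f"{latency_ms(156.9):.2f} ms per image")
```

Under these assumptions, the reported 156.9 images per second corresponds to roughly 6.4 ms per image, and the reported mAP would be the average of seven per-class AP values, one for each target category.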