Volume 48, Issue 16, 2025 Table of Contents

Fast ground plane detection method based on homography in monocular SLAM

2025, 48(16):1-11.

Abstract:In visual simultaneous localization and mapping (SLAM), ground information not only provides a reference for gravity direction but also effectively aids in obstacle recognition, making accurate detection of the ground plane crucial for robot navigation. To address the problem of ground plane estimation in monocular visual SLAM with limited computational resources and lacking depth information, this paper proposes a ground detection method based on homography. Initially, the homography matrix is computed using the RANSAC method on matched feature point pairs in the initial environment, obtaining the initial ground plane and corresponding ground point cloud. Subsequently, based on the obtained ground seed points, the homography estimation is combined with a dynamic growth strategy during the SLAM mapping process to gradually expand the ground point cloud, achieving precise segmentation of the ground plane at a low computational cost. Experimental results show that the proposed method achieves segmentation accuracy exceeding 92.52% on public datasets and local test data, with an angular error of less than 0.13° for the ground plane and a normalized plane distance error of less than 0.008, validating the effectiveness of the method. Additionally, the proposed algorithm only increases computational cost by 4.57%, meeting real-time operation requirements.
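
For readers less familiar with why a homography can single out the ground, the standard plane-induced homography relation is sketched below. The notation is not taken from the abstract and is an assumption made for illustration: $K$ is the camera intrinsic matrix, $(R, \mathbf{t})$ maps points from the first camera frame to the second ($\mathbf{X}_2 = R\,\mathbf{X}_1 + \mathbf{t}$), and the plane satisfies $\mathbf{n}^{\top}\mathbf{X}_1 = d$ in the first camera frame.

```latex
\mathbf{x}_2 \simeq H\,\mathbf{x}_1,
\qquad
H = K\left(R + \frac{\mathbf{t}\,\mathbf{n}^{\top}}{d}\right)K^{-1}
```

Under this convention, matched feature pairs lying on the ground plane agree with a single $H$ up to scale, while off-plane points generally do not, which is what a RANSAC inlier test on $H$ can exploit.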

Parameter adaptive NC-AFM cantilever control system design

Zeng Dechao , Yao Zhifei , Sun Qiuyuan , Zhao Bohui , Ma Zongmin

2025, 48(16):12-18.

Abstract:The imaging accuracy of a non-contact atomic force microscope (NC-AFM) depends heavily on the performance of its resonant frequency demodulation and feedback loops, which keep the cantilever oscillating at constant amplitude. To improve this performance, this paper presents a parameter-adaptive cantilever control system. The structures of the amplitude proportional-integral controller and the traditional phase-locked loop are improved, and a single-neuron PID algorithm and a least-mean-square (LMS) algorithm are introduced into them, respectively, to adaptively adjust the system's key parameters. Experimental verification and system testing show that the technique achieves stable control of the micro-cantilever, reduces the phase-locked loop's frequency locking time from 41 ms to 32 μs, and improves the frequency resolution to 0.04 Hz. The system also effectively suppresses cantilever oscillations, significantly reducing nonlinear distortion in imaging experiments. Finally, a coating surface test was conducted, accurately measuring a thickness of 50 nm.
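
The least-mean-square adaptation mentioned in this abstract can be illustrated generically. The sketch below is a textbook LMS filter that identifies an unknown two-tap FIR system; it is not the authors' phase-locked-loop structure, and the function name `lms_identify`, the step size `mu`, and the test signal are all assumptions made for illustration.

```python
import random


def lms_identify(x, d, n_taps, mu):
    """Textbook LMS: adapt FIR weights w so that w . [x[n], x[n-1], ...] tracks d[n]."""
    w = [0.0] * n_taps
    errors = []
    for n in range(n_taps - 1, len(x)):
        window = [x[n - k] for k in range(n_taps)]      # most recent sample first
        y = sum(wk * xk for wk, xk in zip(w, window))    # current filter output
        e = d[n] - y                                     # instantaneous error
        w = [wk + mu * e * xk for wk, xk in zip(w, window)]  # LMS weight update
        errors.append(e)
    return w, errors


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    # Hypothetical "unknown" system: d[n] = 0.5*x[n] - 0.3*x[n-1]
    d = [0.5 * x[n] - (0.3 * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
    w, _ = lms_identify(x, d, n_taps=2, mu=0.1)
    print(w)
```

With a noise-free target and a small step size, the weights settle close to the hypothetical system coefficients; a larger `mu` adapts faster at the cost of stability margin.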

Rolling bearing remaining life prediction using AE-BiLSTM with STA mechanism

Shi Tong , Zhang Zihao , Qiu Xiaohui , Zhang Wan

2025, 48(16):19-28.

Abstract:Bearings are indispensable mechanical components, and long-term operation readily leads to wear and fatigue failure, which in turn affects the normal operation of mechanical equipment. Predicting the remaining useful life (RUL) of a bearing can therefore help avoid accidents and ensure the safe and reliable operation of equipment. To improve RUL prediction accuracy for rolling bearings, this paper proposes a life prediction method based on spatio-temporal attention (STA) and bidirectional long short-term memory (BiLSTM), which integrates the multi-modal information in bearing data to capture changes in the bearing operating state. First, the original vibration signal is fed into an auto-encoder (AE) model to extract fault features automatically. Then, the extracted features are fed into the STA model, which weights the feature data along both the feature dimension and the time-step dimension to capture information from both more comprehensively. The weighted features are then passed to a BiLSTM model to predict the remaining useful life of the bearing. Finally, the method is validated experimentally on the PHM2012 Challenge dataset and ABLT-1A full-life-cycle bearing data. The results indicate that the proposed model reduces RMSE by approximately 22.76% and MAE by 26.57% on average, while R² improves by about 12.47% on average, indicating that the proposed method significantly enhances the accuracy of RUL prediction.
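
The three metrics quoted in this abstract (RMSE, MAE, and R²) follow standard definitions, sketched below for reference. The function names and the `relative_reduction` helper are assumptions made for illustration; the abstract does not state which baseline its percentage reductions refer to.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_reduction(baseline, improved):
    """Percentage by which an error metric drops relative to a baseline."""
    return (baseline - improved) / baseline * 100.0


if __name__ == "__main__":
    y_true = [1.0, 2.0, 3.0, 4.0]
    y_pred = [1.0, 2.0, 3.0, 5.0]
    print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

For the toy data above, one prediction is off by 1, giving an RMSE of 0.5, an MAE of 0.25, and an R² of 0.8.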

Research on optimization algorithm for ECG feature point detection based on ResNet-LSTM network

Su Peng , Wang Shuhan , Pan Guoxin , Zhang Zhongyu , Wang Peili

2025, 48(16):29-39.

Abstract:Accurate detection of characteristic points in electrocardiogram (ECG) signals is crucial for medical rehabilitation assistance devices, cardiac monitoring systems, and cardiovascular disease research. To address the issues of missed detections and false alarms in traditional methods, this paper proposes an optimized algorithm for ECG characteristic point detection based on ResNet-LSTM-Differential Threshold. In this study, adaptive thresholding is utilized to label the characteristic points of ECG signals, followed by training the ResNet-LSTM model on the annotated ECG signal data. Finally, the differential threshold method is integrated in the decision-making phase to detect R-waves in parallel. A detection is considered a true positive if either the neural network model or the threshold method successfully identifies an R-wave. Experimental results demonstrate that the proposed method achieves an R-wave detection accuracy of 99.4% on the MIT-BIH database, outperforming both single threshold methods and traditional deep learning approaches in terms of detection precision and computational efficiency. The proposed ResNet-LSTM-Differential Threshold method for ECG characteristic point detection effectively enhances the accuracy and robustness of detection. It enables efficient, precise, and real-time detection of characteristic points even when dealing with complex and variable ECG signals, offering broad application prospects for various medical devices and healthcare systems.
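
The parallel decision rule described in this abstract, where an R-wave counts as detected if either detector finds it, amounts to a union of two peak lists. The sketch below shows one possible way to merge them; the tolerance window `tol` (in samples) and the function name are assumptions, since the abstract does not say how detections from the two branches are matched.

```python
def fuse_detections(nn_peaks, thr_peaks, tol):
    """OR-fusion: keep every R-peak found by either detector.

    Peaks from the two lists that fall within `tol` samples of an already
    kept peak are treated as the same beat and reported once.
    """
    merged = []
    for p in sorted(set(nn_peaks) | set(thr_peaks)):
        if merged and p - merged[-1] <= tol:
            continue  # same beat already recorded by the other detector
        merged.append(p)
    return merged


if __name__ == "__main__":
    nn_peaks = [100, 400, 700]      # hypothetical neural-network detections
    thr_peaks = [102, 405, 1000]    # hypothetical differential-threshold detections
    print(fuse_detections(nn_peaks, thr_peaks, tol=10))  # [100, 400, 700, 1000]
```

An OR rule of this kind reduces missed beats, since a peak overlooked by one branch can still be recovered by the other, but it also passes through every false alarm from either branch.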

Based on IZOA combined with minimum cross-entropy image segmentation algorithm

Liu Tingting , He Zhiqin

2025, 48(16):40-53.

Abstract:To address the issues of low segmentation accuracy, low efficiency, and unstable segmentation results with increasing thresholds in color image multi-threshold segmentation, an improved multi-threshold image segmentation algorithm based on the improved zebra optimization algorithm (IZOA) is proposed. Firstly, a chaotic mapping method is used to initialize the population; secondly, a neighborhood fluctuation strategy is introduced for fine searching; then, hybridization and mutation operations are combined to generate new solutions, enhancing the global search capability of the algorithm; finally, an elite retention strategy is employed to preserve the optimal solution. The minimum symmetric cross-entropy obtained before and after image segmentation is utilized as the fitness function for multi-threshold segmentation, demonstrating higher segmentation accuracy, efficiency, and stability. Experimental results show that compared with ZOA, GWO, WOA, and other algorithms, the image quality indices FSIM, SSIM, and PSNR achieved by the IZOA-based segmentation exhibit significant advantages, with the optimal truncation mean proportions reaching 91.7%, 88.9% and 100%, respectively.

Multimodal automatic sleep staging based on KAN

Zhang Changtao , Geng Duyan , Yin Yue

2025, 48(16):54-59.

Abstract:Current automatic sleep staging models suffer from insufficient feature extraction ability and poor multimodal feature fusion. To handle nonlinear signals more effectively, Kolmogorov-Arnold networks (KAN) are used to dynamically learn nonlinear activation functions, and a feature extraction network based on KAN and transfer learning extracts features from the EEG and ECG signals recorded during sleep. An external attention mechanism is applied to each modality separately, and a multimodal gated fusion scheme combined with external attention integrates the features, alleviating the effect of class imbalance on N1-stage accuracy. On the ISRUC-S3 dataset, the method achieves an overall accuracy of 85.6%, a macro-average F1 score of 84.9%, and an F1 score of 67.7% for the N1 stage. Compared with other advanced methods, the performance of automatic sleep staging is effectively improved.

Semantic visual SLAM algorithm with sparse optical flow in dynamic scenarios

Hou Yuxin , Jie Jing , Hou Beiping , Zheng Hui , Yu Aihua

2025, 48(16):60-69.

Abstract:To address the degraded feature point matching accuracy and increased map construction error caused by dynamic interference in visual SLAM in complex dynamic scenes, a dynamic visual SLAM algorithm combining semantic segmentation and sparse optical flow is proposed. First, an adaptive thresholding strategy is introduced to improve the algorithm's ability to acquire feature points in complex environments. Second, the DY-Conv module is embedded into the U-Net semantic segmentation network and combined with the LK sparse optical flow field to accurately detect and segment dynamic objects, which improves the feature matching accuracy and robustness of visual SLAM in dynamic scenes. Finally, the algorithm is validated on the TUM dataset and in real scenes. Experimental results show that the improved U-Net raises the average segmentation accuracy from 92.1% to 94.5%. Meanwhile, the proposed semantic visual SLAM algorithm improves image processing speed by 60.13% compared to ORB-SLAM3 and improves pose estimation accuracy by 43.75%, 77.33% and 64.00% on three highly dynamic public sequences, respectively. Additionally, the dense 3D point cloud maps generated from the TUM dataset and real-world scenarios further demonstrate that the proposed algorithm effectively suppresses the interference of dynamic factors, thereby improving the accuracy of map construction.

Subway foreign body detection based on improved YOLO algorithm

He Saisai , Huang Min , Wang Wensheng

2025, 48(16):70-77.

Abstract:To address the low detection accuracy of existing deep learning algorithms under low light, their insufficient spatial localization accuracy, and their poor small-target detection accuracy, SSS-YOLO, a deep learning method that improves YOLOv10 for detecting foreign objects in subway gaps, is proposed. To improve image quality in the dark environment of subway gaps and to account for the weights of features at different scales, a parameter-free attention mechanism is introduced into the SSS-YOLO model; spatial position information is strengthened to reduce information loss; and finally the Shape-IoU loss function is used to improve bounding-box regression and the detection accuracy of small targets in the gap. Experimental results show that the accuracy of the method reaches 90.90%, and the average detection accuracy increases by 3.62%.

Influence of geometrical angle of welding zone on phased array inspection of pressure vessel

Ma Chaoyang , Liang Huanglang , Xie Zhidong , Li Bin , Li Wentao

2025, 48(16):78-87.

Abstract:To address the signal attenuation caused by geometric shadowing in phased array ultrasonic testing (PAUT) of internal porosity defects in the mitered pipe weld zone of pressure vessels, this paper constructs a structural geometry tensor angle-acoustic beam propagation attenuation model and combines it with a sound range compensation algorithm. The propagation characteristics of ultrasonic waves and the defect detection performance are systematically investigated for four sample bend angles: 45°, 60°, 90° and 135°. A finite-difference time-domain (FDTD) algorithm is used to establish a dynamic response simulation model of the mitered pipe weld, and the findings are validated experimentally on aluminum alloy specimens with pre-drilled defects. The results indicate that, for the 45° defect, the signal amplitude increased from 50.4 to 97.4 and the SNR improved from 6.27 dB to 11.99 dB; for the 60° defect, the signal amplitude increased from 77.5 to 97.5 and the SNR improved from 9.00 dB to 12.00 dB. The calculated deviation in SNR was less than 1.5 dB, and the defect detection rate increased by 23%. The study confirms that the occlusion caused by the bend angle significantly affects the sound beam propagation path and signal amplitude. The proposed model and correction algorithm effectively compensate for detection errors in non-standard structures, providing a quantitative theoretical basis for optimizing ultrasonic testing procedures for pressure vessel weld zones.

Short-term wind power prediction based on parameter optimization and QR

Pu Xiaoyun , Yang Jing , Yang Xing , Ning Yuan

2025, 48(16):88-98.

Abstract:To address the problem that wind power prediction models struggle to balance point-value accuracy and interval reliability under high-fluctuation scenarios, a hybrid prediction model integrating parameter optimization and nonlinear quantile regression is proposed. First, a combined TCN-GRU-DA prediction model based on a dual attention mechanism is constructed, using feature attention to mine the spatial correlation of multidimensional meteorological features and combining it with multi-head attention to capture the temporal dependence of power sequences. Second, an improved secretary bird optimization algorithm is proposed to optimize the four hyper-parameters of the combined model. This algorithm significantly improves convergence performance by integrating good point set theory and quantum computing initialization, time-segmented nonlinear weighting, the directional search mechanism of the northern goshawk optimization algorithm, and a Cauchy distribution strategy that strengthens global search capability. Finally, a multi-head attention-based nonlinear quantile regression model is developed, which dynamically adjusts feature weights under different quantiles through an adaptive loss function, thereby improving the accuracy of conditional quantile estimation. Experimental results demonstrate that, for point prediction, the proposed model reduces MAE and RMSE by 33.33% and 31.93%, respectively, compared to TCN-GRU. For interval prediction, at a 95% confidence level, the PICP improves by 3.97% and the PINAW decreases by 20.76%. The study confirms that the proposed model effectively addresses the joint optimization of point estimation and interval estimation for wind power prediction. It not only enhances prediction robustness under extreme weather but also provides multi-dimensional decision support for day-ahead scheduling and real-time control in power grids.
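
The interval metrics and the quantile loss referred to in this abstract have widely used standard forms, sketched below. This is a generic illustration and not the authors' adaptive loss function; the function names and the toy data are assumptions. PICP is the fraction of observations falling inside the predicted interval, and PINAW is the mean interval width normalized by the range of the observations.

```python
def pinball_loss(y_true, y_pred, q):
    """Standard quantile (pinball) loss for quantile level q in (0, 1)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


def picp(y_true, lower, upper):
    """Prediction interval coverage probability."""
    hits = sum(1 for t, lo, hi in zip(y_true, lower, upper) if lo <= t <= hi)
    return hits / len(y_true)


def pinaw(y_true, lower, upper):
    """Prediction interval normalized average width."""
    span = max(y_true) - min(y_true)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(y_true)
    return mean_width / span


if __name__ == "__main__":
    y = [10.0, 20.0, 30.0, 40.0]           # hypothetical observed power values
    lower = [8.0, 18.0, 31.0, 35.0]        # hypothetical lower bounds
    upper = [12.0, 22.0, 35.0, 45.0]       # hypothetical upper bounds
    print(picp(y, lower, upper), pinaw(y, lower, upper))
```

In the toy data, three of four observations are covered (PICP = 0.75). A good interval forecast keeps PICP at or above the nominal confidence level while keeping PINAW small, which is why the abstract reports an increase in the former together with a decrease in the latter.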

Research progress on non-destructive testing technology of rigid ceramic insulation tiles

Zhou Wei , Hu Jiamei , Wang Sainan , Liu Wugang , Hou Chuantao

2025, 48(16):99-112.

Abstract:Rigid ceramic thermal insulation tiles, as critical components of thermal protection systems in aerospace vehicles, have been extensively utilized in high-temperature parts such as windward surfaces of aircraft due to their superior properties including exceptional high-temperature resistance, low thermal conductivity, and excellent chemical stability. During the manufacturing, installation and service phases, surface, internal and bonding defects may occur due to the influence of fatigue and external impact loads, which can severely compromise their thermal protection performance and even endanger the safety of aerospace vehicles. Therefore, the reliable and effective non-destructive testing of the insulation tiles is critical to ensuring structural stability, reducing maintenance costs, and enhancing the safety and service life of aerospace vehicles. This paper reviews recent research advances in non-destructive testing technologies. The X-ray, ultrasonic, infrared thermal imaging, structured light and Terahertz testing technologies are summarized respectively, and the technical characteristics and application of each method are discussed, aiming to provide technical support for the development of non-destructive testing in spacecraft thermal protection structures.

Malicious code detection model based on SecureViT

Zhang Ao , Liu Wei , Liu Yang , Li Bo , Liu Fangfei

2025, 48(16):113-121.

Abstract:With the increasing diversity and concealment of malicious code, traditional detection methods often face high costs and instability when dealing with unknown malware. To meet the application requirements of resource-constrained environments, this paper proposes SecureViT, a lightweight and efficient malware detection model. The model achieves efficient feature extraction and accurate classification by introducing the ACF module and the MSDC module. The ACF module enhances the model's ability to model global context information, while the MSDC module further enriches the feature representation through multi-scale feature extraction and dynamic significance adjustment. Experimental results show that the SecureViT model achieves classification accuracies of 97.46%, 91.17%, and 95.49% on the Malimg, Virus-MNIST, and BIG2015 datasets, respectively, with a computational cost of only 1.71 GMAC, significantly improving detection performance while reducing computational cost. This combination of detection accuracy and low computational complexity makes the model well suited to resource-constrained environments.

Visual SLAM approach based on depth constraints and optical flow tracking

Yin Xianbo , Wang Zhongyuan

2025, 48(16):122-131.

Abstract:Simultaneous localization and mapping (SLAM) is the key to autonomous robot navigation. However, traditional SLAM systems are typically designed for static environments; when dynamic objects are present, dynamic feature points can lead to incorrect data associations, reducing accuracy and reliability. Existing solutions still face challenges such as undetected potentially dynamic objects and an insufficient number of useful feature points when dynamic objects dominate the scene. To overcome these limitations, this study proposes a visual SLAM system based on ORB-SLAM2. First, YOLOv8 object detection provides semantic information, which is combined with depth information for depth constraints to generate dynamic masks. Next, a quadtree-based uniform allocation of feature points is implemented based on dynamic probability, ensuring the removal of dynamic feature points while preserving more useful features. Finally, optical flow tracking is used to detect and reject feature points on potentially dynamic objects. The dynamic mask is also combined with keyframes to perform motion segmentation, allowing clean and dense point cloud maps to be constructed. Experimental results on the TUM and Bonn datasets demonstrate that, compared to ORB-SLAM2, the average localization accuracy improves by over 90% in highly dynamic scenes while reliable performance is maintained in relatively static environments. Additionally, the improved system achieves real-time performance and outperforms other state-of-the-art methods in its category.

Hybrid SAM improves the hole transport capability of perovskite solar cells

Zhang Lin , Guan Xuefeng , Fang Xing , Lin Menghao , Lin Jie

2025, 48(16):132-141.

Abstract:This study proposes a hybrid SAM interface engineering strategy to address the interfacial hole transport barrier caused by the HOMO level mismatch of MeO-2PACz self-assembled monolayers in inverted perovskite solar cells. By combining MeO-2PACz with Me-4PACz, which has a larger dipole moment, in a specific ratio, the energy level alignment and defect passivation ability of the nickel oxide hole transport layer are optimized. Experiments show that when the volume fraction of Me-4PACz is 10%, the M-SAM/NiOx composite layer significantly improves the interface charge extraction efficiency and induces the formation of a dense crystal structure in perovskite films. On this basis, the resulting p-i-n structured perovskite solar cells (PSCs) achieved an open-circuit voltage of 1.079 V, a short-circuit current density of 24.23 mA/cm², and a fill factor of 0.79. The photoelectric conversion efficiency increased from 18.7% to 20.76%.
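
As a rough consistency check on the reported device parameters, the standard efficiency relation can be evaluated directly. The incident power density $P_{in} = 100\ \mathrm{mW/cm^2}$ is an assumption (the usual standard test illumination); the abstract does not state it.

```latex
\mathrm{PCE} = \frac{V_{oc}\,J_{sc}\,\mathrm{FF}}{P_{in}}
= \frac{1.079\ \mathrm{V} \times 24.23\ \mathrm{mA/cm^2} \times 0.79}{100\ \mathrm{mW/cm^2}}
\approx 20.65\%
```

The small gap to the reported 20.76% is within what rounding the fill factor to two decimal places would produce: a fill factor anywhere between 0.785 and 0.795 gives roughly 20.5% to 20.8%.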

Research on frequency locking and error feedback technology of resonant optical microcavity gyro

Li Yifan , Bai Yu , Zhang Shize , Bu Han , Liu Wenyao

2025, 48(16):142-149.

Abstract:To address the resonant frequency drift of Resonant Micro-optical Gyros (RMOG) caused by factors such as ambient temperature and vibration, this paper proposes an FPGA-based scheme for high-precision digital frequency locking and error feedback. The system implements dual-phase modulation and frequency difference signal processing on the FPGA, uses the 20-bit digital-to-analog converter AD5791 to generate high-precision feedback signals, uses a PI controller to track and lock the laser frequency to the optical waveguide resonant cavity in real time, and introduces a real-time output compensation algorithm based on error feedback to dynamically correct the output loop deviation. Experimental results show that the response time of the frequency locking system is 17.5 ms and the frequency locking accuracy reaches 48.51 Hz, which significantly improves the dynamic performance and stability of the gyro system. Compared with the traditional 16-bit DAC scheme, the new system improves the response speed and frequency locking accuracy by 62.3% and 56.0%, respectively, which verifies the effectiveness of the digital architecture and the dual-phase modulation technique in suppressing noise and optimizing spectral separation, and provides reliable technical support for the practical application of the resonant optical microcavity gyro.

SAR multi-scale road detection based on dense dilated pyramid

Zhang Hui , Mou Liqiang , Qin Yi , Cui Zongyong

2025, 48(16):150-157.

Abstract:Road detection in Synthetic Aperture Radar (SAR) images enables precise identification of multi-scale road targets under complex backgrounds, playing a critical role in military and civilian applications such as battlefield surveillance, target localization, and disaster response. Compared to traditional methods relying on edge detection or region segmentation, Convolutional Neural Network (CNN)-based approaches exhibit superior feature extraction and segmentation accuracy. However, existing methods still struggle with multi-scale road detection due to the diverse resolutions and varying receptive fields required for roads of different scales in SAR datasets. To address these challenges, this paper proposes a multi-scale road detection method based on a Dense Dilated Pyramid Network. The method integrates dense connections into a U-Net architecture, replacing traditional fixed-dilation-rate structures with progressive dilation rates to construct a dense dilated pyramid module in the encoder. This design progressively expands the receptive field to adapt to multi-resolution road features. Additionally, a multi-scale attention mechanism dynamically fuses shallow details and deep semantic information while suppressing background interference. Experiments on Gaofen-3 SAR datasets demonstrate that the proposed method achieves mean Intersection over Union values of 74.39%, 68.01% and 66.32% at 1 m, 3 m and 10 m resolutions, respectively, outperforming state-of-the-art methods by 2.04%~13.7%. The method significantly reduces missed detections of small-scale roads and lowers false alarms caused by environmental interference, achieving optimal detection performance across multi-scale scenarios in both single-image and cross-resolution settings.
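
The effect of progressive dilation rates on the receptive field, as described in this abstract, can be shown with simple arithmetic. The sketch below assumes a serial stack of stride-1 convolutions with 3×3 kernels; the specific dilation rates are hypothetical, since the abstract does not list them, and the dense connections of the actual module are not modeled.

```python
def receptive_field(dilations, kernel=3):
    """Receptive field (in pixels, per axis) of stacked stride-1 dilated convolutions.

    Each layer with dilation d and kernel size k widens the field by (k - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf


if __name__ == "__main__":
    print(receptive_field([2, 2, 2, 2]))   # fixed rate: 17
    print(receptive_field([1, 2, 4, 8]))   # progressive rates: 31
```

With the same number of layers, the progressive schedule covers a much wider context, while its early low-dilation layers still sample neighboring pixels densely, which is helpful for thin, small-scale roads.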

Vehicle detection based on boundary and multi-scale feature

Li Tianlin , An Yi , Chen Yan

2025, 48(16):158-171.

Abstract:Vehicle target detection is crucial for intelligent driving, intelligent transportation, and public safety. However, challenges such as background interference, small targets, and vehicle occlusion in dense traffic reduce detection accuracy. To address these problems, we propose EM-YOLO, which improves YOLOv8 by fusing boundary features and multi-scale features. First, we design a boundary-guided multi-scale feature block that combines boundary and multi-scale features to improve the backbone network and enhance its ability to suppress background interference. Second, because features lose detail as they flow through the network, and small vehicles yield fewer effective features to begin with, we propose a feature enhancement block that combines features from different layers to reduce detail loss and improve small-target detection. Third, we analyze the performance drop caused by occlusion among densely packed vehicles and propose a detection head to address it. Finally, WPFIoU is constructed by combining PIoU, Focaler-IoU, and WIoU to optimize the training process and improve detection performance. Experimental results show that the improved model achieves a 1.9% increase in precision and a 4.1% increase in recall compared to the original model, and that mAP50 and mAP50:95 improve by 4.4% and 3.3%, respectively. Compared with other advanced methods, the proposed method performs better on all metrics and has significant practical application value.

Sub-pixel magnetic tile edge detection method based on improved LoG-Zernike moment

Zhang Chen , Shan Wentao , Xu Cheng

2025, 48(16):172-179.

Abstract:To address the complexity of existing detection methods and the difficulty of ensuring accuracy when measuring key dimensions of magnetic tiles, such as axial length and chord length, this study proposes a sub-pixel magnetic tile edge detection method that improves on the traditional LoG-Zernike moment approach. First, the collected images are preprocessed, and adaptive median filtering is applied to optimize the conventional LoG operator, achieving pixel-level coarse positioning through filtering and denoising. Next, the Zernike template is used to calculate the edge threshold, with the optimal step threshold determined by the two-dimensional Otsu algorithm, to identify the sub-pixel points along the edge. Finally, the least squares method is used to fit the edge of the magnetic tile. Experimental results indicate that the relative errors for the axial length and chord length of the magnetic tile are 0.060% and 0.018%, with errors kept within ±0.01 mm and ±0.004 mm, respectively, while the average detection time per magnetic tile is 1.56 seconds. These results confirm the effectiveness and practicality of the method.
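
The idea behind Otsu thresholding, which this abstract uses in its two-dimensional form, is to pick the threshold that maximizes the between-class variance. The sketch below shows only the simpler one-dimensional version on a histogram, as an illustration of the principle; it is a deliberate simplification and not the two-dimensional variant the authors apply, and the function name and toy histogram are assumptions.

```python
def otsu_threshold(hist):
    """One-dimensional Otsu: return the level t maximizing between-class variance.

    Class 0 contains levels <= t, class 1 contains levels > t.
    Ties are resolved in favor of the lowest level.
    """
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0          # pixel count in class 0
    sum0 = 0.0      # intensity sum in class 0
    best_t, best_var = 0, -1.0
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0:
            continue            # class 0 still empty
        if w1 == 0:
            break               # class 1 empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2   # unnormalized between-class variance
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


if __name__ == "__main__":
    hist = [5, 5, 0, 0, 0, 0, 5, 5]   # toy bimodal histogram with 8 levels
    print(otsu_threshold(hist))        # 1
```

For the toy histogram, any threshold from level 1 to level 5 separates the two modes equally well, and the function returns the lowest of them.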

Combining multi-scale attention and hybrid pooling for wrist trauma X-ray image detection

Lin Shujuan , Zhong Ming'en , Tan Jiawei , Fan Kang , Lin Zhiqiang

2025, 48(16):180-188.

Abstract:To assist in detecting multiple types of trauma in X-ray images, including fractures, soft tissue swelling, and bone lesions, WristXNet, an object detection model based on deep convolutional neural networks, is proposed. First, a multi-scale attention feature aggregation module, C2f_MSAF, is designed to enhance the model's ability to understand features of multi-scale targets. Second, a hybrid pooling spatial pyramid module, HPSP, is constructed to improve the extraction of correlated features among different target categories. Third, a dynamic upsampling module, DySample, is introduced to further enhance the capture of fine-grained features. Finally, a lightweight detection head with a decoupled structure, LDDHead, is developed to improve computational efficiency. Experimental results on the publicly available pediatric wrist trauma X-ray dataset GRAZPEDWRI-DX demonstrate that the proposed algorithm achieves the highest mean average precision (mAP) of 68.5% across seven common target categories in X-ray images, surpassing the current state-of-the-art algorithm by 1.6%. Additionally, the model size is only 3.3 M, and it processes 156.9 images per second, demonstrating excellent overall performance.

PSSN-YOLO: A surface defect detection model for wind turbines

Shan Sirui , Yao Xiaomin , Chen Manlong , Chai Yujiao , Li Wei

2025, 48(16):189-196.

Abstract:As a core component of clean energy systems, wind turbines are prone to various surface defects that severely impact operational efficiency and safety, making timely detection and treatment crucial. To address issues such as missed detections, false alarms, and insufficient accuracy in small target detection, this paper proposes an improved YOLOv8-based algorithm for wind turbine surface defect detection: PSSN-YOLO. The algorithm introduces a small target detection layer to provide multi-scale information, employs the Slim-neck paradigm as the feature fusion network to enhance detection accuracy while reducing model parameters, embeds the SE attention mechanism before each detection head to focus on critical feature channels and improve defect detection in complex environments, and optimizes the loss function using the normalized Wasserstein distance (NWD) to better measure bounding box similarity. Experimental results demonstrate that the improved algorithm achieves increases of 1.1%, 4.4%, and 2.6% in precision (P), recall (R), and mAP50, respectively, while reducing parameter count by 8.97%, better satisfying the practical requirements for wind turbine surface defect detection.
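
To illustrate why a Wasserstein-based similarity can help with small targets, the sketch below implements the commonly published form of the normalized Wasserstein distance, in which each box `(cx, cy, w, h)` is modeled as a 2D Gaussian. It is a generic illustration, not the loss as integrated in PSSN-YOLO; the normalizing constant `c` is dataset-dependent and the value used here is hypothetical.

```python
import math


def nwd(box_a, box_b, c):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2); the similarity is exp(-W2 / c), in (0, 1].
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    w2 = math.sqrt((cxa - cxb) ** 2 + (cya - cyb) ** 2
                   + ((wa - wb) / 2.0) ** 2 + ((ha - hb) / 2.0) ** 2)
    return math.exp(-w2 / c)


def iou(box_a, box_b):
    """Plain IoU for (cx, cy, w, h) boxes, for comparison."""
    ax1, ay1 = box_a[0] - box_a[2] / 2.0, box_a[1] - box_a[3] / 2.0
    ax2, ay2 = box_a[0] + box_a[2] / 2.0, box_a[1] + box_a[3] / 2.0
    bx1, by1 = box_b[0] - box_b[2] / 2.0, box_b[1] - box_b[3] / 2.0
    bx2, by2 = box_b[0] + box_b[2] / 2.0, box_b[1] + box_b[3] / 2.0
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union


if __name__ == "__main__":
    a = (10.0, 10.0, 4.0, 4.0)   # small hypothetical ground-truth box
    b = (13.0, 14.0, 4.0, 4.0)   # prediction shifted by a few pixels
    print(iou(a, b), nwd(a, b, c=5.0))
```

For these two small boxes the IoU is already zero, so an IoU-based loss cannot distinguish this near miss from a distant one, whereas the NWD remains positive and decays smoothly with the offset.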
