
Editor in chief:Prof. Sun Shenghe
Inauguration:1980
ISSN:1002-7300
CN:11-2175/TN
Domestic postal code:2-369
Cheng Xiaohu , Zhang Xiangfeng , Jiang Hong
2026, 49(3):1-10.
Abstract:To address the shortcomings of the traditional Aquila Optimizer (AO), such as its propensity to fall into local optima and its slow convergence in high-dimensional optimization and robot path planning, this paper proposes an improved algorithm named HATAO (Aquila Optimizer with Halton, Aerial search, and Triangular mutation). Firstly, the Halton low-discrepancy sequence is used to enhance the uniformity of the initial population distribution. Secondly, an aerial search mechanism from the Arctic Puffin Algorithm is incorporated into the contraction exploitation phase to improve the population′s cooperative evolution and search accuracy. Lastly, a triangular mutation operator is introduced to enhance the algorithm′s convergence performance in its later stages. The proposed HATAO is benchmarked against five other algorithms using the CEC2017 test suite, with statistical significance evaluated by the Wilcoxon rank-sum test. In robot path planning applications, experimental results demonstrate that HATAO achieves superior search accuracy, faster convergence, and greater stability. Specifically, compared to the original AO, HATAO reduces path lengths by approximately 4.96% in simple scenarios and 6.34% in complex scenarios, verifying its effectiveness and robustness for practical path-planning tasks.
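As a rough illustration of the Halton initialization step mentioned in this abstract, the sketch below generates a low-discrepancy initial population inside a box-bounded search space. The function names (`radical_inverse`, `halton_population`), the use of the first primes as per-dimension bases, and the bounds are assumptions made for illustration; the abstract does not specify how HATAO implements this step.

```python
# Minimal sketch of Halton-sequence population initialization.
# All names and parameter choices here are hypothetical, not from the paper.

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]  # one base per dimension


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse of a positive integer in the given base."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_population(size, lower, upper):
    """Return `size` points spread over the box [lower, upper] via a Halton sequence."""
    dim = len(lower)
    if dim > len(PRIMES):
        raise ValueError("add more prime bases for this many dimensions")
    population = []
    for i in range(1, size + 1):  # start at 1 to skip the all-zero point
        point = [
            lower[d] + radical_inverse(i, PRIMES[d]) * (upper[d] - lower[d])
            for d in range(dim)
        ]
        population.append(point)
    return population


if __name__ == "__main__":
    for individual in halton_population(5, [-10.0, 0.0], [10.0, 1.0]):
        print(individual)
```

Compared with uniform random sampling, successive Halton points fill the box more evenly, which is the property the abstract relies on to improve the uniformity of the initial population.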
Qi Ji , Zhang Xiaoyu , Liu Xiangbin , Guo Rong
2026, 49(3):11-21.
Abstract:To address the strong dependence on deterministic models in the control of DTP-PMSM, an OBMFC method is proposed. First, based on the ultra-local model approach, an ultra-local model considering parameter uncertainties is constructed for the DTP-PMSM in the VSD coordinate system. Then, an ESO and an LOB are designed for the current loop and the speed loop, respectively, to estimate the unknown disturbances in the ultra-local model. Finally, the estimated disturbances are compensated into the DCPC of the current loop and the NFTSMC of the speed loop. In this way, the DCPC in the current loop reduces its reliance on the mathematical model, while the speed loop enhances the robustness of speed control. Simulation results demonstrate that, compared with two other model-free control methods—PI control and ADRC in the speed loop combined with MFCPC in the current loop—the proposed method achieves speed control without overshoot and with faster response, stronger disturbance rejection under sudden load and speed changes, and lower current harmonic content. Thus, it significantly improves the robustness, speed response, and dynamic performance of the DTP-PMSM system.
Zhu Xuejin , Li Shenyang , Li Yan
2026, 49(3):22-33.
Abstract:To tackle low accuracy caused by limited generalization in few-shot Android malware family classification, this paper proposes SupProto, a dynamic prototype network driven by supervised contrastive learning. SupProto uses SupCon to refine the embedding space, improving inter-class separation and intra-class compactness, and adopts a dynamic prototype mechanism based on hierarchical clustering and silhouette coefficients to handle multimodal family structures. In terms of input and encoding design, RGB images are constructed from multi-source static features to provide unified and discriminative representations, while a DenseNet121 combined with a CBAM attention module strengthens feature extraction. Experiments on Drebin and CIC-InvesAndMal2019 show that SupProto achieves 90.59% and 85.64% accuracy in 5-way 5-shot settings, and 75.56% and 67.96% in 5-way 1-shot settings.
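For readers unfamiliar with prototype networks, the following sketch shows the basic few-shot decision rule: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. This is the plain single-prototype rule only; it does not reproduce SupProto's dynamic multi-prototype mechanism, its SupCon training, or its encoder. The function names and the toy two-dimensional embeddings are hypothetical.

```python
import math

# Minimal sketch of nearest-prototype classification (standard prototypical
# network rule). Names and toy data are assumptions, not the paper's code.


def class_prototypes(support):
    """Map each label to the mean of its support embeddings.

    `support` is a dict: label -> list of equal-length embedding vectors.
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(v[d] for v in vectors) / len(vectors) for d in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


if __name__ == "__main__":
    support = {
        "family_a": [[0.0, 0.0], [2.0, 0.0]],
        "family_b": [[10.0, 10.0], [12.0, 10.0]],
    }
    protos = class_prototypes(support)
    print(classify([1.0, 1.0], protos))
```

A family whose samples form several separate clusters is poorly summarized by one mean, which is the situation the abstract's clustering-based dynamic prototypes are meant to handle.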
Shao Luchuan , Zhao Bing , Kang Xutao
2026, 49(3):34-43.
Abstract:To address the challenges of noise interference, difficult fault feature extraction, and low diagnostic accuracy in rolling bearing vibration signals, this study proposes a fault diagnosis method that combines variational mode decomposition (VMD) optimized by differential evolution (DE) with a hybrid model integrating a convolutional neural network (CNN), a bidirectional gated recurrent unit (BiGRU), and an attention mechanism (CNN-BiGRU-Attention). Following the minimum envelope entropy principle, DE is employed to optimize VMD, yielding the optimal number of decomposition layers and penalty factor. The correlation coefficients of the intrinsic mode functions (IMFs) under the optimal parameter combination are calculated, and the useful signal is reconstructed based on a predetermined threshold. Comparative experiments with conventional classification models demonstrate that the proposed method achieves superior accuracy, precision, recall, and F1 score, with a fault diagnosis accuracy of 99.17%. When raw and reconstructed signals are compared in the CNN-BiGRU-Attention model, the accuracy for raw signals is 89.58%, lower than that for the denoised signals. Finally, CNN-BiGRU-Attention is compared with the CNN-bidirectional long short-term memory (BiLSTM)-Attention model: it shows 1.25% higher accuracy, a 21% reduction in graphics processing unit (GPU) usage, a 23% reduction in central processing unit (CPU) usage, and a training time 35 seconds shorter. These results provide an effective improvement for existing rolling bearing fault diagnosis techniques.
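The reconstruction step described in this abstract (keep the modes that correlate with the original signal above a threshold, then sum them) can be sketched as follows. The sketch assumes the modes have already been produced by some decomposition; it does not implement VMD or the DE search. The function names, the use of the absolute Pearson coefficient, and the threshold value of 0.3 are hypothetical choices for illustration.

```python
import math

# Minimal sketch of correlation-based mode selection and reconstruction.
# The threshold and all names are assumptions, not taken from the paper.


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def reconstruct(signal, modes, threshold=0.3):
    """Sum the modes whose |correlation| with the signal reaches the threshold."""
    kept = [m for m in modes if abs(pearson(m, signal)) >= threshold]
    if not kept:
        return [0.0] * len(signal)
    return [sum(values) for values in zip(*kept)]


if __name__ == "__main__":
    signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
    useful_mode = list(signal)
    offset_mode = [0.5] * len(signal)  # uncorrelated constant component
    print(reconstruct(signal, [useful_mode, offset_mode]))
```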
Shen Mingna , Wang Peixue , Meng Yongwei , Hu Jiale
2026, 49(3):44-52.
Abstract:This study presents a multi-level feature fusion approach for imbalanced network traffic anomaly detection to overcome the accuracy limitations of existing methods caused by data imbalance and insufficient feature extraction. The proposed framework first employs the CGAN-SMOTE algorithm to balance the data distribution, then uses gated recurrent units with attention mechanisms to capture long-term dependencies and extract discriminative local temporal features through adaptive weight allocation. Concurrently, bidirectional long short-term memory networks with average pooling are applied to obtain comprehensive global temporal features. The extracted temporal features are then fused and processed by an enhanced convolutional neural network to learn spatial representations, significantly improving anomaly recognition capability. Experimental validation on public datasets confirms the superior detection performance of the model compared with various state-of-the-art methods.
Liu Jiaming , Xia Xingyu , Wang Junze , Zhang Zhen
2026, 49(3):53-65.
Abstract:Space-time image velocimetry is a one-dimensional time-averaged flow velocity measurement method characterized by high spatial resolution and real-time performance. However, it is susceptible to gross errors in complex scenarios and requires manual parameter tuning, limiting its environmental adaptability. To overcome this, this paper proposes a fused method combining texture orientation detection and discrimination. Based on texture enhancement and frequency domain transformation, image segmentation is used to separate valid and invalid signals in the spectrum. While detecting texture angles, the local features of the segmented signal are used for statistical discrimination, thereby reducing noise interference from erroneous angles. Parameter determination, sensitivity analysis, multi-scene comparison, flow rate ratio measurement and calibration test experiments are also carried out. Results show that with parameters optimized via large-sample statistics, the proposed method reduces mean absolute error by 58.32%, 42.94% and 29.66% compared to frequency-domain velocimetry using three different integration radii. Root mean square error is reduced by 36.90%, 22.60% and 13.56%, respectively. In velocimetry measurements at Panzhihua and Maozhouhe stations, the relative error remained within 7.88%. Calibration at Panzhihua showed a systematic error of 0.188% and random uncertainty of 4.879% in cross-sectional flow. The sign test, line fit test, and deviation test all passed, confirming the method′s accuracy and robustness in complex flow scenarios.
Yang Huimin , Gao Xiaowen , Li Ruitao , Wang Hanxia
2026, 49(3):66-76.
Abstract:To address the limited recognition of complex package types and fine-grained features in package defect detection, as well as the shortcomings of existing models in precision and real-time performance, this paper proposes an improved YOLOv8n-based algorithm for defect detection in express packages. First, the C2f module in the network is integrated with frequency-adaptive dilated convolution (FADC) to form the C2f-FADC module, which adjusts dynamically to multi-scale and multi-frequency defects, optimizing feature extraction and improving representational ability. Second, the SimSPPF module replaces the original SPPF module, simplifying the structure while enhancing multi-scale feature fusion and improving the perception of small targets. Finally, the bounding box regression loss function is replaced with Shape-IoU to model the shape and scale differences between predicted and ground-truth boxes more accurately, improving localization performance. On a self-constructed package defect dataset, the improved algorithm achieves a detection accuracy of 96.3%, a 4.4% increase in mAP50 over the original algorithm, and a detection speed of 98 FPS. Considering both precision and speed, the proposed method shows significant advantages over other algorithms, validating its effectiveness.
Fang Hongyu , Zhang Yuxiang , Gong Qin
2026, 49(3):77-86.
Abstract:Traditional hearing aids amplify speech using gain compensation formulas designed primarily for quiet environments, which fail to meet the actual gain needs of hearing-impaired patients in diverse real-world settings. This leaves patients dissatisfied with the gain compensation and degrades the user experience. To address this issue, this paper proposes a GS-LGA-XGBoost algorithm that automatically adjusts the gain for real-life environments. The algorithm predicts the gain at each frequency point of the hearing aid using Extreme Gradient Boosting (XGBoost), Grid Search (GS), and an improved Genetic Algorithm (LGA). A dataset of gain data from 1,200 ears of patients satisfied with their hearing aids, collected in hospitals, was used to construct three gain prediction models for soft, medium, and loud sounds. On the soft, medium, and loud gain test sets, the predictions of the proposed algorithm align more closely with the gain values preferred by patients. The proposed algorithm outperforms three other machine learning methods, Support Vector Regression (SVR), Random Forest (RF), and Deep Neural Network (DNN), in predicting hearing aid gain. The GS-LGA-XGBoost algorithm not only enables dynamic adjustment of hearing aid gain across different environments but also achieves high prediction accuracy, better meeting the gain requirements of hearing-impaired patients.
Fang Tiexin , Li Xinkai , Meng Yue , Zhang Hongli
2026, 49(3):87-97.
Abstract:To address the high sampling and search randomness, poor environmental adaptability, and non-smooth planned paths of the bidirectional RRT* algorithm in UAV global path planning in complex environments, this paper proposes a bidirectional rapidly-exploring random tree star path planning algorithm (FB-RRT*) that integrates a step size strategy. First, to reduce sampling randomness, a target-biased sampling strategy is adopted to cut the number of blind random samples. Then, a dynamic random step size that fuses angle and obstacle environment parameters is used to improve the environmental adaptability of the algorithm. Finally, to shorten overly long planned paths, path clipping is combined with B-spline optimization to remove redundant turning points and obtain a better path. MATLAB experimental results show that, compared with the B-RRT* algorithm, the average planning time is reduced by 58% and the average path length is shortened by 11.9%, indicating that the improved FB-RRT* algorithm plans efficiently.
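The target-biased sampling idea mentioned in this abstract can be illustrated with a short sketch: with some probability the sampler returns the goal itself, and otherwise it draws a uniform random point in the workspace. This is the generic goal-bias technique used in many RRT variants, not necessarily the exact strategy in FB-RRT*. The function name, the `goal_bias` parameter, and its default of 0.1 are hypothetical.

```python
import random

# Minimal sketch of goal-biased sampling for an RRT-style planner.
# The bias probability and all names are assumptions for illustration.


def sample_with_goal_bias(goal, bounds, goal_bias=0.1, rng=random):
    """Return the goal with probability `goal_bias`, else a uniform random point.

    `bounds` is a list of (low, high) pairs, one per coordinate.
    `rng` is any object providing `random()` and `uniform(a, b)`.
    """
    if rng.random() < goal_bias:
        return tuple(goal)
    return tuple(rng.uniform(low, high) for low, high in bounds)


if __name__ == "__main__":
    rng = random.Random(42)
    workspace = [(0.0, 100.0), (0.0, 100.0), (0.0, 30.0)]
    for _ in range(5):
        print(sample_with_goal_bias((90.0, 90.0, 10.0), workspace, rng=rng))
```

A larger bias pulls the tree toward the goal faster but explores less, so the value is usually kept small and tuned per environment.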
Wang Jing , Gao Yapeng , Li Haifang
2026, 49(3):98-110.
Abstract:Path planning algorithms are key to mobile robot navigation. To address the deficiencies of traditional path planning algorithms in orchard environments in node traversal, search efficiency, path smoothness, and obstacle avoidance, this paper proposes an improved path planning method that combines the A* algorithm and the DWA algorithm, improving both the global optimality of the planned path and real-time obstacle avoidance. First, a two-dimensional grid map is constructed from three-dimensional point cloud data to provide an accurate environmental model for the navigation robot. Second, the neighborhood search of the traditional A* algorithm is optimized with a rectangular expansion search strategy; combined with critical path node selection and path smoothing based on dynamic tangent circles, this generates a global path that meets orchard operation requirements. Third, the evaluation function of the traditional DWA algorithm is optimized by introducing heading angle, path deviation, and obstacle information, improving the global orientation and local responsiveness of obstacle avoidance decision-making. Finally, a fusion architecture of the improved A* and improved DWA algorithms is constructed to coordinate global navigation and local avoidance. Simulation results show that the improved algorithm has significant advantages in path planning efficiency, path quality, and obstacle avoidance, meeting the practical needs of mobile robot path planning in orchards and supporting intelligent orchard management.
2026, 49(3):111-118.
Abstract:To address the accuracy degradation in far-field reconstruction caused by the absence of phase information in high-frequency antenna near-field measurements, this paper proposes a phase-less near-field to far-field transformation method based on dual-spherical sampling. Building on equivalent magnetic current theory, the proposed method achieves equivalent current reconstruction and far-field retrieval through a dual-spherical truncated amplitude flow algorithm. The implementation features three key innovations: first, row scaling preprocessing of the measurement matrix ensures algorithm convergence; second, Armijo line search makes the selection of the iteration step size more efficient; third, an alternating sampling mechanism between the two spherical surfaces reduces measurement matrix correlation while expanding the spatial sampling dimensions. Simulation results demonstrate that, for antennas of different types and sizes, using only dual-spherical near-field amplitude data, the proposed method achieves precise far-field reconstruction within ±60° of the main lobe, with amplitude errors below 0.5 dB. The computed results agree well with actual near-field measurements. This approach provides an effective solution for phase-less near-field measurements and has practical engineering value.
Jiang Ping , Lu Qinggang , Zhang Bingli , Xu Chao , Song Zuchang
2026, 49(3):119-127.
Abstract:To address the slow clamping force response and degraded control accuracy in electro-mechanical braking (EMB) systems, a composite sliding mode control strategy incorporating a disturbance observer is proposed. First, a phase-based closed-loop control strategy is designed to meet the requirements of different braking stages, enabling precise regulation through a segmented control algorithm. Second, based on the traditional exponential reaching law, a composite reaching law is developed by introducing a symmetric Sigmoid function and a power term of the sliding surface. This design increases the reaching speed while mitigating chattering. Moreover, a super-twisting extended state observer is constructed to estimate system disturbances and feed them back to the controller for real-time compensation. The stability of the proposed control algorithm and observer is proven using the Lyapunov method. Simulation results under emergency braking conditions demonstrate that, compared with the double-power reaching law and the super-twisting algorithm, the proposed method improves the response speed by 0.05 s and 0.06 s, respectively, and reduces the average steady-state error by 0.04% and 0.05%, confirming the superior performance of the proposed EMB clamping force control strategy in both response speed and control precision.
Li Yibo , Yuan Jinli , Yun Zhi , Zheng Senxiao , Guo Zhitao
2026, 49(3):128-136.
Abstract:To address the challenges of multiple peaks and power fluctuations in maximum power point tracking (MPPT) for photovoltaic systems under complex dynamic environments characterized by partial shading, rapid irradiance fluctuations, and temperature variations, a novel deep reinforcement learning-based algorithm, termed DDPG-LSTM, is proposed. The algorithm integrates the continuous action space optimization capability of the Deep Deterministic Policy Gradient and the temporal feature extraction advantage of Long Short-Term Memory networks. Hierarchical reward mechanisms are designed to achieve multi-objective collaborative optimization, balancing power tracking, action smoothness, and system stability. A simulation model of the photovoltaic system is built on the MATLAB/Simulink platform, and experimental results demonstrate that under multi-peak shading and dynamic environmental conditions, the DDPG-LSTM algorithm stably escapes local optima with negligible oscillations near the maximum power point, achieving an average tracking efficiency exceeding 98%. The robustness and adaptability of the proposed method in dynamic environments are validated, providing theoretical support for the intelligent control of photovoltaic systems and the efficient utilization of renewable energy.
Liu Hao , Lu Jin , Li Peng , Li Chengxing
2026, 49(3):137-145.
Abstract:To address the low recognition rate of existing deep learning modulation recognition methods under low SNR conditions and their insufficient extraction and use of signal features, a model named Adaptive Wavelet and Multi-fusion Complex-value Dense Convolutional Neural Networks (AW-MCDCN) is proposed. AW-MCDCN takes both IQ and AP signals as inputs, employing dense connections to construct a deep network that comprehensively extracts temporal features from IQ signals while incorporating AP signals to form heterogeneous feature complementarity. We further improve the classical complex-valued convolutional network by proposing a novel complex-valued cross convolution network based on complex convolution principles. Additionally, to resolve the excessive parameter count of traditional complex-valued networks, we embed a learnable wavelet decomposition layer that adaptively captures multi-scale signal features while incorporating frequency-domain characteristics. Experimental results demonstrate that our model achieves 98.31% peak recognition accuracy and 64.59% average accuracy on the RML2018.01a dataset, outperforming traditional network architectures by margins of 1.65% to 18.91% and attaining state-of-the-art (SOTA) performance.
Zheng Senxiao , Guo Zhitao , Li Yibo , Yun Zhi
2026, 49(3):146-154.
Abstract:Accurate prediction of equipment remaining useful life (RUL) can optimize maintenance strategies, reduce costs, and improve overall efficiency. However, most existing methods rely on separately extracting temporal and spatial features, which hinders the effective fusion of temporal and spatial information. To address this issue, this paper proposes a dual-axis attention graph convolutional network based on multi-scale feature extraction for RUL prediction. The model first utilizes a cascaded scale-adaptive convolution module to perform multi-scale spatiotemporal feature extraction from raw sensor data, capturing spatiotemporal features across different dimensions. These features are then used to construct a spatiotemporal graph, where graph convolution operations are applied to uncover deep dependencies within the data. Finally, a dual-axis attention mechanism is designed to dynamically weight features along both the temporal and spatial dimensions, thereby enhancing the representation of critical features. In the experimental validation on the FD001 and FD004 subsets of the C-MAPSS dataset, the RMSE and Score were 11.87 and 236 for FD001, and 13.44 and 816 for FD004, respectively. The results show that this method has higher accuracy compared with other methods.
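The two metrics reported in this abstract can be computed as in the sketch below. The abstract does not define "Score"; the sketch assumes it is the asymmetric scoring function commonly used with the C-MAPSS benchmark, which penalizes late predictions (predicted RUL above true RUL) more heavily than early ones. The function names are hypothetical.

```python
import math

# Minimal sketch of RMSE and the asymmetric RUL score.
# Assumption: "Score" means the scoring function conventionally used with
# C-MAPSS (time constants 13 for early, 10 for late predictions).


def rmse(predicted, actual):
    """Root mean square error between predicted and true RUL values."""
    errors = [p - a for p, a in zip(predicted, actual)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def rul_score(predicted, actual):
    """Sum of exponential penalties; lower is better."""
    total = 0.0
    for p, a in zip(predicted, actual):
        d = p - a  # positive d means the prediction is late (too optimistic)
        if d < 0:
            total += math.exp(-d / 13.0) - 1.0
        else:
            total += math.exp(d / 10.0) - 1.0
    return total


if __name__ == "__main__":
    true_rul = [112.0, 98.0, 69.0]
    pred_rul = [105.0, 101.0, 75.0]
    print(rmse(pred_rul, true_rul), rul_score(pred_rul, true_rul))
```

Because the score is a sum rather than a mean, its magnitude depends on the number of test engines, so values are only comparable within the same subset.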
He Zhuoyue , Wang Wei , Liu Jie , Ma Fusheng
2026, 49(3):155-164.
Abstract:When WiFi fingerprint positioning relies on existing access points (APs), it suffers from uncontrollable AP quality, redundant fingerprint databases, and a heavy real-time positioning computation load. To solve these problems, a Hybrid Robust Access Point Selection method is proposed. In the offline stage, the algorithm first analyzes AP stability through a comprehensive signal stability index, then analyzes AP similarity using mutual information and the correlation coefficient, screening out stable APs with low similarity to construct a new lightweight fingerprint database. In the online stage, the log-distance path loss model is used to evaluate real-time signal quality and select high-quality APs for matching and positioning. Experimental results show that, compared with the original database, the algorithm effectively eliminates redundant APs, reduces positioning error by 57.79%, and increases the probability of a positioning error below 1.5 m from 68.75% to 93.75%.
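The log-distance path loss model named in this abstract relates received signal strength to distance as RSS(d) = RSS(d0) - 10 n log10(d / d0). The sketch below shows only the model and its inverse; how the paper turns it into a signal quality criterion is not stated in the abstract. The function names and the default reference power (-40 dBm at 1 m) and path loss exponent (3.0) are hypothetical values chosen for illustration.

```python
import math

# Minimal sketch of the log-distance path loss model and its inverse.
# Default parameter values are assumptions, not measurements from the paper.


def predicted_rss(distance_m, rss_at_ref_dbm=-40.0, path_loss_exponent=3.0,
                  ref_distance_m=1.0):
    """Expected RSS in dBm at `distance_m` from the access point."""
    return rss_at_ref_dbm - 10.0 * path_loss_exponent * math.log10(
        distance_m / ref_distance_m
    )


def estimated_distance(rss_dbm, rss_at_ref_dbm=-40.0, path_loss_exponent=3.0,
                       ref_distance_m=1.0):
    """Distance in metres implied by a measured RSS under the same model."""
    exponent = (rss_at_ref_dbm - rss_dbm) / (10.0 * path_loss_exponent)
    return ref_distance_m * 10.0 ** exponent


if __name__ == "__main__":
    for d in (1.0, 5.0, 10.0):
        print(d, predicted_rss(d))
    print(estimated_distance(-70.0))
```

One plausible use, stated here only as an assumption, is to compare a measured RSS with the model's prediction and treat APs with large deviations as low quality.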
Zhang Yanna , Duan Tongyao , Guo Yong , Zhang Chaoyang
2026, 49(3):165-174.
Abstract:Impulsive noise, characterized by its impulsive features, high intensity, and non-Gaussian properties, disrupts the peak characteristics of linear frequency modulated (LFM) signals in the fractional Fourier domain. This degrades the performance of parameter estimation algorithms based on the fractional Fourier transform (FRFT), causing significant estimation biases in non-Gaussian noise environments. To address this issue, a tensor-based parameter estimation method for LFM signals in impulsive noise environments is proposed. First, the noisy LFM signal is segmented by a sliding window along the time dimension to construct a three-dimensional tensor representation. Next, a denoising model is developed via higher-order singular value decomposition, in which core tensor components are extracted from the tensor signal by applying an energy thresholding criterion. Subsequently, an FRFT-based LFM parameter estimation model is established and solved by the dream optimization algorithm (DOA). Furthermore, the DOA optimization process is iteratively alternated with the tensor denoising procedure. Finally, the chirp rate and initial frequency are estimated by locating the peak position in the FRFT domain. Experimental results demonstrate that the tensor representation suppresses impulsive noise more effectively than the baseline FRFT method. When the stability parameter α≥0.8 and GSNR=-4 dB, the RMSE of the chirp rate estimated by the proposed method remains below 0.1, significantly outperforming the other methods compared. This validates the stronger noise resistance and superior generalization capability of the tensor representation method on both simulated and real-world data.
Bai Xianlang , Zhang Qunli , Xin Zhiqiang
2026, 49(3):175-184.
Abstract:Due to the complex and diverse textures of leather surface defects, existing detection methods often suffer from limited accuracy and elevated rates of missed and false detections. To address these challenges, this paper presents an enhanced defect detection algorithm based on YOLOv5s, incorporating small-object detection techniques and attention mechanisms. Specifically, multiple attention modules are integrated into the backbone network to guide the model's focus toward defect regions while suppressing interference from background and irrelevant features, thereby enhancing feature extraction. A weighted bidirectional feature pyramid network is introduced in the neck to strengthen feature fusion and interaction across scales. Additionally, a dedicated detection head for small objects is added to the head network to improve the localization and recognition of subtle defect features. Experimental results show that the improved method achieves a recall of 92.27% and a detection accuracy of 92.16%, improvements of 4.56% and 3.06%, respectively, over the baseline model. These enhancements reduce missed and false detections in small-object scenarios and significantly improve the model's generalization capability, contributing to more robust performance in real-world applications.
Lu Sipeng , Yang Guang , Ma Jianhui , Zhao Jiyuan , Hou Xinmeng
2026, 49(3):185-193.
Abstract:Rotor blades are highly prone to deformation because of their harsh working environment. To monitor the edge state of rotor blades, this paper proposes CACNet, a convolutional neural network for edge detection that can quickly segment rotor blade edges. High-energy X-ray images of rotor blades are noisy and strongly motion-blurred, and artifacts of the casing's internal structure caused by high-energy X-ray transmission overlap in the same part of the image, so the images to be inspected are of extremely low quality. For these low-quality images, an improved adaptive Canny operator is used to obtain coarse segmentation information, which helps the neural network learn more accurate information about the blade edge. The model adopts a multi-scale structure that fuses segmentation information at different scales, making the final result clearer and more accurate. To further improve training quality, a composite loss function is used to guide the model toward the correct information in the training images, so that the final model performs better on real images. The experimental results show that the proposed algorithm can detect rotor blade edges quickly and efficiently.
Lu Yupeng , Wang Mingquan , Li Pengbo , Wu Zhicheng , Yang Jie
2026, 49(3):194-203.
Abstract:Radial tire X-ray images exhibit complex textures and diverse defect morphologies, often relying on manual visual inspection for quality control—a process that struggles to balance high precision with real-time efficiency. To address this, a detection model based on an improved version of YOLOv8, named YOLOv8n_RSI, is proposed for detecting air bubble defects in radial tires. First, the RepNCSPELAN4 architecture is introduced to enhance feature extraction capabilities. Second, the SKAttention mechanism is integrated to adaptively select receptive field sizes, improving the model′s detection performance across multiple scales. Finally, the Inner-CIoU loss function is adopted, incorporating center point distance constraints and aspect ratio penalties to effectively enhance detection accuracy. Experimental results demonstrate that compared to the baseline YOLOv8n model, the proposed YOLOv8n_RSI achieves average improvements of 3.5% in precision, 7.0% in recall, and 8.4% in mean average precision. Furthermore, the model′s computational complexity and inference speed indicate its suitability for real-time detection requirements. Preliminary industrial applications also validate the effectiveness of this improved model.
Guo Li , Zhang Xuesong , Li Mengmeng , Jin Hua
2026, 49(3):204-212.
Abstract:To address degraded feature discriminability, difficulty in recognizing small-scale targets, and occlusion of key body parts caused by high-dynamic human motion in cluttered scenes, an improved fall detection algorithm based on YOLOv10, named ICI-YOLO, is proposed. Contextual attention aggregation replaces partial self-attention, achieving global contextual dependency and fine-grained spatial fusion representation. An iterative attentional feature fusion mechanism is incorporated to restructure the C2f module of the backbone, strengthening semantic representation of critical regions. An interactive feature fusion network that integrates an interactive convolution block and a cross-scale convolutional feature fusion module is proposed to improve multi-scale feature fusion. Experimental results demonstrate that ICI-YOLO achieves gains of 4.3% in recall and 2.2% in mAP@0.5 on the self-constructed human fall behavior detection dataset FALL, and improvements of 2.0% in precision and 1.5% in mAP@0.5:0.95 on the public dataset DiverseFALL10500. Compared with mainstream real-time detection algorithms, the proposed method exhibits superior detection performance.
Liang Hongyu , Yan Kun , Hao Hangbo
2026, 49(3):213-222.
Abstract:This paper presents a deep learning-based autonomous tennis ball retrieval robot designed to address the inefficiency of manual ball collection. The robot integrates a Raspberry Pi 5B, an STM32RCT6 microcontroller, a USB camera, and brushless DC motors. Combining a lightweight YOLOv11, an improved DBSCAN clustering-based path planning algorithm, a dual-loop PID controller, and a roller-based collection mechanism, the robot achieves efficient tennis ball recognition, optimized path planning, and autonomous retrieval. The YOLOv11 model is made lightweight using a StarNet backbone, a C3k2_Faster module, and a shared-convolution lightweight detection head, significantly reducing computational demands. Experimental results show an 80.8% reduction in parameters, a computational cost of only 1.7 GFLOPs, an mAP@0.75 of 0.9806, and a detection speed of 129.7 fps. The DBSCAN-based path planning, optimized through density clustering and a distance-weighted model, enhances the robot's adaptability and robustness in complex environments. Deployed on a Raspberry Pi, the system accurately recognizes tennis balls under varying lighting conditions, achieves a detection speed of 9 to 12 fps, and retrieves 7 to 9 balls per run, demonstrating significantly improved retrieval efficiency and promising practical applications.
Bu Penghui , Tian Longtao , Wang Hang , Yan Yatao
2026, 49(3):223-231.
Abstract:Stereo matching is a key step by which binocular stereo vision perceives scene depth. Traditional binocular stereo matching algorithms struggle to resolve matching ambiguity in weak-texture areas and under complex lighting, so a cross-scale stereo matching algorithm that exploits the texture characteristics of the scene is proposed. First, the left and right images are downsampled with a Gaussian filter to obtain image pairs at multiple scales as the algorithm's input, and the matching cost is computed for the image pair at each scale to obtain the initial cost volumes. Based on texture characteristics, the input image is divided into texture-rich and weak-texture regions, and the initial cost volume at each scale is diffused according to these regions, so that the matching cost of the texture-rich regions propagates into the weak-texture regions. An optimized guided filtering algorithm is then used to aggregate the costs at each scale. To account for multi-scale interaction between the cost volumes, the costs are fused to obtain the final cost volume. Finally, the disparity map is obtained through disparity computation and disparity post-processing. Tests on the Middlebury dataset show that, with the proposed cross-scale stereo matching algorithm combined with texture-region characteristics, the mismatching rate over all regions is reduced by 2.35% on average compared with the guided filtering algorithm and by 0.77% on average compared with the CSCA algorithm. In unoccluded regions, the mismatching rate is reduced by 2.29% on average compared with the guided filtering algorithm and by 0.65% on average compared with the CSCA algorithm. These results indicate that the proposed algorithm effectively reduces mismatching in weak-texture regions and meets the efficiency and precision requirements of stereo matching.
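The cross-scale part of this pipeline can be illustrated on a single scanline. The sketch below is a simplified stand-in, not the authors' algorithm: it uses an absolute-difference cost, replaces guided filtering with a plain box average, omits the texture-based cost diffusion entirely, and fuses a fine and a coarse cost volume with an assumed fixed `weight`. All function names and parameters are hypothetical.

```python
def downsample(row):
    """Smooth with a [1, 2, 1] / 4 kernel (a small Gaussian approximation), then keep every second sample."""
    n = len(row)
    at = lambda i: row[min(max(i, 0), n - 1)]  # clamp indices at the borders
    return [(at(i - 1) + 2 * at(i) + at(i + 1)) / 4 for i in range(0, n, 2)]

def cost_volume(left, right, max_disp, invalid=255.0):
    """cost[x][d] = |left[x] - right[x - d]|; positions with x - d < 0 get a fixed penalty."""
    return [[abs(left[x] - right[x - d]) if x - d >= 0 else invalid
             for d in range(max_disp + 1)] for x in range(len(left))]

def box_aggregate(cost, radius=2):
    """Average each disparity slice over a sliding window (stand-in for guided filtering)."""
    n = len(cost)
    out = []
    for x in range(n):
        window = [cost[min(max(x + k, 0), n - 1)] for k in range(-radius, radius + 1)]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out

def cross_scale_disparity(left, right, max_disp, weight=0.5):
    """Fuse a full-resolution and a half-resolution cost volume, then pick the winner-take-all disparity."""
    fine = box_aggregate(cost_volume(left, right, max_disp))
    coarse = box_aggregate(cost_volume(downsample(left), downsample(right), max_disp // 2))
    disparity = []
    for x in range(len(left)):
        # A fine-scale pixel x with disparity d maps to coarse pixel x // 2 with disparity d // 2.
        fused = [fine[x][d] + weight * coarse[x // 2][d // 2] for d in range(max_disp + 1)]
        disparity.append(min(range(max_disp + 1), key=fused.__getitem__))
    return disparity

# Synthetic scanline pair (made-up data): the left view sees the texture shifted by 4 pixels.
texture = [(7 * i * i + 3 * i) % 31 for i in range(68)]
left, right = texture[:64], texture[4:68]
disp = cross_scale_disparity(left, right, max_disp=8)
```

Away from the image borders, the fused cost on this synthetic pair is lowest at the true shift of 4 pixels. A real implementation would operate on 2D images, use an edge-preserving filter for aggregation, and add the texture-dependent diffusion step described in the abstract.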
2026, 49(3):232-242.
Abstract:To address the low accuracy of anomalous behavior detection caused by blurry surveillance footage and complex road conditions, this paper proposes an optimized YOLOv11 model with multi-module collaboration. First, Dynamic Sample replaces traditional upsampling in the neck network to enhance target localization and recognition precision. Second, a redesigned Multi-Window Attention module is integrated into the final layer of the backbone network, improving the capture of anomaly features in blurry videos while suppressing noise interference. Finally, the lightweight ShuffleNetV2 is adopted as the backbone, significantly reducing model parameters while preserving feature representation capability. Experimental results on the UCF101 and UCF Crime datasets demonstrate that, with the Dynamic Sample and Multi-Window Attention modules, the model improves mAP50 and mAP50-95 by 8.5% and 13.1%, respectively, compared with the original YOLOv11, effectively reducing false negatives and false positives. With ShuffleNetV2, the model's parameter count is reduced from 2.58 M to 0.82 M. Overall, the optimized YOLOv11 model better meets the demands of real-time scenarios such as traffic surveillance, balancing detection efficiency and accuracy, and has broad application potential.
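For reference, the reported drop in parameter count from 2.58 M to 0.82 M corresponds to a relative reduction of roughly 68%. This percentage is derived here from the two reported figures and is not stated in the abstract:

```latex
\frac{2.58 - 0.82}{2.58} = \frac{1.76}{2.58} \approx 0.682 \quad (\text{about } 68.2\%)
```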
2026, 49(3):243-253.
Abstract:In dense pedestrian scenes, severe occlusion, numerous small targets, significant scale variation, and complex environments often lead to missed detections, false detections, and inaccurate localization of pedestrians. To address these challenges, this paper proposes DC-YOLO, a lightweight dense pedestrian detection algorithm based on YOLO11n. In the backbone network, a lightweight feature extraction network, EfficientNetV2S-S3, is proposed to enhance the model's feature extraction capability for small and multi-scale targets while reducing model parameters and computational cost. In the neck network, the P-LightNeck module is proposed to further improve feature fusion for small targets, jointly optimizing detection accuracy and efficiency. The RepNCSPELAN4 convolutional module is introduced to strengthen feature extraction for occluded targets through multi-scale convolution and re-parameterization, while improving inference efficiency. A dynamic multi-scale collaborative attention module, DynaMSAttn, is designed to enhance the model's adaptability to targets of varying scales and to complex environments. Experimental results show that, compared with YOLO11n, DC-YOLO improves mAP@0.5 by 4.7% and mAP@0.5:0.95 by 4.5% on the CrowdHuman dataset, while reducing the parameter count by 46.2%. Comparative and ablation experiments verify that DC-YOLO exhibits excellent detection performance and robustness in dense pedestrian detection tasks.
