
Editor in chief: Prof. Sun Shenghe
Inauguration: 1980
ISSN: 1002-7300
CN: 11-2175/TN
Domestic postal code: 2-369
2026, 49(2):1-8.
Abstract:To address the inefficient and difficult visual identification and localization of car keys in complex home environments, this paper designs a real-time detection system based on the E-YOLOv8n model. The system captures video streams through a wireless USB camera, performs real-time object detection on a computing terminal running the E-YOLOv8n model, and feeds back the detection results via an audio alarm module. The E-YOLOv8n model is the core component of the system and incorporates several key improvements. First, the network structure is optimized by reconstructing the backbone network with DSConv and streamlining the P5 output to reduce computational redundancy. Second, a DSPPF module is designed to enhance multi-scale feature fusion while reducing computational cost. Third, a Coord attention module is embedded at the end of the backbone network to focus on key features through coordinate attention and suppress background interference. Finally, a lightweight detection head, the LWD module, is adopted to maintain detection accuracy while improving computational efficiency. Experiments on a self-constructed car key dataset demonstrate that, compared to the original YOLOv8n model, the E-YOLOv8n model reduces computational load, parameter count, and model size by 53.8%, 32.8% and 52.4%, respectively, while improving precision by 1.7%. These enhancements achieve significant lightweighting while boosting performance, making the model more suitable for deployment on resource-constrained devices commonly found in home environments.
Huang Jiayang , Zhao Yingliang , Han Xingcheng
2026, 49(2):9-17.
Abstract:Children with autism show abnormalities in visual attention in their early years, which provides an important distinguishing criterion for early intervention. Because existing autism research pays insufficient attention to semantic alignment and dynamic interaction between modalities, this study proposes a multimodal model that integrates features from saliency maps and eye movement trajectory data, providing an objective method to support the diagnosis of autism. The method constructs a dual-stream network architecture: a U-Net feature extractor processes the saliency map, and a temporal convolutional network models the eye movement trajectory over time. A cross-modal attention mechanism is introduced to achieve dynamic weighted fusion of the two modalities. Eye movement trajectory prediction is carried out simultaneously with the time series modeling, and the prediction error is fed into the classification process as a distinguishing feature to enhance the classification performance of the model. Comparative experiments verify that the proposed model achieves an accuracy of 98.89% in the early screening task for autism.
Tang Ruixin , Liu Wenzhong , Zhang Junjie , Li Yingchun , Zhang Qianwu
2026, 49(2):18-25.
Abstract:In a high-speed satellite data transmission system, symbol timing offsets inevitably exist between transmitters and receivers, and Doppler effects further amplify them. These offsets can be effectively corrected by clock recovery algorithms. However, existing clock recovery algorithms often suffer from performance degradation caused by a large number of parallel processing paths and from high implementation complexity, making it difficult to meet the requirements of higher symbol rates and greater timing offset tolerance in resource-constrained systems. This paper therefore proposes an optimized parallel implementation architecture based on the traditional feedforward clock recovery structure. By redesigning the architectures of the timing controller, interpolation filter, and symbol extraction module, the proposed structure enables efficient symbol timing recovery with two samples per symbol. In addition, the LEE timing error detector is enhanced to improve timing error estimation accuracy and timing frequency offset tolerance. Simulation and FPGA board-level tests demonstrate that the proposed architecture can tolerate timing frequency offsets of up to ±1 000×10⁻⁶ under QPSK modulation and maintains a stable bit error rate in long-term tests. Furthermore, when implementing a real-time receiver system with a 2.5 GBaud symbol rate, the proposed parallel structure saves about 36% of the LUT resources, more than 45% of the registers, and about 20% of the DSP resources, showing significant value in resource-constrained high-speed real-time communication systems.
Zhang Lu , Li Keli , Zhi Pengfei , You Yue
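To give a feel for the timing frequency offset tolerance quoted in the abstract above, the following short calculation converts the stated bound into a drift rate. It assumes only the usual definition of a relative clock offset $\delta$ (the symbols $\delta$ and $N_{\text{slip}}$ are introduced here for illustration and do not come from the paper).

```latex
% Relative timing frequency offset stated in the abstract
\delta = 1\,000 \times 10^{-6} = 10^{-3}

% Number of symbols after which the accumulated timing drift
% reaches one full symbol period (independent of the symbol rate)
N_{\text{slip}} = \frac{1}{|\delta|} = \frac{1}{10^{-3}} = 1\,000 \ \text{symbols}

% With two samples per symbol this corresponds to 2 000 samples
```

In other words, at the edge of the stated tolerance an uncorrected receiver would drift by a whole symbol roughly every thousand symbols, which is the drift the clock recovery loop has to track.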
2026, 49(2):26-36.
Abstract:The topology of the MMC is suitable for High Voltage Direct Current Transmission (HVDC). However, during operation, internal circulating currents are easily generated, which can lead to arm current distortion, increased system power losses, and reduced stability. In response to the limitations of the P-ROQR controller, such as its restricted control accuracy and insufficient robustness, this paper proposes a circulating current suppression strategy based on the P-ROQR+Q-FRC controller. By introducing the Q-FRC controller on the basis of the P-ROQR controller, the second and fourth harmonic circulating current components are effectively suppressed, and the implementation is simple. To verify the effectiveness of the proposed method, MMC-HVDC simulation experiments were conducted under the same conditions: during steady-state operation, DC voltage sags, DC voltage fluctuations, and three-phase unbalanced operation. The results demonstrate that the MMC equipped with the P-ROQR+Q-FRC circulating current suppressor outperforms the one using the P-ROQR suppressor in both dynamic response and circulating current suppression capability. Specifically, the fluctuation range of the circulating current is reduced by 57.14%, and the fluctuation range of the submodule capacitor voltages is decreased by 7.5%. Consequently, the P-ROQR+Q-FRC suppressor is more suitable for MMC converters.
Zhang Yuan , Fan Chunling , Zhang Chuntang
2026, 49(2):37-44.
Abstract:When traditional sliding mode control is applied to three-phase voltage-source PWM rectifiers, its inherent discontinuous switching characteristics cause high-frequency oscillations on the DC bus voltage, making it difficult to achieve ideal control performance. This paper therefore proposes a dual-loop control strategy that integrates outer-loop Fast Terminal Dynamic Sliding Mode Control with inner-loop feedforward-decoupled PI control. First, the inner loop employs feedforward decoupling-based PI control to eliminate the coupling terms in the mathematical model and achieve precise tracking of the current waveform. Second, the outer loop transfers the switching term to a higher-order differential element and introduces a fast terminal sliding surface design to suppress system chattering, enhance dynamic performance, and ultimately achieve accurate and stable tracking of the reference voltage. The stability of the controller is proved by Lyapunov stability theory, and the strategy is validated through Simulink simulations. The results demonstrate that the proposed strategy effectively suppresses the inherent chattering phenomenon in sliding mode control. During system startup, the DC voltage overshoot is merely 6.5 V, with the peak voltage deviation reduced by over 92% and the settling time shortened by 91% compared to PI control. Voltage step-response tracking completes within 0.02 s, while the transient voltage error remains bounded within ±1% under ±50% rated load disturbances. This control method ensures high steady-state accuracy while maintaining excellent dynamic response performance.
Wang Rong , Li Yunxiang , Zhang Shuang , Tian Wen , Shi Huaifeng
2026, 49(2):45-56.
Abstract:To address the issue of inaccurate performance evaluation caused by approximating the communication channel model through line-of-sight links in existing covert communication methods, and to better adapt to real-world communication channel conditions, a novel covert communication model assisted by an unmanned aerial vehicle (UAV) equipped with a multifunctional reconfigurable intelligent surface is proposed. Specifically, based on a hybrid link of line-of-sight and non-line-of-sight, the model utilizes the multifunctional reconfigurable intelligent surface as a flexible relay and optimizes the UAV's flying location to enhance the signal-to-noise ratio and effective throughput. The hybrid channel model expression is first derived. Then, the covert constraint is transformed from an incomplete gamma function to a constraint related to relative entropy, and the UAV's transmission power and block length are correspondingly optimized. Finally, by analyzing the functional characteristics of SNR and effective throughput under the relative entropy constraints, the optimal flying location for the UAV is obtained. Compared with non-line-of-sight link channels, the proposed model improved the SNR by 42% at the optimal flight position. Extensive numerical results indicate that the proposed model outperforms existing works in enhancing SNR and effective throughput.
Zhang Junhong , Qu He , Pan Jingtao , Yang Song , Li Lingyu
2026, 49(2):57-64.
Abstract:In the inspection of spiral-welded pipelines, conventional methods often struggle to balance the extraction of temporal and spatial features while maintaining efficient model parameter optimization. To address these challenges, this study proposes a dynamic composite optimization detection model based on deep learning. Ultrasonic guided wave signals are acquired through sensors; spatial features are extracted using a convolutional neural network, and temporal dependencies are modeled via a long short-term memory network. To enhance model robustness, the whale optimization algorithm is employed to optimize four critical hyperparameters: the number of CNN filters, the number of LSTM units, the learning rate, and the Dropout rate. Comparative experiments were conducted on high-noise, low-noise and normal datasets. The results show that the proposed detection model reaches accuracy rates of 98.88%, 99.7% and 100%, respectively, and that the mean absolute errors decrease to 0.195 5, 0.177 and 0.095, respectively. These results verify the model's detection performance in complex environments with high noise and multiple interference sources, and provide a theoretical basis for ultrasonic guided wave inspection of spiral-welded pipelines.
2026, 49(2):65-78.
Abstract:To address the missed and false detections of YOLOv8 in identifying foreign object debris (FOD) on airport runways, which are caused by small object sizes, random spatial distribution, and significant scale variations among debris, this paper proposes AMMS-YOLOv8, an enhanced model incorporating prior masks specifically for small targets. In the backbone network, an isotropic edge detection operator is introduced to construct EIEStim, strengthening the model's perception and preprocessing capabilities for subtle edges. Simultaneously, downsampling is replaced by an improved receptive field attention mechanism applied to the detection domain, forming LDFDS to enhance spatial awareness and preserve minute semantic information. Subsequently, the Neck layer is restructured to enable multi-scale feature aggregation, developing CCFPN to improve semantic perception of multi-scale debris. Finally, prior FOD mask features are embedded into the detection head and concatenated with deep features to create MSN-Head, thereby amplifying spatial perception. The model's detection capability was validated using a self-built complex-scenario FOD dataset. On this dataset, AMMS-YOLOv8 achieved improvements of 1.8% and 1.7% in mAP50 and mAP50-95, respectively, with precision, recall, and F1-score reaching 0.971, 0.976, and 0.973, marking significant enhancements over the baseline network. Experimental results confirm the efficacy of these improvements. Furthermore, robustness and generalizability were evaluated through comparative experiments using a hybrid dataset (combining complex-scenario FOD data with FOD-A) and a complex transmission line FOD dataset, demonstrating performance gains across all metrics.
Hao Zhenxing , Li Wei , Yang Rui , Xu Yaowei , Hao Yongxing
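As a quick consistency check on the metrics reported in the abstract above, the F1-score can be recomputed from the stated precision and recall using the standard harmonic-mean definition. The symbols $P$, $R$ and $F_1$ are the usual textbook notation and are not taken from the paper.

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.971 \times 0.976}{0.971 + 0.976}
    = \frac{1.895392}{1.947}
    \approx 0.973
```

The result agrees with the reported F1-score of 0.973.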
2026, 49(2):79-88.
Abstract:To improve the real-time performance, state estimation accuracy, and degree of intelligence of vehicle power battery pack monitoring systems, a monitoring system architecture for vehicle power battery packs based on cloud-edge collaboration is proposed. By deeply integrating the real-time response of edge computing with cloud big data analysis capabilities, a multi-level collaborative monitoring system is built. The system uses STM32 series chips to build a high-precision hardware architecture that integrates data acquisition, equalization control, insulation detection, and 5G networking and location modules. A dynamically compensated open circuit voltage and ampere-hour integration algorithm is proposed, incorporating multi-parameter correction mechanisms including temperature and cycle count, and achieving an SOC estimation error within ±1.2%. The experimental results demonstrate that the system achieves dynamic accuracies of ±0.11% for voltage acquisition and ±0.4% for current acquisition, with cell voltage variance reduced by 99.1% after equalization. The key indicators exceed the national standard requirements. This research provides a high-precision, low-latency, and scalable cloud-edge collaborative solution for vehicle battery safety management, and the system has engineering application value.
Chen Qidong , Xu Guoliang , Deng Ruixiang , Wu Hao
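The open circuit voltage and ampere-hour integration idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the multiplicative temperature and aging factors, and the fixed blending weight are all hypothetical choices made here only to show the general structure of coulomb counting with an OCV-based correction.

```python
def effective_capacity_ah(nominal_ah, temp_factor=1.0, aging_factor=1.0):
    """Usable capacity after hypothetical temperature and cycle-count derating."""
    return nominal_ah * temp_factor * aging_factor


def ah_integration_step(soc, current_a, dt_s, capacity_ah):
    """One ampere-hour integration step (positive current means discharge).

    SOC is a fraction in [0, 1]; the result is clamped to that range.
    """
    delta = current_a * dt_s / (3600.0 * capacity_ah)
    return min(1.0, max(0.0, soc - delta))


def correct_with_ocv(soc_ah, soc_from_ocv, weight):
    """Blend the integrated SOC with an OCV-derived SOC estimate.

    weight = 0 trusts integration only, weight = 1 trusts the OCV lookup only.
    The OCV-to-SOC lookup itself is assumed to be done elsewhere.
    """
    return (1.0 - weight) * soc_ah + weight * soc_from_ocv


if __name__ == "__main__":
    cap = effective_capacity_ah(10.0)            # 10 Ah cell, no derating
    soc = ah_integration_step(1.0, 10.0, 360.0, cap)  # 10 A for 6 minutes
    print(round(soc, 3))                         # integrated SOC
    print(round(correct_with_ocv(soc, 0.8, 0.5), 3))  # blended with OCV estimate
```

In practice the correction weight would typically depend on how long the cell has rested, since the OCV reading is only reliable near equilibrium; the constant weight here is purely illustrative.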
2026, 49(2):89-98.
Abstract:The size of the dataset is one of the key factors affecting the performance of deep learning models, yet the amount of data required to achieve a specific accuracy is usually difficult to estimate. This problem also exists in the intelligent design of metamaterials and has become an important factor restricting the accuracy and efficiency of modeling. To this end, a dynamic data generation and model performance evaluation framework is proposed to monitor dataset size and model performance dynamically. To improve the efficiency of dynamic model evaluation and alleviate catastrophic forgetting, a continuous learning strategy is designed so that the model only needs to learn new data during the dynamic evaluation process while retaining existing knowledge. Experimental results show that the model trained with this continuous learning strategy reaches an average prediction accuracy of 93.28% with an average forgetting rate of 3.68%, which verifies the effectiveness of the strategy in alleviating catastrophic forgetting.
Zhang Hang , Liu Fangzi , Luo Wanting , Li Qi , Wang Ze
2026, 49(2):99-106.
Abstract:This study proposes a method for estimating obstructive sleep apnoea (OSA) severity by integrating sleep structure and individual prior knowledge. The approach aims to overcome limitations in current OSA assessment, particularly in the direct quantification of the apnoea-hypopnoea index (AHI) and the integration of multi-source information. The proposed method first integrates multidimensional features derived from all-night nasal flow, thoracic/abdominal movements, and oxygen saturation signals. It then incorporates sleep structure parameters together with clinical a priori knowledge. Subsequently, a gradient boosted regression model predicts the AHI using these multi-source features. Validation on the MESA dataset demonstrated the model's performance, achieving an R² of 0.695, an MAE of 7.46 events/h, and an RMSE of 10.57 events/h. The proposed method outperformed multiple baseline models; specifically, its R² score showed a relative improvement of 12.46% compared to the next-best model, Random Forest. These results significantly surpassed those of conventional assessment methods. Feature importance analysis highlighted that parameters such as the oxygen desaturation index, N1 sleep stage percentage, and BMI were key contributors to AHI prediction. These findings indicate that the proposed method offers an effective tool for the direct, quantitative assessment of OSA severity. Furthermore, it provides a more accurate, continuous quantitative index to support clinical diagnosis and decision-making.
Guo Haoyu , Jia Cili , Li Yongping , Pan Chencheng , Shen Long
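For readers less familiar with the regression metrics quoted in the abstract above (R², MAE and RMSE), the following sketch shows their standard definitions in plain Python. The function names and the toy data are illustrative only and have no connection to the MESA dataset or the authors' code.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy AHI-like values in events/h, made up for illustration.
    truth = [1.0, 2.0, 3.0]
    pred = [1.0, 2.0, 4.0]
    print(mae(truth, pred), rmse(truth, pred), r2_score(truth, pred))
```

MAE and RMSE carry the unit of the target (events/h for AHI), while R² is dimensionless, which is why the abstract reports the first two with units and the third without.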
2026, 49(2):107-116.
Abstract:Aiming at the problem of inaccurate attitude and position estimation of quadrotor unmanned aerial vehicles (UAVs) in signal interference environments, a multi-sensor data fusion method based on adaptive extended Kalman filter (AEKF) is proposed. This method fuses GPS and IMU data and adjusts the noise covariance matrix in real time to improve the stability and robustness of state estimation. By establishing the UAV dynamics model and sensor observation model, the AEKF algorithm process is derived, and a simulation system is built on the MATLAB platform. Under different GPS signal interference conditions, the estimation errors and convergence speeds of EKF, UKF, and AEKF algorithms are compared. The results show that within the 10-second interference period of GPS loss, the position root mean square error (RMSE) of AEKF is reduced by 29.8% compared to EKF (from 0.57 m to 0.40 m) and by 20% compared to UKF (from 0.50 m to 0.40 m), verifying the advantages of AEKF in anti-interference ability and error convergence. This research provides technical support for the precise positioning and stable control of UAVs in complex low-altitude airspace.
Jiao Wenbo , Zhang Xiangfeng , Jiang Hong , Han Wenxu , Gao Bo
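The relative RMSE reductions quoted in the abstract above follow directly from the stated absolute values. The short calculation below reproduces them; the symbol $\Delta$ is notation introduced here for the relative reduction.

```latex
% AEKF versus EKF
\Delta_{\text{EKF}} = \frac{0.57 - 0.40}{0.57} = \frac{0.17}{0.57} \approx 0.298 = 29.8\%

% AEKF versus UKF
\Delta_{\text{UKF}} = \frac{0.50 - 0.40}{0.50} = \frac{0.10}{0.50} = 0.20 = 20\%
```

Both figures match the percentages reported in the abstract.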
2026, 49(2):117-127.
Abstract:To address the problems of low search efficiency, a tendency to fall into local optima, and too many redundant nodes in the path planning of mobile robots in complex obstacle environments, this paper proposes a path planning method based on the fusion of the genetic algorithm and the particle swarm optimization algorithm. First, the improved genetic algorithm is used to generate a high-quality initial path population, which provides a priori search guidance for the subsequent particle swarm optimization, increases the diversity of the population, and accelerates the convergence of the algorithm. Second, a dual strategy based on the change of fitness and the iteration progress is proposed to dynamically adjust the crossover probability, and a nonlinear, dynamically diminishing inertia weight adjustment method is proposed, so as to efficiently balance the algorithm's global and local search. Next, a vector cross product-based path planning method is proposed to address low search efficiency in the path planning process. Then, a geometric redundant-node discrimination criterion based on the vector cross product and an obstacle safety distance threshold discrimination method are proposed to effectively remove the redundant nodes and transition nodes in the path, so as to shorten the path length and improve the optimization ability of the path. Finally, simulation experiments are carried out on five benchmark test functions and in two different grid map environments to verify the optimization performance of the algorithm. The experimental results show that, compared with the genetic algorithm, particle swarm optimization algorithm, differential evolution algorithm, grey wolf optimizer, sparrow search algorithm, dung beetle optimizer and crested porcupine optimizer, the proposed algorithm reduces the path length by an average of 3.74% and the runtime by an average of 23.13% on a 20×20 grid map, and reduces the path length by an average of 4.83% and the runtime by an average of 19.95% on a 30×30 grid map. In addition, the number of path nodes planned by the proposed algorithm is relatively small, indicating that the algorithm can not only shorten the path length and reduce the running time, but also simplify the path, showing good optimization ability.
Du Junnan , Yang Wen , Liu Zhilong , Wang Cheng , Wang Tianyi
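The geometric idea behind the cross product-based redundant node criterion in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' method: it only removes interior nodes that are collinear with their neighbours, and it omits the obstacle safety distance check that the paper combines with it. The function names and tolerance are hypothetical.

```python
def cross(o, a, b):
    """2D cross product of vectors (a - o) and (b - o).

    The value is zero when o, a and b lie on one straight line.
    """
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def remove_collinear_nodes(path, tol=1e-9):
    """Drop interior nodes that are collinear with the last kept node and the next node."""
    if len(path) < 3:
        return list(path)
    pruned = [path[0]]
    for i in range(1, len(path) - 1):
        prev, cur, nxt = pruned[-1], path[i], path[i + 1]
        if abs(cross(prev, cur, nxt)) > tol:
            pruned.append(cur)  # a genuine turning point, keep it
    pruned.append(path[-1])
    return pruned


if __name__ == "__main__":
    grid_path = [(0, 0), (1, 1), (2, 2), (2, 3), (2, 4)]
    print(remove_collinear_nodes(grid_path))
```

On a grid map, most waypoints produced by a cell-by-cell planner lie on straight runs, so a test of this kind already removes many nodes before any obstacle-aware shortcutting is applied.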
2026, 49(2):128-137.
Abstract:To address the inadequate accuracy, complex model architecture, and poor generalization of existing object detection networks when detecting standardized work uniforms in industrial scenarios, a novel high-precision lightweight model named SGAD-YOLO, based on YOLO11, is proposed. First, the C3k2 module is improved by combining the StripBlock structure and the CGLU mechanism. Through multi-level feature processing and dynamic feature enhancement, the model's perception of slender features and complex textures is improved, while the model's parameters and computational complexity are reduced. Second, the AFGCAttention mechanism is introduced to enhance the model's focus on key regions and effectively suppress background noise interference through the dynamic fusion of global context information and local features. Finally, the Detect-SEAM detection head is redesigned to improve the model's detection accuracy for occluded and small objects in complex environments. Experimental results demonstrate that the improved algorithm achieves mAP@0.5 values of 93.6% and 94.6% on the power grid field operation dataset and the public Roboflow 5 dataset, respectively, representing improvements of 1.5% and 2.1% over the baseline model. Moreover, its parameters and computational complexity are reduced by 8.3% and 7.4%, respectively. This shows that the SGAD-YOLO algorithm has better detection performance for standardized work uniform detection tasks in industrial scenarios.
Sun Jiachen , Pang Cunsuo , Ren Ziran , Yang Zhiliang , An Jianping
2026, 49(2):138-146.
Abstract:In recent years, UAV technology has been widely used in many fields. Radar detection is widely adopted because of its long range, high-precision positioning and rapid response, and research on the micro-Doppler characteristics of UAVs has attracted much attention. However, UAV echo signals are susceptible to interference in complex environments, which distorts their time-frequency characteristics, and traditional time-frequency analysis methods have limitations in dealing with such problems. Therefore, this paper proposes a deep learning-based time-frequency curve reconstruction algorithm for UAVs. An autoencoder model based on a convolutional neural network, SelfNet, is designed to extract effective information from signals affected by noise interference and channel distortion and to reconstruct a high-quality time-frequency curve. SelfNet uses the encoder to extract the characteristics of the time-frequency curve and restores the signal structure through the decoder. The experimental results show that the average PSNR is 17.767 2 and the average SSIM is 0.431 7, which is better than classical convolutional neural networks such as GoogLeNet and ResNet. Its generalization ability is verified by small-sample experiments and transfer learning, providing an approach to UAV time-frequency curve reconstruction in complex environments.
Liu Xinran , Zhang Yi , Wei Haifeng
2026, 49(2):147-156.
Abstract:To address the current harmonics and torque pulsation caused by the inverter's nonlinear factors, which reduce the MTPA control accuracy of IPMSM, this paper proposes a collaborative control strategy that combines virtual DC signal injection MTPA with a proportional-integral-resonant (PIR) controller. First, the virtual DC signal injection method injects a virtual DC signal into the d-q axis feedback current and simultaneously calculates the virtual power response using voltage information to achieve precise tracking of the optimal current vector angle, thereby avoiding current and torque pulsation. Second, a quasi-resonant controller is introduced and combined with a traditional current loop PI controller to form a PIR composite controller. By leveraging its high gain at specific harmonic frequencies (such as the 5th and 7th), it compensates for the low-frequency harmonics introduced by inverter nonlinearity. Additionally, the PIR controller further improves the quality of the voltage and current waveforms, enhancing the precision of MTPA control. Together, these two components form a dual closed loop of harmonic suppression and efficiency optimization. The experimental results show that the proposed collaborative control strategy can effectively suppress the 5th and 7th harmonics and reduce the total harmonic distortion of the current under different speed scenarios; under the same load torque conditions, compared to traditional MTPA control and virtual DC signal injection MTPA control, the proposed strategy requires lower current values and achieves higher MTPA control accuracy.
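For orientation, a PI controller combined with a quasi-resonant term is commonly written in the textbook form below. This is a generic sketch rather than the controller from the paper: the gains $K_p$, $K_i$, $K_r$, the cutoff bandwidth $\omega_c$ and the resonant frequency $\omega_h$ are symbols assumed here for illustration.

```latex
G_{\text{PIR}}(s) = K_p + \frac{K_i}{s} + \frac{2 K_r \omega_c s}{s^2 + 2 \omega_c s + \omega_h^2}

% In the synchronous d-q frame, the 5th (negative-sequence) and 7th
% (positive-sequence) stator current harmonics both appear at six times
% the electrical frequency \omega_e, so a single resonant term tuned to
\omega_h = 6\,\omega_e
% can act on both of them.
```

The resonant term provides high gain in a narrow band around $\omega_h$, while $\omega_c$ widens that band slightly so that the controller remains effective when the motor speed, and hence $\omega_e$, varies.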
2026, 49(2):157-168.
Abstract:To address the issues of low search efficiency, multiple path waypoints, and insufficient environmental adaptability in traditional RRT algorithms for epidemic prevention robot path planning, an improved RRT path planning algorithm based on Voronoi skeleton graphs is proposed. This algorithm constructs an offline skeleton graph from the map using a generalized Voronoi diagram and employs the empty circumcircle property of the Delaunay triangulation for local real-time updates, ensuring the skeleton graph's timeliness in unknown environments. Based on the skeleton graph, an initial heuristic path is quickly obtained, and key path nodes are generated as sub-goals for the RRT algorithm. Elliptical constraints and an attractive field bias are introduced between sub-goal nodes to accelerate sampling and reduce planning time. Finally, an adaptive multi-segment pruning strategy based on a double-pointer technique is designed to smooth the path. Simulation results demonstrate that compared to existing improved algorithms, the proposed method reduces the average number of sampled nodes by 55.57%, shortens the average path length by 6.45%, and decreases the average planning time by 51.44% in complex scenarios, effectively reducing planning overhead and enhancing path planning efficiency.
Wei Wenqing , Wei Kun , Zhang Jianhui
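The elliptical sampling constraint between sub-goals mentioned in the abstract above can be illustrated with a small sketch. It assumes the two sub-goal nodes act as the foci of the ellipse and uses simple rejection sampling; the function names, the `major_axis_len` parameter and the sampling scheme are assumptions made here, and the paper's attractive field bias is not modelled.

```python
import math
import random


def in_ellipse(p, f1, f2, major_axis_len):
    """True if p lies inside the ellipse with foci f1, f2 and the given major axis length."""
    return math.dist(p, f1) + math.dist(p, f2) <= major_axis_len


def sample_in_ellipse(f1, f2, major_axis_len, rng, max_tries=10000):
    """Rejection-sample a point inside the ellipse defined by two sub-goal foci."""
    if major_axis_len <= math.dist(f1, f2):
        raise ValueError("major axis must be longer than the distance between foci")
    cx, cy = (f1[0] + f2[0]) / 2.0, (f1[1] + f2[1]) / 2.0
    half = major_axis_len / 2.0  # every ellipse point is within this distance of the centre
    for _ in range(max_tries):
        p = (cx + rng.uniform(-half, half), cy + rng.uniform(-half, half))
        if in_ellipse(p, f1, f2, major_axis_len):
            return p
    raise RuntimeError("no sample found; ellipse may be too thin")


if __name__ == "__main__":
    rng = random.Random(0)
    point = sample_in_ellipse((0.0, 0.0), (4.0, 0.0), 6.0, rng)
    print(in_ellipse(point, (0.0, 0.0), (4.0, 0.0), 6.0))
```

Restricting samples to such a region discards points that could only lie on paths longer than the chosen major axis, which is one plausible reason the constraint reduces the number of sampled nodes.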
2026, 49(2):169-180.
Abstract:To address the challenges of path planning for intelligent agents in complex environments, particularly issues such as slow algorithm convergence, high path redundancy, and insufficient smoothness, this paper proposes a target-biased bidirectional RRT* algorithm based on KD-tree (KDB-RRT*). The algorithm introduces a bidirectional search strategy based on RRT*, incorporates a KD-tree structure to accelerate node lookup, constructs a target-biased dynamic circular sampling strategy to balance search efficiency, designs a bidirectional growth guidance model based on gravitational fields, implements adaptive step-size adjustment using the Sigmoid function combined with obstacle density, and employs the DP algorithm for original path pruning and cubic B-spline curves for path smoothing. The feasibility of KDB-RRT* is verified in “Z-shaped” and “loop-shaped” simulation environments, and comparative experiments are conducted with RRT*, Bi-RRT, and Improved RRT* algorithms in various complex map environments. Finally, path planning experiments are performed on a ROS robot. In the “Z-shaped” and “loop-shaped” simulation environments, compared with the RRT* algorithm, KDB-RRT* reduces the average planning time by 70.2% and 28.0%, decreases the average path length by 4.8% and 10.4%, and increases the node utilization rate by 16.27% and 13.58%, respectively. The results show that the KDB-RRT* algorithm provides a new method for efficient path planning in unstructured environments, and its dynamic sampling model and path optimization framework have important reference value for mobile robot navigation systems.
Zhang Yajun , Miao Haoyuan , Ma Wei , Ma Chong
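One way to picture the Sigmoid-based adaptive step size described in the abstract above is a step length that shrinks smoothly as local obstacle density grows. The mapping below is a hypothetical sketch under that assumption: the parameter names, default values, and the exact functional form are invented for illustration and are not taken from the paper.

```python
import math


def adaptive_step(density, step_min=0.5, step_max=2.0, steepness=10.0, midpoint=0.5):
    """Map a local obstacle density in [0, 1] to an RRT extension step length.

    Sparse regions (low density) get a step close to step_max,
    cluttered regions (high density) get a step close to step_min.
    """
    s = 1.0 / (1.0 + math.exp(-steepness * (density - midpoint)))
    return step_max - (step_max - step_min) * s


if __name__ == "__main__":
    for d in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(d, round(adaptive_step(d), 3))
```

A smooth mapping of this kind avoids abrupt changes in step length at region boundaries, which a hard density threshold would produce.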
2026, 49(2):181-191.
Abstract:In road damage detection tasks using UAV aerial images, existing algorithms face challenges including high computational complexity, false negatives, and false positives in complex backgrounds. To address these problems, we propose a lightweight road damage detection model, DFS-YOLO. First, we introduce the C2f-DWR module, which employs a parallel structure with dilated convolutions of multiple dilation rates to expand the model's receptive field and enhance the utilization of high-level semantic information. Second, we design a lightweight Faster Hierarchical Scale-based Feature Pyramid Network (FHSFPN) to reduce model complexity while improving feature fusion. Finally, we introduce the ShapeIoU loss function, which focuses on the shape and scale of road damage to improve the model's robustness. Experimental results demonstrate that DFS-YOLO outperforms YOLOv8s, achieving a 4.6% and 2.1% improvement in mAP50 on the China Drone and UAPD datasets, respectively. Additionally, the model reduces the number of parameters and computational complexity by 39.1% and 20.4%, respectively, achieving a good balance between lightweight design and accuracy. These results highlight its significant potential for practical applications.
Xiao Feng , Yang Wenhao , Zhang Wenjuan , Huang Shujuan , Zhou Yujie
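To see why dilated convolutions with several dilation rates enlarge the receptive field, as described in the abstract above, it helps to recall the standard formula for the effective kernel size of a dilated convolution. The kernel size $k = 3$ and the dilation rates used below are illustrative values chosen here, not settings confirmed by the paper.

```latex
k_{\text{eff}} = k + (k - 1)(d - 1)

% Illustrative values for a 3x3 kernel:
d = 1:\quad k_{\text{eff}} = 3 + 2 \cdot 0 = 3
d = 3:\quad k_{\text{eff}} = 3 + 2 \cdot 2 = 7
d = 5:\quad k_{\text{eff}} = 3 + 2 \cdot 4 = 11
```

Each branch keeps the same nine weights per channel, so parallel branches with different rates cover several spatial extents without increasing the parameter count per branch.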
2026, 49(2):192-202.
Abstract:Targets in remote sensing images often have elongated, zigzagging, or otherwise complex shapes, and are accompanied by large scale variations and strong background interference. As a result, existing detection methods are prone to missed and false detections and struggle to meet the demand for high-precision detection. To address this, an improved remote sensing image target detection algorithm, TriD-DETR, is proposed. First, a DKFE feature extraction module is designed by dynamically adjusting the shape of the convolutional kernel and optimizing the channel adaptation and residual connection methods; it adaptively focuses on elongated and zigzagging local regions and thus accurately captures target features. Second, to improve the model's ability to locate and identify complex targets, a DATE intra-scale feature interaction structure is proposed, which introduces a deformable attention mechanism into a reconfigured Transformer encoder and enhances the model's ability to capture high-level features and deep semantic information. Finally, for multi-scale feature fusion, a DBFB diverse branch fusion block is proposed, which enriches the feature space by combining diverse branches of different scales and complexity, thus enhancing the expressive ability of the model. The experimental results show that TriD-DETR achieves 86.8% and 94.1% mAP on the DIOR and RSOD datasets, respectively, which are 1.2% and 2.3% higher than the original model RT-DETR-R18, demonstrating the reliability and efficiency of the algorithm.
Gong Yujie , Wang Xu , Guo Haijie , Ding Zhixing , Cui Xuehong
2026, 49(2):203-211.
Abstract:To address the technical challenges of dense target adhesion and occlusion-prone small objects in industrial pelletized ore image segmentation, this study proposes an instance segmentation method (YO-SAM2) integrating YOLOv11 and SAM2. First, the CSC module is introduced to improve the C3k2 module in YOLOv11, enhancing the network's capability to represent features of densely clustered small targets. Second, a Small-Target Hybrid Fusion Feature Pyramid Network (SHFPN) is designed to augment feature map outputs at the P2 layer for fine-grained detail capture, incorporating cross-layer interactions and a content-guided attention mechanism to optimize multi-scale feature fusion. Third, a Decoupled Spatial-Channel Upsampling module (DSCU) is proposed to replace conventional upsampling, generating more discriminative feature representations. Finally, parameter-efficient fine-tuning of the SAM2 segmentation model is achieved via a learnable Adapter, significantly improving adaptability and generalization in industrial scenarios. Experimental results demonstrate that YO-SAM2 achieves a state-of-the-art mIoU of 90.3% on the pelletized ore dataset, outperforming mainstream segmentation algorithms such as Mask R-CNN and YOLOv8-seg. This method effectively addresses the challenges of accuracy and robustness in industrial pellet segmentation, offering a reliable technical solution for intelligent industrial quality inspection.
Meng Xiangyuan , Wu Xinyue , Zhang Yanhao , Gao Runze , Shan Huilin
2026, 49(2):212-220.
Abstract:Mars is an adjacent planet with profound connections to the Earth in cosmic evolution. Semantic segmentation of Martian surface geomorphological features not only facilitates the construction of a cognitive framework for understanding the dynamic formation and evolutionary mechanisms at the planetary scale but also establishes a multifaceted research paradigm in the field of planetary science. This is particularly significant for refining the theoretical framework of planetary evolution and validating astrophysical models, and therefore has critical scientific value. However, the analysis of Martian surface imagery faces several technical challenges: complex and variable lighting conditions, weakly structured topographical features, and pronounced heterogeneity in target scale distribution. Collectively, these characteristics form key technological bottlenecks in the intelligent interpretation of planetary surfaces. To address these issues, this paper proposes a Mars surface image segmentation algorithm based on a dynamic ternary attention mechanism. The approach jointly optimizes adaptive feature fusion and dynamic attention mechanisms to enhance segmentation accuracy. First, a dynamic ternary attention module is developed that automatically adjusts branch significance weights, enabling dynamic focus on local and global features for typical Martian landforms such as rocks and dunes. Second, an adaptive bidirectional feature fusion module is designed to reconcile spatial and semantic information conflicts across scales. Moreover, a channel-attentive separable convolution is proposed to reduce parameter complexity while enhancing model generalization capabilities. Experimental results demonstrate that the proposed algorithm achieves 89.06% accuracy and 72.33% mean intersection over union on the S5Mars dataset, effectively extracting and integrating multi-scale features to significantly enhance segmentation precision for Martian surface imagery.
Li Dou , Wang Jingyu , Ren Guoyin , Chu Jiaxing
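For readers unfamiliar with the two metrics quoted in the abstract above (accuracy and mean intersection over union), the following is a minimal sketch of how they are commonly computed from a confusion matrix. It is not taken from the paper: the function names, the toy labels, and the convention of skipping classes that never occur are assumptions made for illustration only.

```python
# Illustrative only: standard pixel accuracy and mean IoU from a confusion matrix.
# All names and the toy data below are hypothetical, not from the paper.

def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def pixel_accuracy(cm):
    """Fraction of pixels on the diagonal (correctly classified)."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

def mean_iou(cm):
    """Average of TP / (TP + FP + FN) over classes that appear at least once."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true c, predicted class differs
        union = tp + fp + fn
        if union > 0:  # assumption: classes absent from both labels are skipped
            ious.append(tp / union)
    return sum(ious) / len(ious)

# Toy flattened label maps with three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
print(pixel_accuracy(cm), mean_iou(cm))
```

Mean IoU penalizes both false positives and false negatives for each class, which is why it is typically lower than pixel accuracy on the same predictions.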
2026, 49(2):221-229.
Abstract:Bottom argon blowing in the ladle is a critical step in steelmaking, where the exposed surface area of molten steel (argon flower) serves as an important indicator for evaluating the blowing efficiency. To achieve quantitative analysis of the argon flower, image segmentation techniques are employed. However, existing segmentation networks generally suffer from large parameter sizes, high computational resource requirements, and limited segmentation accuracy, making them unable to meet the real-time and efficiency demands of industrial production. This paper proposes an argon flower segmentation network named ArgusFusion. The network adopts a U-shaped architecture and integrates convolutional modules with a novel Global Multi-Layer Perceptron-based Attention (Glo-MLP attention) mechanism during the feature extraction and reconstruction stages to facilitate efficient information exchange. In the bottleneck layer, an improved Multi-Scale Channel Attention Mixer (MACA-Mixer) is introduced to enhance feature representation. Additionally, an Adaptive Hierarchical Feature Fusion (AHFF) module is incorporated into the skip connections to optimize boundary segmentation. Experimental results on an industrial argon flower dataset demonstrate that ArgusFusion achieves an IoU of 88.90% with only 0.51 M parameters and 1.38 GFLOPs, combining high segmentation accuracy with low computational cost and meeting the requirements of real-time industrial applications.
Zhang Shang , Zhu Shuai , Zhang Yue
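The abstract above treats the exposed area of molten steel as the quantity of interest and reports IoU as the segmentation score. As a rough sketch of how both could be read off a binary mask (1 for argon flower pixels, 0 for background), consider the following. The function names and the tiny masks are hypothetical, and the paper's own evaluation pipeline may differ.

```python
# Illustrative only: exposed-area fraction and binary IoU from 0/1 masks.
# Function names and masks are hypothetical, not from the paper.

def area_fraction(mask):
    """Share of pixels labelled 1 (exposed molten steel) in the mask."""
    total = sum(len(row) for row in mask)
    exposed = sum(sum(row) for row in mask)
    return exposed / total

def binary_iou(pred, target):
    """Intersection over union of the foreground regions of two masks."""
    inter = union = 0
    for prow, trow in zip(pred, target):
        for p, t in zip(prow, trow):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks are treated as a perfect match.
    return inter / union if union else 1.0

target = [[0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
pred = [[0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0]]

print(area_fraction(target), area_fraction(pred), binary_iou(pred, target))
```

In this toy case the predicted and true area fractions coincide even though the masks differ, which illustrates why an overlap measure such as IoU is reported rather than area alone.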
2026, 49(2):230-241.
Abstract:To address issues such as low detection accuracy, high model complexity, and insufficient attention to defect boundary information in steering knuckle surface defect detection, this paper proposes GSG-DETR, an improved steering knuckle surface defect detection algorithm based on RT-DETR. First, a multi-scale edge information transfer module, GLOFT, is designed to improve the backbone network by enhancing the capture and transfer of edge information, thus increasing the model's sensitivity to defect boundaries. Next, a selective edge information aggregation module, SBA, is introduced into the neck network, constructing an adaptive fusion mechanism between low-resolution boundary information and deep semantic features and optimizing the alignment strategy of multi-scale defect boundary features. Finally, a GroupNorm-based structured pruning method is employed to eliminate redundant coupled layers, reducing the model's parameter count and computational complexity. Experimental results demonstrate that GSG-DETR achieves an mAP50 of 88.2% in the steering knuckle crack detection task, a 2.0% improvement over the baseline model, with a 34.3% reduction in parameters and a 32.1% reduction in computational complexity, while the frame rate increases to 105.1 FPS. Further validation on the NEU-DET dataset shows that the improved algorithm yields a 4.3% increase in mAP50 compared to the baseline model. In summary, GSG-DETR improves detection accuracy while being better suited to practical deployment.
Zhang Li , Peng Yan , Han Yuanyuan , Chen Xin , Wu Juan , Li Cuiling
2026, 49(2):242-252.
Abstract:As one of the most promising cutting-edge interdisciplinary fields of the 21st century, terahertz technology has demonstrated revolutionary application prospects in communications, imaging, biomedicine and other areas. Based on the establishment of a talent database for terahertz technology, this article systematically analyses the theoretical foundation, current characteristics, core challenges, and innovative pathways for the development of terahertz technology talent in our country. The research finds that the scale of the terahertz talent pool in our country is expanding rapidly, but that structural contradictions exist. By constructing a multidisciplinary collaborative training model, optimising the policy support system, and deepening international cooperation, the aim is to create a talent ecosystem that adapts to technological innovation and industrial development, providing talent support for our country to achieve a leap from following to leading in the terahertz field.
