• Volume 48, Issue 17, 2025 Table of Contents
    • Research & Design
    • Research on self-localization method using a single-channel microphone based on dipole sound source

      2025, 48(17):1-8.

      Abstract:This paper proposes an indoor self-localization algorithm for a single-channel (mono) microphone that achieves efficient localization under limited computational resources by generating two pairs of dipole sound fields and combining them with orthogonal detection. Compared with the traditional time-driven mode, the algorithm introduces frequency-division multiplexing, which greatly improves localization speed while remaining robust. Numerical simulations show that, under the same environmental conditions, the improved algorithm accurately estimates the azimuth and elevation angles by driving the two dipoles synchronously; relative to the original algorithm, the average error is no more than 0.5° and the positioning time is shortened by 3 to 4 seconds. The algorithm thus improves positioning timeliness over existing methods while preserving accuracy, which gives it practical value for indoor positioning.

    • A Ku-band three-way power dividing/combining network in rectangular waveguide

      2025, 48(17):9-15.

      Abstract:In order to solve the deficiency of the current three-way power dividers/combiners, a novel Ku-band three-way power divider/combiner in rectangular waveguide was presented, which was composed of a branch waveguide directional coupler, a coplanar magic T, two novel 90° waveguide bends and a waveguide matched load. Based on the power divider/combiner, a Ku-band three-way power dividing/combining network was established. The structure was modeled and optimized using three-dimensional electromagnetic simulation software HFSS. The sample was fabricated and measured. The measured results show that the transmission loss of the network is less than 0.12 dB, the return loss is better than 17 dB between 13.75 GHz and 14.5 GHz. After calculation, the power combination efficiency is greater than 98.6%. The performance of the Ku-band three-way power dividing/combining network is excellent.
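The abstract does not say how the 98.6% combining efficiency was calculated from the measured loss. One plausible reading, offered here only as an assumption, is that the 0.12 dB figure is the loss of a back-to-back divider and combiner pair, so that each half contributes about 0.06 dB. The sketch below converts a loss in dB to a power efficiency under that assumption; the function name is illustrative.

```python
def efficiency_from_loss_db(loss_db: float) -> float:
    """Convert an insertion loss in dB to a power efficiency between 0 and 1."""
    return 10 ** (-loss_db / 10)


# Assumption (not stated in the abstract): 0.12 dB is the loss of the
# back-to-back divider + combiner network, split equally between the halves.
back_to_back_loss_db = 0.12
single_pass_loss_db = back_to_back_loss_db / 2

print(f"single pass:  {efficiency_from_loss_db(single_pass_loss_db):.2%}")
print(f"back to back: {efficiency_from_loss_db(back_to_back_loss_db):.2%}")
```

Under this reading the single-pass value comes out close to the reported 98.6%, whereas applying the full 0.12 dB to one pass gives a lower figure.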

    • Solar panel defect detection based on improved YOLOv11

      2025, 48(17):16-25.

      Abstract:To address the low accuracy and slow speed of current solar panel defect detection methods, a defect detection algorithm based on improved YOLOv11 is proposed. First, the SimSPPF module is introduced into the backbone network to optimize feature extraction. In addition, the Slide Loss function is used to increase the model's attention to difficult samples. The LSKA attention mechanism is introduced into C2PSA, using split convolutional kernels to enhance feature extraction, and the Mish activation function is used to enhance network nonlinearity. Finally, a Strip Pooling strategy is introduced to improve the model's adaptability to changes in target shape and distribution. Experimental results show that the precision of the improved algorithm reaches 86.8%, which is 3.3% higher than the original algorithm; mAP@0.5 reaches 90.1%, an increase of 2.6% over the original algorithm; and the detection speed reaches 149.254 fps, which meets the requirements of high precision and high efficiency for solar panel defect detection in industrial production.
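Of the components listed in this abstract, the Mish activation has a compact closed form, mish(x) = x · tanh(softplus(x)). The following is a minimal scalar sketch of that standard definition, not the paper's implementation; a real detector would apply it element-wise to tensors.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


for value in (-2.0, 0.0, 1.0, 4.0):
    print(value, round(mish(value), 4))
```

Unlike ReLU, Mish is smooth and lets small negative inputs pass through with a small negative output, while behaving almost like the identity for large positive inputs.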

    • Gait recognition algorithm based on array sensing acquisition strategy

      2025, 48(17):26-34.

      Abstract:First of all, in order to solve the problem of drift and sensor failure in the process of installation and wearing of a single sensor, an array sensor acquisition system is designed in this paper. Then according to the common five-point array layout, theoretical analysis and comparative experiments are carried out to determine the best sensor layout of the system. Next, a gait dataset with 40 people and 7 patterns is constructed. In view of the problems of global information loss, large computation and memory consumption, and insufficient boundary information processing in the gait recognition network of embedded deployment, improvements are made. An encoder-decoder based parallel attention convolutional network is proposed. Finally, a multi-mode motion gait recognition experiment is set up to verify the performance of the algorithm. The experimental results show that the algorithm can quickly and accurately identify 7 common human gait patterns with an average accuracy of more than 95%, which has good performance.

    • Intelligent Control & Performance Testing
    • Estimation of human lateral roll state based on pressure similarity model

      2025, 48(17):35-43.

      Abstract:Lateral turning is a critical aspect of nursing care for individuals with disabilities, and the autonomous execution of the lateral turning process by devices has become one of the key tasks in the development of unmanned nursing care. To enhance the safety and intelligence of lateral turning devices, pressure information, which is inevitably generated during the human lateral turning process, is utilized as a core indicator to construct a model for estimating the human lateral turning state and guiding the control system's execution. Analytical mechanics is employed to construct a matrix function from seven pressure points at the shoulders and hips during the lateral turning process. Based on anatomical principles, height and weight are incorporated as variable parameters to achieve active adaptation of the model. Cosine similarity and Pearson correlation coefficient are employed to jointly assess actual and theoretical pressure, yielding a minimum similarity of 0.8269, thereby improving the model's robustness and enabling human rollover state estimation. The constructed human lateral turning state estimation model further analyzes the lateral turning movements of individuals with disabilities, which holds significant implications for the intelligentization of rehabilitation aids, health status assessment, and routine home care.
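The two similarity measures named in this abstract can be sketched as follows. The seven pressure values are made-up placeholders, and combining the two scores by taking their minimum (`joint_similarity`) is a hypothetical choice for illustration; the abstract does not say how the paper combines them.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def pearson(a, b):
    """Pearson correlation: cosine similarity of the mean-centred vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


def joint_similarity(a, b):
    """Hypothetical joint score: the more pessimistic of the two measures."""
    return min(cosine_similarity(a, b), pearson(a, b))


# Placeholder readings for seven shoulder/hip pressure points (illustrative only).
measured = [52.0, 48.5, 61.2, 30.1, 44.7, 58.3, 39.9]
theoretical = [50.0, 47.0, 63.0, 28.0, 46.0, 60.0, 41.0]

print("cosine :", round(cosine_similarity(measured, theoretical), 4))
print("pearson:", round(pearson(measured, theoretical), 4))
print("joint  :", round(joint_similarity(measured, theoretical), 4))
```

Cosine similarity compares the overall direction of the pressure vectors, while the Pearson coefficient removes the mean first and so is insensitive to a constant offset across all points.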

    • 3D object detection based on attention residual network and mixed pooling

      2025, 48(17):44-53.

      Abstract:Aiming at the problem of low detection accuracy of pedestrians and cyclists in 3D object detection tasks, Voxel-RCNN is used as the baseline algorithm for improvement. A 3D object detection algorithm based on residual attention network and hybrid pooling is proposed to improve the detection accuracy. Firstly, a new 2D backbone network integrating residual network and attention mechanism is designed. The residual network structure is used to enhance the adaptability of the model to different object sizes. At the same time, the attention mechanism is introduced to focus on the key area and improve the feature representation ability. Secondly, a new MLP pooling method is proposed, and an attention pooling method combined with convolution is designed. The two pooling methods can not only effectively retain the local geometric details of small objects, but also enhance the expression ability of global semantic features, thereby further improving the ability to capture diverse objects in complex scenes. Experimental results on the public dataset KITTI show that the mean average precision (mAP3D) of the Pedestrian and Cyclist categories reached 54.06% and 76.85%, respectively, which is 3.43% and 3.03% higher than the baseline algorithm. The experimental results demonstrate the effectiveness of the proposed method.

    • Research on permanent magnet synchronous motor control of mining locomotive under load mutation

      2025, 48(17):54-65.

      Abstract:To address the insufficient anti-interference ability, poor accuracy and slow convergence of permanent magnet synchronous motor control caused by the complex underground working conditions of coal mine electric locomotives, an intelligent BP neural network PID control method based on an improved dung beetle-osprey algorithm is proposed. First, the improved dung beetle algorithm and the osprey algorithm are combined into an improved dung beetle-osprey algorithm, in which the global search strategy of the osprey algorithm replaces the ball-rolling stage of the dung beetle algorithm. Second, a sinusoidal learning factor is introduced to improve the exploration ability of the algorithm, a dynamic spiral search improves its global search performance, and adaptive t-distribution perturbation and piecewise function methods help it escape local optima and improve solution quality. The algorithm optimizes the learning factor and inertia factor of the BP neural network so that the network outputs the best PID parameters more quickly. Finally, voltage feedforward decoupling is added to offset the coupling term of the permanent magnet synchronous motor and improve its dynamic response.
      Through Matlab/Simulink simulation and RT-LAB experiments on a semi-physical platform, the BP neural network PID controller based on the improved dung beetle-osprey algorithm is compared with the traditional PID controller. The results show that, at a target speed of 1200 r/min with an abrupt change in load torque, its recovery time and overshoot are reduced by about 98.3% and 66%, respectively, compared with traditional PID control. The experimental results also show that, compared with PID control, PSO control and BAS-PID control, the IDBO-OOA-PID speed controller exhibits smaller speed fluctuation, a shorter time for the speed to return to the set value, and a more stable current response under abrupt load and speed changes. This verifies that the IDBO-OOA-PID speed controller has good anti-interference ability, stability and robustness.

    • Intelligent enhanced label recognition method for inventory of power grid equipment

      2025, 48(17):66-72.

      Abstract:Under the overall development trend of smart grid, in order to solve the problem of inefficiency in large-scale equipment management, this paper designs and proposes a large-scale equipment inventory intelligent system for power grid. The system provides a systematic solution for the automatic inspection and efficient management of large-scale power grid equipment by integrating robot inspection technology, Internet of Things data collection means and intelligent data processing technology. However, the problem of RFID tag conflict has become a key bottleneck for the efficient operation of the system in dense tag scenarios. Therefore, based on the design of the intelligent system, this paper proposes an intelligent enhanced Dynamic Frame Queuing (EFQ) algorithm for the system. The EFQ algorithm improves the recognition efficiency and system stability in high-density scenarios through dynamic frame adjustment and priority optimization strategies. In this paper, the performance of the EFQ algorithm is compared with the IABS algorithm and the ICT algorithm. Experimental results show that the EFQ algorithm has significant advantages in terms of throughput and collision rate, the collision rate is reduced by more than 30%, and the system efficiency is increased by about 15%. Although there is no significant difference in recognition time compared with other algorithms, the overall performance of the EFQ algorithm is more stable, especially for device management requirements in dense labeling scenarios.

    • Theory and Algorithms
    • An improved jitter decomposition algorithm based on TLC

      2025, 48(17):73-80.

      Abstract:Jitter is one of the important reasons to limit the upper limit of circuit system performance, and it is necessary to decompose the various components of jitter. The traditional time delay correlation jitter decomposition algorithm is insufficient in the accuracy and stability of the decomposition results and the timeliness of the operation. In this paper, we propose a fitting method combined with least squares method, which can effectively extract the jitter component information from the TLC function affected by the stochastic jitter component, thus avoiding the influence of stochastic data on the system of equations and enhancing the stability of the algorithm. At the same time, the FFT algorithm is introduced, so that the TLC algorithm can decompose the frequency points of the jitter component more accurately and efficiently, which reduces the running time of the algorithm. To verify the effectiveness of the improved algorithm, this paper compares the proposed improved algorithm with the existing improved algorithm. The experimental results show that the jitter parameters decomposed by the improved TLC algorithm are more accurate, the stability is improved by 41%~54%, and the timeliness is improved by 52%~68%.

    • Multimodal load forecasting for IES based on modal decomposition and multi-model fusion

      2025, 48(17):81-93.

      Abstract:Because of the randomness and high volatility of multiple loads in integrated energy systems (IES), existing load forecasting methods often struggle to achieve high accuracy and stable prediction performance. To overcome this issue, this paper proposes a short-term load forecasting method for IES based on modal decomposition and multi-model fusion. First, the maximum mutual information coefficient is used for feature selection, aiming to effectively identify key factors closely related to load variation. Next, sample entropy combined with mutual information is employed as the fitness function, and the exponential triangular optimization algorithm is applied to obtain the optimal parameter combination for variational mode decomposition (VMD), enabling effective decomposition of IES loads into multiple intrinsic mode functions. Then, permutation entropy is used to filter the decomposition results and extract low-frequency and high-frequency components that reflect the load variation characteristics. Finally, a BiLSTM network is used to predict the low-frequency components, while a BiTCN-LPTransformer-BiGRU model is applied to forecast the high-frequency components. The final load prediction is obtained by aggregating the predictions of all components. Verification using actual load data, specifically for spring electricity load, shows that the model achieves an RMSE of 118.394 kW, an R² of 0.991, and a MAPE of 0.351%. Compared to traditional models, this approach significantly improves prediction accuracy, validating the effectiveness of the proposed method.
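For readers unfamiliar with the three reported metrics, the sketch below computes RMSE, MAPE and the coefficient of determination using their standard definitions. The load values in the example are placeholders, not data from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same unit as the data (e.g. kW)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no true value is zero."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Placeholder load values in kW (illustrative only).
actual = [100.0, 200.0, 300.0]
forecast = [110.0, 190.0, 310.0]

print(rmse(actual, forecast), mape(actual, forecast), r2(actual, forecast))
```

RMSE is scale-dependent, which is why the abstract reports it in kW, whereas MAPE and R² are dimensionless and easier to compare across loads of different magnitude.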

    • Communication topology optimization of power distributed dispatch based on greedy algorithm

      2025, 48(17):94-104.

      Abstract:In power distributed economic dispatch based on a multi-agent consensus algorithm, the information interaction between generation units relies on the communication network, so the topology of the communication network significantly affects the performance of the dispatch system. For dispatch systems using an event-triggered consensus algorithm, a communication network topology optimization method based on a greedy algorithm is proposed to further improve the convergence rate of the algorithm while limiting the added communication burden on the system. The method introduces the characteristic ratio as a performance evaluation index that accounts for both the convergence rate of the consensus algorithm and the communication frequency of the system. Edges are added to the communication network topology one at a time so that the increment of the characteristic ratio is maximized at each step, thereby optimizing the communication topology and improving the overall performance of the dispatch system. The simulation results show that the proposed method improves the convergence rate of the consensus algorithm by 19.1% compared with the topology optimization method that only considers system communication frequency. Compared with the topology optimization method that only considers the convergence rate of the consensus algorithm, the system communication frequency is reduced by 5.6%. This indicates that the proposed method strikes a better balance between the convergence rate of the consensus algorithm and the system communication frequency. Furthermore, the simulation experiments verify that the dispatch system exhibits strong robustness after topology optimization.

    • Path planning for unmanned rescue boats based on improved A* algorithm

      2025, 48(17):105-112.

      Abstract:To address the problems of the traditional A* algorithm in path planning for unmanned rescue boats, such as too many search nodes, low computational efficiency, long search time and unsmooth paths, an improved A* path planning algorithm is proposed. The weighted heuristic function is optimized and the neighborhood search strategy is improved to reduce the number of search nodes and the search time while ensuring the optimal path. A Bézier curve is then used to smooth the path, which improves its smoothness and stability, reduces the vibration of the rescue boat, and improves the efficiency and safety of its movement. Experimental results show that, compared with the traditional A* algorithm, the improved A* algorithm reduces the number of search nodes by about 34.3%, 56.9% and 66.8%, and shortens the search time by 47.5%, 68.9% and 79.3%, respectively. This optimization greatly improves the efficiency and search speed of path planning, making it more suitable for path planning tasks of unmanned rescue boats in complex environments.
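The idea of a weighted heuristic can be illustrated with a generic weighted A* search on an occupancy grid. This is a textbook sketch under simplifying assumptions (4-connected grid, unit step cost, Manhattan heuristic, a fixed `weight`), not the paper's optimized heuristic or neighborhood strategy, and it omits the curve-smoothing step.

```python
import heapq


def weighted_a_star(grid, start, goal, weight=1.0):
    """Search a grid of 0 (free) / 1 (obstacle) cells.

    Returns (path, expanded), where path is a list of (row, col) cells or None
    if the goal is unreachable, and expanded is the number of expanded nodes.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance, admissible on a 4-connected unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(weight * heuristic(start), 0, start)]
    g_cost = {start: 0}
    parent = {start: None}
    closed = set()

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], len(closed)
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = new_g
                    parent[(nr, nc)] = cell
                    f = new_g + weight * heuristic((nr, nc))
                    heapq.heappush(open_heap, (f, new_g, (nr, nc)))
    return None, len(closed)


grid = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
path, expanded = weighted_a_star(grid, (0, 0), (4, 4), weight=1.5)
print(len(path), expanded)
```

With `weight=1.0` this reduces to ordinary A*. A weight above 1 usually expands fewer nodes in open terrain but, in this plain form, no longer guarantees the shortest path, which is presumably why the paper pairs the weighting with further optimizations.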

    • Abnormal epileptic signal detection and classification model based on deep learning

      2025, 48(17):113-124.

      Abstract:Epilepsy is a common neurological disease, and its diagnosis mainly relies on the analysis of EEG signals. In recent years, deep learning-based methods have been widely used in epilepsy detection, but these methods usually rely on a single feature extraction technique and mostly ignore the spatial domain features of EEG signals. In order to capture the spatial domain features of EEG signals, researchers have tried to introduce the graph representation of EEG and combine it with GNN model for modeling. However, the graph representation of existing methods usually requires each vertex to traverse all other vertices to build the graph structure, resulting in high time complexity and difficulty in meeting the needs of clinical real-time diagnosis. In response to the above challenges, this study proposed CNG structure, which reduces redundant edges by dynamically selecting neighbor nodes, significantly reducing the time complexity while retaining key information. On this basis, we further proposed a dual-view input-based automatic epilepsy detection and classification framework, DV-SeizureNet. This framework can simultaneously learn the time, frequency, and spatial domain features of EEG signals to achieve epileptic abnormality detection and seizure classification. Experiments on the TUSZ dataset show that DV-SeizureNet achieves an accuracy of 91.4% in epilepsy detection tasks, which is 2.1% better than the existing state-of-the-art methods. In the classification task, the average classification accuracy of the model for four types of epileptic seizures is 82.8%, and the F1-score is 81.2%. DV-SeizureNet uses a dual-view learning framework to comprehensively extract and fuse the spatiotemporal and frequency domain features of EEG signals, and performs well in epilepsy abnormality detection and seizure classification tasks, providing a reliable auxiliary tool for clinical diagnosis.

    • RFID phased array intelligent positioning method based on edge computing in complex environment

      2025, 48(17):125-131.

      Abstract:To enhance the accuracy, robustness, and real-time performance of RFID positioning technology in complex environments, this paper proposes a multi-node edge computing collaborative RFID phased array intelligent positioning method. The proposed method employs phased array antennas for dynamic beam control and utilizes multi-node edge computing to process large-scale tag data, effectively mitigating the impact of multipath effects and signal attenuation. Additionally, the system integrates the Asynchronous Advantage Actor-Critic reinforcement learning algorithm to dynamically optimize positioning parameters in response to environmental changes, further improving adaptability and stability. Experiments were conducted in both standard and complex environments, with the latter simulating extensive metallic shelving, multipath effects, and dynamic interference sources to evaluate positioning error and accuracy in comparison with RSSI and TDOA methods. Experimental results show that in the standard environment, the proposed method achieves positioning errors of 0.8~0.9 meters and an accuracy of 92%; in the complex environment, errors remain within 1 meter, with accuracy exceeding 90%, significantly outperforming traditional methods. Furthermore, practical deployment in an intelligent warehouse asset management system demonstrates the high precision and robustness of the proposed method, improving inventory accuracy from 85% to 96% while reducing the misjudgment rate to 1.5%. This research provides reliable technical support for the application of RFID positioning technology in smart cities, power grid asset management, and logistics warehousing, demonstrating excellent environmental adaptability and high-efficiency positioning capabilities.

    • Information Technology & Image Processing
    • Detection and counting method for underwater crabs based on YOLO-Crab and the improved DeepSORT

      2025, 48(17):132-141.

      Abstract:To realize accurate feeding by unmanned aquaculture vessels in freshwater ponds, a river crab counting method combining YOLO-Crab with an improved DeepSORT is developed. First, to address the blurring and low contrast of underwater river crab images, a river crab detection model, YOLO-Crab, based on YOLOv8 with CLAHE preprocessing is proposed. YOLO-Crab adds the coordinate attention mechanism to the backbone to improve detection precision and, at the same time, reduces the model size through SimSPPF pooling and a GSConv + Slim Neck design. The improved DeepSORT algorithm replaces IOU matching with DIOU matching to solve the problem of river crab ID jumping caused by aquatic grass occlusion. Experiments show that the detection precision and F1 of the YOLO-Crab model reach 97.3% and 94%, respectively, and the average precision of the counting method is 81%. The model was also deployed on a Jetson AGX Orin, where the detection accuracy reached 95%, the detection speed was 60 fps, an increase of 50%, and the counting accuracy was 78%, which can provide a reliable basis for accurate feeding by unmanned aquaculture vessels.
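The difference between IoU and DIoU matching can be shown with the standard definitions: DIoU subtracts from the IoU the squared distance between box centres, normalized by the squared diagonal of the smallest enclosing box. The sketch below assumes boxes given as `(x1, y1, x2, y2)`; how the score is wired into the DeepSORT association step is not described in the abstract and is not modelled here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def diou(a, b):
    """Distance-IoU: IoU minus the normalized squared centre distance."""
    # Squared distance between the two box centres.
    centre_dist_sq = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 + (
        (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    ) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(a[2], b[2]) - min(a[0], b[0])
    enclose_h = max(a[3], b[3]) - min(a[1], b[1])
    diag_sq = enclose_w ** 2 + enclose_h ** 2
    penalty = centre_dist_sq / diag_sq if diag_sq > 0 else 0.0
    return iou(a, b) - penalty


track_box = (0.0, 0.0, 1.0, 1.0)
near_detection = (2.0, 0.0, 3.0, 1.0)
far_detection = (9.0, 0.0, 10.0, 1.0)

# Both detections have IoU 0 with the track, but DIoU still ranks the nearer one higher.
print(iou(track_box, near_detection), diou(track_box, near_detection))
print(iou(track_box, far_detection), diou(track_box, far_detection))
```

This is the property that helps after an occlusion: when a reappearing crab no longer overlaps its predicted box, IoU cannot distinguish candidate matches, while DIoU still prefers the closest one.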

    • Diff-2sIR: Diffusion-based refinement two-stage image restoration model

      2025, 48(17):142-150.

      Abstract:In recent years, significant progress has been made in the field of image generation, but the consistency between the repaired and unmodified regions remains a common challenge in image inpainting tasks. This paper proposes a two-stage image inpainting model based on diffusion models (Diff-2sIR) to enhance the consistency between the repaired and unmodified regions, thereby improving the overall quality of image inpainting. Based on the theory of diffusion models, a two-stage inpainting framework is designed. By improving the U-Net architecture and the diffusion model sampling algorithm, the initial inpainting results are further refined in a second stage, alleviating the inconsistency between the repaired and unmodified regions. In the face inpainting task on the CelebA-HQ dataset, the Diff-2sIR model achieves the best FID score (2.92), significantly improving the inpainting quality. Experimental results show that the model further refines the inpainting results based on the guidance module, demonstrating exceptional performance. The Diff-2sIR model effectively addresses the inconsistency between the repaired and unmodified regions, providing a new solution for image inpainting tasks, with significant theoretical and practical implications.

    • Foreign object detection for underground coal mine conveyor belts in occlusion scenarios based on SDGW-YOLOv11

      2025, 48(17):151-159.

      Abstract:To address the issues of missed and false detections caused by occlusions and scale variations of foreign objects such as large gangue stones and anchor rods on underground coal mine conveyor belts, an improved detection model, SDGW-YOLOv11, is proposed. First, to achieve effective detection of occluded objects through multi-perspective feature fusion and consistency regularization, and to extract features from multiple positions and scales, the SEAM attention mechanism is introduced into the neck network of YOLOv11. This mechanism reduces the interference caused by occlusion during detection. Second, to enhance the model's adaptability to the size variations of objects, both occluded and unoccluded, the C3k2_DCN module is designed and integrated into the backbone network of YOLOv11, improving the model's local perception capability for objects. Finally, to prevent the attention mechanism from significantly increasing the model size and affecting detection speed, the model is optimized by replacing some conventional convolutional layers with GhostConv to reduce the number of parameters and adopting the WIoU loss function to replace the original loss function, thereby accelerating convergence. Experimental results show that the SDGW-YOLOv11 model achieves a detection accuracy of 86.1%, representing a 4.6% improvement over the original model. The optimized model achieves a detection speed of 82 frames per second (FPS), fully meeting the requirements for real-time detection of conveyor belt foreign objects. The improved model outperforms Faster R-CNN, SSD, YOLOv3, YOLOv5, YOLOv7, YOLOv8, YOLOv9, YOLOv10, and YOLOv11 in both precision and mAP@0.5, effectively reducing missed and false detections caused by occlusion and scale variation. It is better suited for foreign object detection in underground coal mine conveyor belt scenarios.

    • YOLOv8 aerial image detection algorithm with multi-path feature fusion

      2025, 48(17):160-168.

      Abstract:To address the challenge of low detection accuracy for small objects in drone aerial images due to dense targets and complex backgrounds, we propose MF-YOLO. First, the multi-path feature fusion capability is enhanced to integrate features from different layers, preserving shallow details and improving small object detection accuracy. Second, the EMA attention mechanism is adopted to improve the recognition of target regions, effectively distinguishing targets from background regions. Then, a Dense Attention Layer (DAL) is introduced to enhance the algorithm's feature extraction capability in dense regions by focusing on these areas and suppressing irrelevant features. Next, a Squeeze-and-Excitation detection head is designed, incorporating the SE attention mechanism to suppress redundant features and further improve small object detection accuracy. Finally, a video dataset is constructed, and a target detection system is designed to visualize the algorithm's detection performance. Experimental validation on the VisDrone2019 dataset shows that MF-YOLO achieves a mAP@0.5 of 30.3%, a 3.4% improvement compared to the YOLOv8n baseline algorithm. The results demonstrate that the algorithm significantly improves object detection performance in drone images and has broad application prospects.

    • Underwater image enhancement network based on multi-scale residual fusion

      2025, 48(17):169-177.

      Abstract:The existence of blue-green bias, low clarity and contrast of underwater images seriously affects the accuracy and reliability of underwater research. To address the above problems, this paper proposes an underwater image enhancement network based on multi-scale residual fusion. Firstly, a multi-scale channel feature extraction module MSCFE is proposed. The MSCFE module models each channel independently to avoid information interference between channels, and at the same time, channel attention is introduced to enhance the key features to effectively enhance the colour and details. Then, a global-local colour correction module GLCC is proposed, and the GLCC module adopts two branches, local and global, to model the local colour details and long-range dependencies respectively to correct the image colour. The experiments show that on the UIEB dataset, the structural similarity of the enhanced image reaches 0.937 8, the peak signal-to-noise ratio reaches 23.768 7, the underwater colour image quality evaluation index reaches 0.568 9, and the image information entropy reaches 7.572 3; on the EUVP dataset, the structural similarity of the enhanced image reaches 0.910 5, the peak signal-to-noise ratio reaches 25.169 9, underwater colour image quality evaluation index reached 0.525 3, and image information entropy reached 7.347 9, which are better than other mainstream methods.
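Two of the four reported metrics, peak signal-to-noise ratio and image information entropy, have short standard definitions that can be sketched on flat lists of 8-bit pixel values. This is an illustrative sketch only; the paper's evaluation pipeline, colour handling, and the SSIM and underwater quality indices are not reproduced here.

```python
import math
from collections import Counter


def psnr(reference, enhanced, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(enhanced) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def information_entropy(pixels):
    """Shannon entropy in bits of the grey-level histogram."""
    total = len(pixels)
    counts = Counter(pixels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


reference = [10, 50, 90, 130, 170, 210]
enhanced = [12, 49, 93, 128, 171, 207]
print(round(psnr(reference, enhanced), 2), round(information_entropy(enhanced), 3))
```

For 8-bit images the entropy is bounded above by 8 bits, reached only when all 256 grey levels are equally frequent, which gives a scale for reading the entropy values of about 7.3 to 7.6 quoted in the abstract.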

    • Crack detection method of underwater pipe pile based on YOLO lightweight

      2025, 48(17):178-187.

      Abstract:This paper proposes an automatic identification method for pipe pile cracks based on pipe pile cleaning robots, together with a lightweight detection network, YOLOv8-MLLA-Mobilenetv4-WIoU (MWM-YOLO). Low-quality defect images are captured in a muddy water environment and augmented to expand the dataset. For low-quality images under muddy water, in view of the suppression effect caused by the mismatch between image enhancement and object detection, MLLA is used to focus accurately on key feature areas, which suppresses background interference while maintaining high-resolution output and thereby enhances the synergy between image enhancement and object detection. At the same time, the latest Mobilenetv4 backbone network is used to reduce the number of parameters and computations of the feature network. In addition, because annotations of low-quality image data inevitably contain low-quality examples, the WIoU loss function replaces the loss function of the original YOLOv8 network to improve the generalization performance of the model. The experimental results show that the MWM-YOLO model weights occupy 14.9 MB, which is 30.3% less than the original model; the average accuracy reaches 89.1% and the inference speed is 137.54 fps, both better than other models. Compared with the original network, the improved network model can be deployed on edge computing devices in lightweight form while maintaining the accuracy of defect identification, providing technical support for underwater pipe pile cleaning robots.

    • Helmet detection algorithm in complex scenarios based on improved YOLOv8

      2025, 48(17):188-198.

      Abstract:To address missed and false detections in helmet-wearing detection models in complex construction scenes caused by dense personnel, occlusion and small target size, this paper proposes a helmet-wearing detection algorithm based on improved YOLOv8. First, the CMUNeXtBlock module based on large-kernel depthwise separable convolution is introduced to improve the global awareness of the network by combining depthwise separable convolution with an inverted bottleneck. Second, the C2FICB module is designed to replace the C2f in the backbone network and integrate the semantic features between different channels and spatial locations to strengthen the network's multi-scale generalization. Moreover, a P2 micro-scale target detection layer is added to the neck network to improve the network's ability to capture local features. Finally, an RFAConv detection head (RFAHead) based on receptive-field attention convolution is proposed to optimize the expression of spatial features and further strengthen the ability of the model to extract global features. Experimental results show that on the Safety helmet dataset, the value of the improved model is increased by 5.2% and that of mAP@0.5-0.95 by 3.9% compared with the baseline model, effectively improving the accuracy of the safety helmet wearing detection model.

Editor in chief:Prof. Sun Shenghe

Inauguration:1980

ISSN:1002-7300

CN:11-2175/TN

Domestic postal code:2-369
