Abstract: Crack detection is crucial to the maintenance of civil infrastructure. The drawbacks of traditional manual visual inspection have driven the continuous development of automated crack detection methods. However, existing crack detection techniques still face challenges from complex backgrounds and diverse crack features, as well as high computational resource requirements. This study exploits the potential of Mamba for visual tasks and proposes UltraLight CrackNet, which consists of a parallel lightweight visual Mamba block for efficiently modelling long-range dependencies and extracting deep semantic features, a multi-scale residual visual state space block for enhanced multi-scale feature representation, and an enhanced semantics and detail infusion module for optimising skip connections within the encoder-decoder architecture. Experimental results show that our method requires only 0.13 M parameters and 1.96 G FLOPs. Despite this ultra-lightweight design, it achieves the best performance on the DeepCrack and Crack500 datasets, with mean intersection over union (mIoU) of 87.85% and 77.92%, respectively, and obtains comparable results on the SteelCrack dataset, while its parameter count is 87.85% lower than that of the smallest comparison model.
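
To make the arrangement of the three named modules concrete, the following is a minimal structural sketch of an encoder-decoder of this kind in PyTorch. The module internals here (PlaceholderBlock, SemanticsDetailInfusion, CrackNetSketch, and all channel sizes) are hypothetical conv-based stand-ins for illustration only, not the authors' implementation, whose blocks are Mamba/state-space based.

```python
# Hedged sketch: a two-stage encoder-decoder showing where a deep feature block
# and a skip-connection fusion module would sit. All block internals are
# illustrative placeholders, not the UltraLight CrackNet implementation.
import torch
import torch.nn as nn


class PlaceholderBlock(nn.Module):
    """Hypothetical stand-in for the lightweight visual Mamba / multi-scale
    residual visual state space blocks named in the abstract."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class SemanticsDetailInfusion(nn.Module):
    """Hypothetical skip-connection fusion: merges a decoder (semantic) feature
    with an encoder (detail) feature at the same resolution."""
    def __init__(self, ch):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, decoder_feat, encoder_feat):
        return self.fuse(torch.cat([decoder_feat, encoder_feat], dim=1))


class CrackNetSketch(nn.Module):
    """Minimal U-shaped layout: encode, decode, fuse via the skip connection,
    then predict per-pixel crack logits."""
    def __init__(self, in_ch=3, base=16, num_classes=1):
        super().__init__()
        self.enc1 = PlaceholderBlock(in_ch, base)
        self.down = nn.MaxPool2d(2)
        self.enc2 = PlaceholderBlock(base, base * 2)
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.infuse = SemanticsDetailInfusion(base)
        self.dec1 = PlaceholderBlock(base, base)
        self.head = nn.Conv2d(base, num_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                  # shallow, detail-rich features
        e2 = self.enc2(self.down(e1))      # deeper, semantic features
        d1 = self.infuse(self.up(e2), e1)  # skip connection with fusion
        return self.head(self.dec1(d1))    # per-pixel crack logits


if __name__ == "__main__":
    logits = CrackNetSketch()(torch.randn(1, 3, 256, 256))
    print(logits.shape)  # torch.Size([1, 1, 256, 256])
```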