Abstract: To address the technical challenges of dense target adhesion and occlusion-prone small objects in industrial pelletized ore image segmentation, this study proposes an instance segmentation method (YO-SAM2) that integrates YOLOv11 and SAM2. First, the CSC module is introduced to improve the C3k2 module in YOLOv11, enhancing the network's ability to represent features of densely clustered small targets. Second, a Small-Target Hybrid Fusion Feature Pyramid Network (SHFPN) is designed that adds feature map outputs at the P2 layer to capture fine-grained detail and incorporates cross-layer interactions and a content-guided attention mechanism to optimize multi-scale feature fusion. Third, a Decoupled Spatial-Channel Upsampling (DSCU) module is proposed to replace conventional upsampling and generate more discriminative feature representations. Finally, the SAM2 segmentation model is fine-tuned in a parameter-efficient manner via a learnable Adapter, significantly improving its adaptability and generalization in industrial scenarios. Experimental results demonstrate that YO-SAM2 achieves a state-of-the-art mIoU of 90.3% on the pelletized ore dataset, outperforming mainstream segmentation algorithms such as Mask R-CNN and YOLOv8-seg. The method effectively addresses the accuracy and robustness challenges of industrial pellet segmentation, offering a reliable technical solution for intelligent industrial quality inspection.
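
For reference, the mIoU metric reported above is commonly computed as the per-class intersection over union averaged across classes. The sketch below illustrates that common definition on a toy pellet/background label map. The abstract does not state the exact evaluation protocol, so the function names (`class_iou`, `mean_iou`), the two-class setup, and the choice to skip classes absent from both masks are all assumptions made for illustration.

```python
from typing import Optional, Sequence

# A mask is a 2D grid of integer class labels (assumed here: 0 = background, 1 = pellet).
Mask = Sequence[Sequence[int]]


def class_iou(pred: Mask, target: Mask, cls: int) -> Optional[float]:
    """IoU of one class; returns None when the class appears in neither mask."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            in_pred = p == cls
            in_target = t == cls
            inter += int(in_pred and in_target)
            union += int(in_pred or in_target)
    return inter / union if union else None


def mean_iou(pred: Mask, target: Mask, num_classes: int) -> float:
    """Average the per-class IoU over all classes present in either mask."""
    ious = [class_iou(pred, target, c) for c in range(num_classes)]
    valid = [v for v in ious if v is not None]
    if not valid:
        raise ValueError("no class is present in either mask")
    return sum(valid) / len(valid)


# Toy example (hypothetical data): one pellet pixel is over-segmented.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 1],
          [0, 0, 0]]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case the background IoU is 3/4 and the pellet IoU is 2/3, so `score` is their average. A dataset-level figure would typically aggregate over all test images, and whether that is done per image or over pooled pixel counts is a protocol detail not given here.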