Abstract: Targets in remote sensing images often have elongated, zigzagging, or otherwise complex shapes, and they are accompanied by large scale variations and strong background interference. As a result, existing detection methods are prone to missed and false detections and struggle to meet the demand for high-precision detection. To address this, an improved remote sensing image target detection algorithm, TriD-DETR, is proposed. First, a DKFE feature extraction module is designed by dynamically adjusting the shape of the convolutional kernel and optimizing the channel adaptation and residual connection methods; it adaptively focuses on elongated and zigzagging local regions and thus captures target features accurately. Second, to improve the model's ability to locate and identify complex targets, the DATE intra-scale feature interaction structure is proposed, which reconfigures the Transformer encoder and introduces a deformable attention mechanism, enhancing the model's ability to capture high-level features and deep semantic information. Finally, for multi-scale feature fusion, the DBFB diverse branch fusion block is proposed, which enriches the feature space by combining diverse branches of different scales and complexity, thereby enhancing the model's expressive ability. Experimental results show that TriD-DETR achieves 86.8% and 94.1% mAP on the DIOR and RSOD datasets, respectively, which are 1.2% and 2.3% higher than the original model RT-DETR-R18, demonstrating the reliability and efficiency of the algorithm.
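
The abstract gives TriD-DETR scores and gains but not the RT-DETR-R18 baseline scores themselves. The short sketch below backs out the implied baseline under two readings of the stated gains: as absolute percentage points (the more common convention, assumed here rather than confirmed by the source) and as relative improvements. The function name `implied_baseline` and the dictionary layout are illustrative assumptions, not part of the paper.

```python
# Reported TriD-DETR mAP (%) and the stated gain over RT-DETR-R18,
# both taken from the abstract.
reported = {"DIOR": (86.8, 1.2), "RSOD": (94.1, 2.3)}


def implied_baseline(map_new, gain, relative=False):
    """Back out the baseline mAP implied by a reported score and gain.

    relative=False treats `gain` as absolute percentage points (assumed
    default reading); relative=True treats it as a percent improvement
    over the baseline.
    """
    if relative:
        return map_new / (1 + gain / 100)
    return map_new - gain


for name, (score, gain) in reported.items():
    absolute = round(implied_baseline(score, gain), 1)
    rel = round(implied_baseline(score, gain, relative=True), 1)
    print(f"{name}: baseline {absolute} (absolute reading), {rel} (relative reading)")
```

Under the absolute reading, the implied RT-DETR-R18 baselines are 85.6% mAP on DIOR and 91.8% mAP on RSOD; these are derived figures, not values reported in the abstract.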