Abstract: To address the limited robustness of traditional Siamese networks in complex scenarios and the high computational cost of Transformer-based architectures, this paper proposes an object tracking algorithm based on supervised feedback and a lightweight Transformer. First, a supervised feedback module is designed to incorporate task-relevant feedback during feature extraction, guiding the network to focus on target regions, which enhances feature discriminability and suppresses background interference. Second, a lightweight Transformer structure is constructed that retains strong global modeling capability while significantly reducing computational complexity and parameter count, achieving a favorable balance between performance and efficiency. Finally, an adaptive template update mechanism is introduced that dynamically adjusts the template according to the object state and scene variations in the current frame, improving adaptability to changes in target appearance and mitigating tracking drift. Experimental results on multiple mainstream public datasets show that the proposed method outperforms existing advanced tracking algorithms in both robustness and real-time performance.