Abstract: Aero-engines are the core power component of aircraft, and their operational reliability directly affects flight safety and efficiency; fault diagnosis of intershaft bearings is therefore a key measure for ensuring stable engine operation. To address intershaft bearing fault diagnosis in aero-engines, this study first analyzes the limitations of existing 1D-CNN and 1D-Transformer methods: the self-attention mechanism is susceptible to high-frequency noise and redundant information in raw vibration signals, which weakens its ability to focus on critical fault features, and the pure Transformer architecture is weak at capturing subtle local features. To overcome these limitations, a fault diagnosis method based on a Multi-Scale Time-Frequency Synergy Transformer (MSTFS-Transformer) is proposed. It integrates multi-scale time-frequency feature extraction with the global modeling capability of the Transformer, so that subtle local features and global correlation features of vibration signals are captured jointly. Experimental results show that under Gaussian white noise (SNR from -4 dB to 4 dB), the proposed method achieves the best diagnostic accuracy and F1-score for aero-engine intershaft bearings among the compared methods, reaching 96.04% under strong noise (-4 dB) and 99.84% under weak noise (4 dB), and its noise-resistance stability is superior to that of five benchmark methods. On the CWRU benchmark dataset, in both noise-free and noisy scenarios, it reliably identifies different fault severities (including slight inner-race faults), achieving 99.01% accuracy under strong noise (-4 dB) and 99.78% under weak noise (4 dB), which demonstrates strong generalization capability.
In conclusion, the proposed MSTFS-Transformer effectively mitigates two problems that arise under noise interference, namely insufficient focus on critical fault features and weak capture of local features, and thus provides an efficient and robust solution for aero-engine intershaft bearing fault diagnosis. Its strong noise immunity and accurate fault identification meet the demands of complex vibration environments in practical engineering and offer solid technical support for improving the reliability of fault monitoring.
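The noise conditions quoted above (Gaussian white noise at SNRs from -4 dB to 4 dB) can be illustrated with a minimal sketch of how a vibration signal might be corrupted to a target SNR. This is not the authors' code: the function names `add_awgn` and `measured_snr_db`, the 50 Hz test tone, and the 12 kHz sampling rate are all assumptions chosen for illustration. The noise variance follows from the standard definition SNR_dB = 10 log10(P_signal / P_noise).

```python
import math
import random


def add_awgn(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise scaled to a target SNR in dB.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = rng or random.Random()
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of `noisy` relative to `clean`."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Assumed stand-in for a vibration record: a 50 Hz tone sampled at 12 kHz for 1 s.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 50 * t / 12000) for t in range(12000)]

# Corrupt the signal at the two extreme noise levels mentioned in the abstract.
noisy_strong = add_awgn(clean, -4.0, rng)
noisy_weak = add_awgn(clean, 4.0, rng)
```

At -4 dB the noise power is roughly 2.5 times the signal power, which is why this setting is described as strong noise; at 4 dB the signal power is roughly 2.5 times the noise power.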