Abstract: To fully exploit the multi-dimensional emotional information in EEG signals and improve emotion recognition performance, this paper proposes a network model based on multi-attention and multi-feature fusion. The model exploits the asymmetry between the brain hemispheres together with the spatial, spectral, and temporal characteristics of EEG signals, extracting features through parallel dual-input pathways. A parallel attention mechanism enhances the representation of frequency-channel and spatial information, while dynamic kernel selection adapts the convolution kernel size; depthwise separable convolutions then further extract and compress the features. Finally, the fused features are passed to a Transformer encoder, which captures their temporal dependencies and global associations and performs the emotion classification. In three-class experiments on the SEED dataset, the model achieved an average accuracy of 98.53%, demonstrating the superiority of the approach. In addition, visualization of the attention module improves the interpretability of the model.
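To make the described pipeline concrete, the sketch below assembles the components named in the abstract (dual input pathways, parallel channel/spatial attention, depthwise separable convolutions, and a Transformer encoder) into a minimal PyTorch model. All class names, tensor shapes, and hyperparameters are illustrative assumptions rather than the paper's actual implementation, and the dynamic kernel-selection step is omitted for brevity.

```python
# Minimal sketch of the pipeline described in the abstract (assumed design,
# not the paper's implementation); shapes and hyperparameters are illustrative.
import torch
import torch.nn as nn


class ChannelSpatialAttention(nn.Module):
    """Parallel attention over frequency channels and spatial positions (assumed design)."""

    def __init__(self, channels: int):
        super().__init__()
        self.channel_fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels), nn.Sigmoid(),
        )
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):  # x: (batch, channels, height, width)
        ch = self.channel_fc(x).unsqueeze(-1).unsqueeze(-1)  # per-channel weights
        sp = self.spatial_conv(x)                            # per-position weights
        return x * ch + x * sp                               # parallel (additive) fusion


class DepthwiseSeparableConv(nn.Module):
    """Depthwise convolution followed by a pointwise (1x1) convolution."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class DualPathEmotionNet(nn.Module):
    """Two parallel input pathways (e.g. left/right hemisphere feature maps), each with
    attention and depthwise separable convolution, fused by a Transformer encoder for
    three-class emotion prediction. Dynamic kernel selection is omitted here."""

    def __init__(self, in_ch: int = 5, feat_dim: int = 32, num_classes: int = 3):
        super().__init__()

        def pathway():
            return nn.Sequential(
                nn.Conv2d(in_ch, feat_dim, kernel_size=3, padding=1),
                ChannelSpatialAttention(feat_dim),
                DepthwiseSeparableConv(feat_dim, feat_dim),
                nn.AdaptiveAvgPool2d((4, 4)),
            )

        self.path_a, self.path_b = pathway(), pathway()
        encoder_layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x_a, x_b):
        # Each pathway yields (batch, feat_dim, 4, 4); flatten to token sequences.
        tok_a = self.path_a(x_a).flatten(2).transpose(1, 2)
        tok_b = self.path_b(x_b).flatten(2).transpose(1, 2)
        tokens = torch.cat([tok_a, tok_b], dim=1)    # fuse the two pathways
        fused = self.encoder(tokens).mean(dim=1)     # global associations + pooling
        return self.classifier(fused)


if __name__ == "__main__":
    model = DualPathEmotionNet()
    left = torch.randn(2, 5, 9, 9)    # e.g. 5 frequency bands on a 9x9 electrode grid
    right = torch.randn(2, 5, 9, 9)
    print(model(left, right).shape)   # torch.Size([2, 3])
```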