Multi-layer Perceptron Interactive Fusion Method for Infrared and Visible Images
Graphical Abstract
Abstract
Existing Transformer-based fusion methods employ self-attention to model global dependencies in the image context and can achieve superior fusion performance. However, the high complexity of attention-based models leads to low training efficiency, which limits the practical application of image fusion. Therefore, a multilayer perceptron interactive fusion method for infrared and visible images, called MLPFuse, is proposed. First, a lightweight multilayer perceptron architecture is constructed that uses fully connected layers to establish global dependencies, achieving high computational efficiency while retaining strong feature representation capability. Second, a cascaded token- and channel-wise interaction model is designed to realize feature interaction between different tokens and independent channels, thereby focusing on the inherent features of the source images and enhancing the feature complementarity of the two modalities. Experimental results on the TNO and MSRS datasets and on object detection tasks show that, compared with seven typical fusion methods, the proposed MLPFuse outperforms the others in both subjective visual quality and objective metric evaluation, while achieving competitive computational efficiency.
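The cascaded token- and channel-wise interaction described above can be sketched as follows. This is a minimal illustrative NumPy sketch, not the authors' implementation: the function names, randomly initialised weights, ReLU activation, and residual connections are all assumptions in the spirit of MLP-Mixer-style blocks, where one MLP mixes information across tokens (global spatial dependency) and a second MLP mixes across channels.

```python
import numpy as np

def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron; ReLU is used here for simplicity.
    h = np.maximum(x @ w1 + b1, 0.0)
    return h @ w2 + b2

def token_channel_block(x, rng):
    """One cascaded token-wise then channel-wise interaction (hypothetical sketch).

    x: (tokens, channels) feature map of one modality.
    The token-wise MLP mixes information across the token axis, modelling
    long-distance dependencies; the channel-wise MLP then mixes channels.
    """
    t, c = x.shape
    # Hypothetical randomly initialised weights, for illustration only.
    w1t, w2t = rng.standard_normal((t, t)) * 0.1, rng.standard_normal((t, t)) * 0.1
    w1c, w2c = rng.standard_normal((c, c)) * 0.1, rng.standard_normal((c, c)) * 0.1
    # Token-wise interaction: transpose so the MLP acts along the token axis,
    # then add a residual connection.
    y = x + mlp(x.T, w1t, np.zeros(t), w2t, np.zeros(t)).T
    # Channel-wise interaction on the result, again with a residual.
    return y + mlp(y, w1c, np.zeros(c), w2c, np.zeros(c))

rng = np.random.default_rng(0)
ir_tokens = rng.standard_normal((16, 8))   # e.g. 16 infrared patch tokens, 8 channels
fused = token_channel_block(ir_tokens, rng)
print(fused.shape)                         # (16, 8): shape is preserved by the block
```

Because the block is built only from fully connected layers, its cost is quadratic in the token and channel counts but avoids the attention-map computation of Transformer blocks, which is the efficiency argument the abstract makes.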