Abstract: To address the overly smooth, detail-deficient renderings produced by neural radiance fields under sparse-view inputs, a network model based on an information attention suppression module and a two-stage loss function is proposed. First, an information attention suppression module is introduced: a feature-vector normalization module filters outliers in the inter-layer weights of the MLP, a residual network cascades global and local information, and channel attention weights the fused information according to its importance, improving the accuracy of the sampling points' feature vectors. Second, to address the low perceptual accuracy caused by over-smoothed renderings, a two-stage loss function is proposed that partitions training into two stages: in the initial coarse stage, training is guided by RGB and depth losses; in the subsequent fine stage, perceptual loss and TV loss are added, exploiting high-level image features to gradually enhance perceptual quality. The proposed algorithm is compared with classical methods. On the LLFF dataset, quantitative results show that it achieves the best overall performance, 1.9% higher than the second-best algorithm. On the DTU dataset, qualitative results on Scan37, Scan55, and Scan63 show notably improved reconstruction completeness and detail.
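The two-stage loss schedule can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, stage boundary, and loss weights are assumptions chosen for illustration; only the stage structure (RGB + depth losses in the coarse stage, with perceptual and TV losses added in the fine stage) follows the abstract.

```python
def total_loss(losses, step, coarse_steps=10000,
               w_rgb=1.0, w_depth=0.1, w_perc=0.05, w_tv=0.01):
    """Combine per-term losses according to the training stage.

    losses: dict with keys 'rgb', 'depth', 'perceptual', 'tv'
            mapping to scalar loss values. All weights and the
            coarse/fine boundary are hypothetical defaults.
    """
    # Coarse stage: training is guided by RGB and depth losses only.
    total = w_rgb * losses['rgb'] + w_depth * losses['depth']
    if step >= coarse_steps:
        # Fine stage: perceptual and TV terms are incorporated,
        # exploiting high-level image features for gradual optimization.
        total += w_perc * losses['perceptual'] + w_tv * losses['tv']
    return total
```

Gating the extra terms on the iteration count keeps early optimization focused on geometry and color before the higher-level regularizers are switched on.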
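The channel-attention step, which reweights fused global/local features by importance, can be sketched in squeeze-and-excitation style. This is a schematic, not the paper's module: real channel attention learns its gating weights through small fully connected layers, which are replaced here by a fixed sigmoid of the channel mean purely for illustration.

```python
import math

def channel_attention(features):
    """features: list of channels, each a list of floats.

    Returns the channels rescaled by a per-channel importance gate
    (simplified stand-in for learned squeeze-and-excitation gating).
    """
    gates = []
    for ch in features:
        mean = sum(ch) / len(ch)                     # squeeze: global average per channel
        gates.append(1.0 / (1.0 + math.exp(-mean)))  # excitation: sigmoid gate
    # Scale: channels with larger responses receive larger weights,
    # differentiating fused information by its degree of importance.
    return [[g * v for v in ch] for ch, g in zip(features, gates)]
```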