Abstract: In recent years, image generation has advanced significantly, yet consistency between the repaired and unmodified regions remains a common challenge in image inpainting. This paper proposes Diff-2sIR, a two-stage image inpainting model based on diffusion models, to enhance this consistency and thereby improve overall inpainting quality. Building on diffusion-model theory, a two-stage inpainting framework is designed: by improving the U-Net architecture and the diffusion sampling algorithm, the initial inpainting result is refined in a second stage, alleviating the inconsistency between the repaired and unmodified regions. On the face inpainting task on the CelebA-HQ dataset, Diff-2sIR achieves the best FID score (2.92), significantly improving inpainting quality. Experimental results show that the model, guided by its guidance module, further refines the initial inpainting result and delivers strong performance. Diff-2sIR effectively addresses the inconsistency between repaired and unmodified regions, providing a new solution for image inpainting with both theoretical and practical value.
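To make the two-stage idea concrete, the following is a minimal, hypothetical sketch (not the paper's actual method): a generic diffusion-style inpainting loop that re-imposes the known pixels at every denoising step, run once for a coarse fill and a second time to refine that result. The `denoise_step` placeholder stands in for the paper's (improved) U-Net noise predictor, and all names and constants here are illustrative assumptions.

```python
import numpy as np

def denoise_step(x, t, rng):
    # Placeholder denoiser: stands in for a trained U-Net noise predictor.
    return x * 0.9 + rng.standard_normal(x.shape) * 0.01

def inpaint(image, mask, steps, rng, init=None):
    # mask == 1 marks known (unmodified) pixels; mask == 0 marks the hole.
    x = init if init is not None else rng.standard_normal(image.shape)
    for t in range(steps, 0, -1):
        x = denoise_step(x, t, rng)
        # Re-impose the known pixels each step so the hole is denoised
        # in the context of the unmodified region, keeping the two consistent.
        x = mask * image + (1 - mask) * x
    return x

rng = np.random.default_rng(0)
img = np.ones((8, 8))
mask = np.ones((8, 8))
mask[2:6, 2:6] = 0  # square hole to repair

stage1 = inpaint(img, mask, steps=50, rng=rng)                # stage 1: coarse fill
stage2 = inpaint(img, mask, steps=25, rng=rng, init=stage1)   # stage 2: refinement
```

The second call seeds the sampler with the stage-1 output rather than pure noise, which is one simple way a refinement stage can start closer to a consistent solution.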