
Image inpainting model with consistent global and local attributes


Objective Image inpainting is a hot research topic in computer vision. In recent years, deep learning has treated this task as a conditional image generation problem, and it has received much attention from researchers. Compared with traditional algorithms, deep-learning-based image inpainting methods can be applied in a much wider range of scenarios and achieve better inpainting effects. Nevertheless, these methods have limitations. For instance, their results still need to be improved in terms of semantic rationality, structural coherence, and detail accuracy when the global and local attributes of an image are closely associated, especially when the defect area is large. This paper proposes a novel image inpainting model based on a fully convolutional neural network and the idea of generative adversarial networks to solve these problems. The model optimizes the network structure, loss constraints, and training strategy to obtain improved image inpainting effects.

Method First, we design a novel image inpainting network as the generator to repair defective images, drawing on effective techniques from the field of image processing. A framework based on a fully convolutional neural network is built in the form of an encoder-decoder. Specifically, we replace part of the convolutional layers in the decoding stage of the network with dilated convolutions. Stacking dilated convolutions with multiple dilation rates covers a larger area of the input image than ordinary convolution does on small feature maps, which effectively enlarges the receptive field of the convolution kernel without increasing the amount of computation and helps the network understand the image better. We also add long skip connections between the corresponding encoding and decoding stages. These connections strengthen structural information by passing low-level features to the decoder, enhance the correlation among deep features, and reduce the difficulty of network training. Second, we introduce structural similarity (SSIM) as the reconstruction loss for image inpainting. Unlike the common per-pixel mean square error (MSE) loss, this image quality index is built from the perspective of the human visual system and comprehensively evaluates the similarity of two images in terms of luminance, contrast, and structure. Using structural similarity as the reconstruction loss effectively improves the visual quality of the inpainting results. We use improved global and local context discriminators as a two-branch discriminator to judge the authenticity of the inpainting results: the global context discriminator guarantees the consistency of attributes between the inpainted area and the entire image, whereas the local context discriminator improves the detail quality of the inpainted area. Combining the reconstruction loss with the adversarial loss, we propose a joint loss that improves the performance of the model and reduces the difficulty of training. Drawing on the training scheme of generative adversarial networks, we alternately train the image inpainting network and the discriminative network, which yields good results. In practical applications, only the image inpainting network is used to repair defective images.
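To make the architecture described above concrete, the following is a minimal PyTorch sketch of an encoder-decoder generator with stacked dilated convolutions and a long skip connection. The layer counts, channel widths, and dilation rates are illustrative assumptions rather than the exact configuration used in the paper.

```python
# Minimal sketch of the inpainting generator: encoder-decoder with stacked
# dilated convolutions and a long skip connection. All sizes are assumptions.
import torch
import torch.nn as nn

class InpaintGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: downsample the masked input to a small feature map.
        self.enc1 = nn.Sequential(nn.Conv2d(4, 64, 5, stride=1, padding=2), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.enc3 = nn.Sequential(nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU())
        # Stacked dilated convolutions with increasing dilation rates enlarge the
        # receptive field on the small feature map without extra computation cost.
        self.dilated = nn.Sequential(
            nn.Conv2d(256, 256, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv2d(256, 256, 3, padding=4, dilation=4), nn.ReLU(),
            nn.Conv2d(256, 256, 3, padding=8, dilation=8), nn.ReLU(),
        )
        # Decoder: upsample back to the input resolution.
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU())
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU())
        self.out = nn.Conv2d(64 * 2, 3, 3, padding=1)  # 64 * 2: long-skip concatenation

    def forward(self, image, mask):
        # image: damaged RGB image in [0, 1]; mask: 1 where pixels are missing.
        x = torch.cat([image, mask], dim=1)
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        d = self.dilated(e3)
        d = self.dec1(d)
        d = self.dec2(d)
        # Long skip connection: pass low-level encoder features to the decoder
        # to preserve structural information and ease optimization.
        d = torch.cat([d, e1], dim=1)
        return torch.sigmoid(self.out(d))
```

Similarly, the joint loss can be sketched as an SSIM-based reconstruction term plus adversarial terms from the global and local context discriminators. The SSIM below uses simple average-pooling windows instead of Gaussian windows, and the weighting factor `lambda_adv` is an assumed hyperparameter, not a value reported in the paper.

```python
# Hedged sketch of the joint loss: (1 - SSIM) reconstruction term plus
# adversarial terms from the global and local context discriminators.
import torch
import torch.nn.functional as F

def ssim(x, y, window=11, c1=0.01 ** 2, c2=0.03 ** 2):
    # x, y: image batches scaled to [0, 1]; returns the mean SSIM over all pixels.
    pad = window // 2
    mu_x = F.avg_pool2d(x, window, stride=1, padding=pad)
    mu_y = F.avg_pool2d(y, window, stride=1, padding=pad)
    sigma_x = F.avg_pool2d(x * x, window, stride=1, padding=pad) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, window, stride=1, padding=pad) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, window, stride=1, padding=pad) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2)
    return (num / den).mean()

def generator_loss(fake_full, real_full, fake_patch, d_global, d_local, lambda_adv=4e-4):
    # Reconstruction term: 1 - SSIM between the completed image and the ground truth.
    rec = 1.0 - ssim(fake_full, real_full)
    # Adversarial term: the generator tries to make both context discriminators
    # classify its output (whole image and local patch around the hole) as real.
    g_logits = d_global(fake_full)
    l_logits = d_local(fake_patch)
    adv = F.binary_cross_entropy_with_logits(g_logits, torch.ones_like(g_logits)) \
        + F.binary_cross_entropy_with_logits(l_logits, torch.ones_like(l_logits))
    return rec + lambda_adv * adv
```

In an alternating training loop of the kind described above, the two discriminators would be updated with the standard real/fake binary cross-entropy objective and the generator with this joint loss; at inference time only the generator is run.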
Result To verify the effectiveness of the proposed image inpainting model, we compare its inpainting results with those of mainstream image inpainting algorithms on the CelebA-HQ dataset using both subjective perception and objective indicators. To ensure that each method achieves its best inpainting effect in the controlled experiments, we use the official code releases and examples, and obtain the inpainting results from the released pre-trained models or online demos. We place specific defect masks onto 50 randomly selected images as test cases, apply the different inpainting algorithms to repair them, and collect statistics for comparison. The CelebA-HQ dataset is a cropped and super-resolution reconstructed version of the CelebA dataset and contains 30 000 high-resolution face images. Human faces are a special class of images that contain both well-defined features and abundant detail, so face images can fully test the expressiveness of an image inpainting method. In the controlled experiments that consider the consistency of global and local image attributes, the results show that the proposed model achieves improvements in semantic rationality, structural coherence, and detail quality compared with the other algorithms. Subjectively, the regions inpainted by this model show natural edge transitions and rich detail. Objectively, the model achieves an average peak signal-to-noise ratio (PSNR) of 31.30 dB and an average SSIM of 90.58%, both of which exceed those of mainstream deep-learning-based image inpainting algorithms. To verify its generality, we also test the model on the Places2 dataset.

Conclusion This paper proposes a novel image inpainting model that shows improvements in network structure, loss constraints, training strategy, and inpainting results, and that develops a better understanding of the high-level semantics of images. With accurate context and details, the proposed model obtains inpainting results that are better in terms of human visual perception. In future work, we will continue to improve the inpainting effect and explore the conditional image inpainting task. We plan to further optimize the network structure and loss constraints so as to reduce the information lost during feature extraction while keeping network training controllable. We shall also try to make fuller use of the defect mask together with a channel-wise attention mechanism to further improve inpainting quality, and to analyze the relationship between image boundary structure and feature reconstruction. We aim to improve the convergence speed of training and the quality of inpainting with accurate and effective loss functions. Furthermore, we intend to use human-computer interaction or preset conditions to influence the inpainting results and thus explore more practical applications of the model.
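As a usage note for the objective comparison reported in the Result section, the sketch below shows one way the average PSNR and SSIM over a folder of inpainted test images could be computed with scikit-image (version 0.19 or later for the `channel_axis` argument). The directory names and file layout are hypothetical placeholders, not paths from the paper.

```python
# Minimal evaluation sketch: average PSNR/SSIM between inpainted results and
# ground-truth images that share the same file names. Paths are illustrative.
from pathlib import Path

import numpy as np
from skimage.io import imread
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(result_dir, gt_dir):
    psnrs, ssims = [], []
    for gt_path in sorted(Path(gt_dir).glob("*.png")):
        gt = imread(gt_path)
        out = imread(Path(result_dir) / gt_path.name)
        psnrs.append(peak_signal_noise_ratio(gt, out, data_range=255))
        # channel_axis=-1 averages SSIM over the RGB channels.
        ssims.append(structural_similarity(gt, out, data_range=255, channel_axis=-1))
    return float(np.mean(psnrs)), float(np.mean(ssims))

if __name__ == "__main__":
    psnr, ssim = evaluate("results/ours", "data/celeba_hq_test")  # hypothetical paths
    print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim * 100:.2f}%")
```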


