Super-Resolution Reconstruction Test (ESRGAN)

2024-06-16 03:46

Test link: GitHub - xinntao/BasicSR: Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also support StyleGAN2, DFDNet.

The repository above provides many models. Here I briefly test how ESRGAN performs in practice, using 500 face images of 1024*1024 pixels.

0. Environment

Try your own environment first; refer to mine only if that fails.

Python 3.8.12

Package                 Version        Location
----------------------- -------------- --------------------
absl-py                 0.14.1
addict                  2.4.0
basicsr                 1.3.4.4        /home/zc/wcs/BasicSR
cachetools              4.2.4
certifi                 2021.10.8
charset-normalizer      2.0.7
click                   8.0.3
cycler                  0.10.0
dataclasses             0.6
future                  0.18.2
google-auth             2.3.0
google-auth-oauthlib    0.4.6
grpcio                  1.41.0
idna                    3.3
imageio                 2.9.0
kiwisolver              1.3.2
lmdb                    1.2.1
Markdown                3.3.4
matplotlib              3.4.3
networkx                2.6.3
numpy                   1.21.2
oauthlib                3.1.1
opencv-contrib-python   4.3.0.38
opencv-python           4.5.3.56
Pillow                  8.4.0
pip                     21.2.4
protobuf                3.18.1
psutil                  5.8.0
pyasn1                  0.4.8
pyasn1-modules          0.2.8
pyparsing               2.4.7
python-dateutil         2.8.2
PyWavelets              1.1.1
PyYAML                  6.0
requests                2.26.0
requests-oauthlib       1.3.0
rsa                     4.7.2
scikit-image            0.18.3
scipy                   1.7.1
setuptools              58.0.4
six                     1.16.0
tb-nightly              2.8.0a20211015
tensorboard-data-server 0.6.1
tensorboard-plugin-wit  1.8.0
tifffile                2021.10.12
torch                   1.9.1+cu111
torchaudio              0.9.1
torchvision             0.10.1+cu111
tqdm                    4.62.3
typing-extensions       3.10.0.2
urllib3                 1.26.7
Werkzeug                2.0.2
wheel                   0.37.0
yapf                    0.31.0

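1. Data preparation

The training data is produced mainly by resizing. First, the 1024*1024 images are resized to 512*512 with nearest-neighbor interpolation and used as the high-resolution (ground-truth) images for training, because 1024*1024 is too large for my limited compute. The low-resolution images are also produced by resizing, down to 128*128, with Gaussian blur applied to lower the image quality. The results look like this:

High-resolution training image (512*512)

Degraded low-resolution simulated training image (128*128)

Note: artificially degraded images certainly differ from images that are low quality under natural conditions. Since I have no such data, this is the only way I can test for now.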
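The degradation described above can be sketched with OpenCV as follows. This is a minimal sketch under stated assumptions, not the exact script used for this test: the function names, the `*.png` file pattern, the Gaussian kernel size and sigma, the order of blurring and shrinking, and the interpolation used for the 128*128 image are all assumptions, since the text only specifies nearest-neighbor interpolation for the 1024 to 512 step.

```python
from pathlib import Path


def lr_side(hr_side: int, scale: int = 4) -> int:
    """Side length of the low-resolution image for a given HR side length."""
    if hr_side % scale != 0:
        raise ValueError(f"{hr_side} is not divisible by scale {scale}")
    return hr_side // scale


def make_pair(image, hr_side=512, scale=4, ksize=5, sigma=1.5):
    """Return (hr, lr) images built from one source image (H x W x C array)."""
    import cv2  # imported lazily so lr_side() works without OpenCV installed

    # 1024 -> 512 with nearest-neighbor interpolation, as described in the text.
    hr = cv2.resize(image, (hr_side, hr_side), interpolation=cv2.INTER_NEAREST)
    # ASSUMPTION: blur first, then shrink; kernel size and sigma are illustrative.
    blurred = cv2.GaussianBlur(hr, (ksize, ksize), sigma)
    side = lr_side(hr_side, scale)
    # ASSUMPTION: the LR resize also uses nearest-neighbor interpolation.
    lr = cv2.resize(blurred, (side, side), interpolation=cv2.INTER_NEAREST)
    return hr, lr


def build_dataset(src_dir, hr_dir, lr_dir, pattern="*.png"):
    """Write HR/LR pairs with identical file names into hr_dir and lr_dir."""
    import cv2

    hr_dir, lr_dir = Path(hr_dir), Path(lr_dir)
    hr_dir.mkdir(parents=True, exist_ok=True)
    lr_dir.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob(pattern)):
        image = cv2.imread(str(path))
        if image is None:  # unreadable file or not an image
            continue
        hr, lr = make_pair(image)
        cv2.imwrite(str(hr_dir / path.name), hr)
        cv2.imwrite(str(lr_dir / path.name), lr)
```

Calling `build_dataset` with a folder of source images and the two target folders (for example `face_sub` and `face_X2_sub`) would produce pairs that share file names, which is what a paired dataset loader typically expects.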

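The dataset comes from here: New Dataset

Training data directory

face_sub holds the 512*512 images and face_X2_sub holds the 128*128 images.

GT holds 512*512 images and LR holds 128*128 images, prepared the same way as above; they are used for validation during training.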
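Before training, it can help to confirm that every high-resolution file has a low-resolution partner. The sketch below assumes that, with `filename_tmpl: '{}'` as in the configuration used here, each low-quality file is expected to carry the same file name as its ground-truth counterpart; the helper name `unpaired_files` is hypothetical and is not part of BasicSR.

```python
from pathlib import Path


def unpaired_files(gt_dir, lq_dir):
    """Return (missing_in_lq, missing_in_gt) as sorted lists of file names."""
    gt_names = {p.name for p in Path(gt_dir).iterdir() if p.is_file()}
    lq_names = {p.name for p in Path(lq_dir).iterdir() if p.is_file()}
    # Names present on one side only indicate a broken pair.
    return sorted(gt_names - lq_names), sorted(lq_names - gt_names)
```

For example, `unpaired_files("datasets/train/face_sub", "datasets/train/face_X2_sub")` should return two empty lists when the folders are consistent; the same check applies to the GT and LR validation folders.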

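2. Modifying the configuration file

Since I am using ESRGAN, the corresponding configuration file is BasicSR/options/train/ESRGAN/train_ESRGAN_x4.yml. Note that the value 4 of the scale parameter in the configuration file is what the code requires of the training images: the high-quality images must be 4 times the size of the low-quality images in each dimension (512/128=4).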
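To make the size requirement concrete, the following sketch checks one image pair against the scale and works out the low-quality patch size that corresponds to a ground-truth crop. It only mirrors the kind of consistency check a paired loader performs and is not BasicSR's own code; the function name and the default `gt_size=128` (taken from the configuration used here) are illustrative.

```python
def lq_patch_size(gt_hw, lq_hw, scale=4, gt_size=128):
    """Validate a GT/LQ size pair and return the LQ patch side length."""
    h_gt, w_gt = gt_hw
    h_lq, w_lq = lq_hw
    # The GT image must be exactly `scale` times the LQ image in both dimensions.
    if (h_gt, w_gt) != (h_lq * scale, w_lq * scale):
        raise ValueError(
            f"GT {gt_hw} is not {scale}x the LQ size {lq_hw}"
        )
    if gt_size % scale != 0:
        raise ValueError(f"gt_size {gt_size} is not divisible by scale {scale}")
    patch = gt_size // scale
    # The LQ image must be large enough to crop one patch from it.
    if h_lq < patch or w_lq < patch:
        raise ValueError(f"LQ image {lq_hw} is smaller than the patch size {patch}")
    return patch
```

With 512*512 and 128*128 images, `lq_patch_size((512, 512), (128, 128))` gives 32: a 128*128 ground-truth crop pairs with a 32*32 low-quality crop. A pair such as 512*512 and 256*256 would be rejected under a scale of 4.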

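The main things to change are the paths: dataroot_gt is the path to the high-quality images, dataroot_lq is the path to the low-quality images, and pretrain_network_g is the pretrained model, which the author provides. My training configuration is as follows: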

# general settings
name: 052_ESRGAN_x4_f64b23_DIV2K_400k_B16G1_051pretrain_wandb
model_type: ESRGANModel
scale: 4
num_gpu: 1  # set num_gpu: 0 for cpu mode
manual_seed: 0

# dataset and data loader settings
datasets:
  train:
    name: DIV2K
    type: PairedImageDataset
    dataroot_gt: /home/zc/wcs/BasicSR/datasets/train/face_sub
    dataroot_lq: /home/zc/wcs/BasicSR/datasets/train/face_X2_sub
    # (for lmdb)
    # dataroot_gt: datasets/DIV2K/DIV2K_train_HR_sub.lmdb
    # dataroot_lq: datasets/DIV2K/DIV2K_train_LR_bicubic_X4_sub.lmdb
    filename_tmpl: '{}'
    io_backend:
      type: disk
      # (for lmdb)
      # type: lmdb

    gt_size: 128
    use_flip: true
    use_rot: true

    # data loader
    use_shuffle: true
    num_worker_per_gpu: 6
    batch_size_per_gpu: 8
    dataset_enlarge_ratio: 100
    prefetch_mode: ~

  val:
    name: Set14
    type: PairedImageDataset
    dataroot_gt: /home/zc/wcs/BasicSR/datasets/val/GT
    dataroot_lq: /home/zc/wcs/BasicSR/datasets/val/LR
    io_backend:
      type: disk

# network structures
network_g:
  type: RRDBNet
  num_in_ch: 3
  num_out_ch: 3
  num_feat: 64
  num_block: 23

network_d:
  type: VGGStyleDiscriminator128
  num_in_ch: 3
  num_feat: 64

# path
path:
  pretrain_network_g: experiments/ESRGAN_PSNR_SRx4_DF2K_official-150ff491.pth
  strict_load_g: true
  resume_state: ~

# training settings
train:
  ema_decay: 0.999
  optim_g:
    type: Adam
    lr: !!float 1e-4
    weight_decay: 0
    betas: [0.9, 0.99]
  optim_d:
    type: Adam
    lr: !!float 1e-4
    weight_decay: 0
    betas: [0.9, 0.99]

  scheduler:
    type: MultiStepLR
    milestones: [50000, 100000, 200000, 300000]
    gamma: 0.5

  total_iter: 400000
  warmup_iter: -1  # no warm up

  # losses
  pixel_opt:
    type: L1Loss
    loss_weight: !!float 1e-2
    reduction: mean
  perceptual_opt:
    type: PerceptualLoss
    layer_weights:
      'conv5_4': 1  # before relu
    vgg_type: vgg19
    use_input_norm: true
    range_norm: false
    perceptual_weight: 1.0
    style_weight: 0
    criterion: l1
  gan_opt:
    type: GANLoss
    gan_type: vanilla
    real_label_val: 1.0
    fake_label_val: 0.0
    loss_weight: !!float 5e-3

  net_d_iters: 1
  net_d_init_iters: 0

# validation settings
val:
  val_freq: !!float 5e3
  save_img: true

  metrics:
    psnr:  # metric name, can be arbitrary
      type: calculate_psnr
      crop_border: 4
      test_y_channel: false

# logging settings
logger:
  print_freq: 100
  save_checkpoint_freq: !!float 5e3
  use_tb_logger: true
  wandb:
    project: ~
    resume_id: ~

# dist training settings
dist_params:
  backend: nccl
  port: 29500
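The validation section of this configuration reports PSNR with `crop_border: 4` and `test_y_channel: false`, that is, computed over all color channels after discarding a 4-pixel border. The sketch below illustrates that computation in plain Python on nested lists so it runs without any image library. It is an illustration of the metric under these settings, not BasicSR's `calculate_psnr` implementation, and the image representation (rows of pixels, each pixel a tuple of channel values in 0..255) is an assumption made for the example.

```python
import math


def psnr(img_a, img_b, crop_border=4, max_value=255.0):
    """PSNR in dB between two same-sized images given as rows of pixel tuples."""
    same_rows = len(img_a) == len(img_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(img_a, img_b)):
        raise ValueError("images must have the same shape")
    if crop_border > 0:
        # Drop `crop_border` pixels on every side before comparing.
        img_a = [row[crop_border:-crop_border] for row in img_a[crop_border:-crop_border]]
        img_b = [row[crop_border:-crop_border] for row in img_b[crop_border:-crop_border]]
    squared_error = 0.0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            for ch_a, ch_b in zip(px_a, px_b):
                squared_error += (ch_a - ch_b) ** 2
                count += 1
    if count == 0:
        raise ValueError("nothing left to compare after cropping")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the border is cropped first, differences confined to the outermost 4 pixels do not affect the score. For real images an array library would be far faster; the nested-list form is only meant to show the arithmetic.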

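3. Training

With the above in place, training can start. Following the instructions in the author's BasicSR/TrainTest.md at master · xinntao/BasicSR · GitHub, and since I have only one GPU, I ran:

CUDA_VISIBLE_DEVICES=0 python basicsr/train.py -opt options/train/ESRGAN/train_ESRGAN_x4.yml

By default this trains for 400,000 iterations. The trained models and the training logs are created automatically in the directory shown below.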
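To see what 400,000 iterations mean for the learning rate, the sketch below evaluates the multi-step schedule from the configuration (base rate 1e-4, milestones at 50k, 100k, 200k and 300k iterations, gamma 0.5). It uses the closed form of a multi-step decay, where every milestone already reached multiplies the rate by gamma; the function name is illustrative.

```python
from bisect import bisect_right

# Values copied from the scheduler section of the training configuration.
BASE_LR = 1e-4
MILESTONES = (50000, 100000, 200000, 300000)
GAMMA = 0.5
TOTAL_ITER = 400000


def learning_rate_at(iteration, base_lr=BASE_LR, milestones=MILESTONES, gamma=GAMMA):
    """Learning rate of a multi-step schedule at the given iteration."""
    # bisect_right counts the milestones that are <= iteration.
    return base_lr * gamma ** bisect_right(milestones, iteration)


for it in (0, 50000, 100000, 200000, 300000, TOTAL_ITER - 1):
    print(f"iter {it:>6}: lr = {learning_rate_at(it):.3e}")
```

The rate is halved four times over the run, ending at 6.25e-6. If training is cut short because of limited compute, the milestones would presumably need to be scaled down along with `total_iter`.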

4. Prediction

There is a dedicated prediction script, inference/inference_esrgan.py. Only the paths need to be changed: the model, the input path, and the output path.
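Choosing the model path usually means picking the most recent generator checkpoint from training. The sketch below assumes checkpoints are saved as `net_g_<iteration>.pth` inside a models folder under the experiment directory; that layout and the helper name are assumptions, so check them against your own output directory.

```python
import re
from pathlib import Path


def latest_generator_checkpoint(models_dir):
    """Return the net_g_<iteration>.pth file with the highest iteration, or None."""
    pattern = re.compile(r"net_g_(\d+)\.pth$")
    best_iter, best_path = -1, None
    for path in Path(models_dir).glob("net_g_*.pth"):
        match = pattern.match(path.name)
        if match is None:  # e.g. a non-numeric suffix
            continue
        iteration = int(match.group(1))
        # Compare numerically: 10000 must rank above 5000.
        if iteration > best_iter:
            best_iter, best_path = iteration, path
    return best_path
```

The returned path can then be used as the model path in the prediction script. Comparing iterations as integers matters because a plain string sort would place `net_g_5000.pth` after `net_g_10000.pth`.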

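5. Results

Original high-quality image, 1024*1024

High-quality test image (512*512)

Low-quality test image with Gaussian blur, 128*128

ESRGAN prediction (512*512)

Low-quality test image with Gaussian blur, resized to 512*512 (simply enlarging an image by resizing cannot make it any sharper)

Comparing them, there are still differences in the details. However, this data consists of very frontal faces, and the simulated degradation is only Gaussian blur. When I tried an arbitrary image from the web, the result was poor, possibly because the image is not such a tidy frontal portrait and the training samples contain nothing similar. This needs further exploration.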

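Original image, test image, and result image (from another, similar dataset)

Testing on portraits of foreign faces, the improvement is still fairly noticeable, especially in the reconstruction of the eyes.

The image above is a screenshot of an arbitrary picture found on the web, and the reconstruction is clearly poor. I do not think the model is at fault, though: the training data is only my own artificial simulation, and the model behaves normally on the artificially simulated dataset. I expect that collecting images from the web and fine-tuning from the model trained on synthetic data would perform better. Also, I did not change any training parameters and do not know whether they are appropriate, which is another factor. This, too, needs further exploration.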
