How do you evaluate SenseTime officially open-sourcing mmcv and mmdetection?


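The setup mainly consists of three parts. open-mmlab abstracts the whole model training and validation workflow, so if you are not familiar with the basic flow of training and validating a deep learning model, this post may be hard to follow.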

Training configuration

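To train our own model, first find train.py in the tools folder, make a copy of it, and give the copy a name of your own. The benefit is that the original train.py stays untouched. Then make the following changes in the copy.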

Select the model:

parser.add_argument('--config', default='../configs/xxx/deeplabv3_r50-d8_512x1024_40k_xxx.py', help='train config file path')

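First change config to --config, which turns the argument from required (positional) into optional. Then create a folder of your own under configs and copy the config you want to use into it. The xxx in deeplabv3_r50-d8_512x1024_40k_xxx.py is likewise there to tell your copy apart from the original. If you are not customizing much, or are not using your own dataset, this is all the config file needs.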
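To make the difference concrete, here is a minimal sketch using only the standard argparse module. The function name build_parser is made up for this example; the default path is the placeholder from the post.

```python
import argparse


def build_parser():
    """Build a parser where the config path is optional (illustrative helper)."""
    parser = argparse.ArgumentParser(description='Train a segmentor')
    # A name starting with '--' plus a default makes the argument optional.
    # The positional form add_argument('config', ...) would be required.
    parser.add_argument(
        '--config',
        default='../configs/xxx/deeplabv3_r50-d8_512x1024_40k_xxx.py',
        help='train config file path')
    return parser


# No arguments given: the default config path is used.
args = build_parser().parse_args([])
print(args.config)

# The default can still be overridden on the command line.
override = build_parser().parse_args(['--config', 'other.py'])
print(override.config)
```

With this change the copied script can be launched from an IDE without typing the config path every time.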

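In deeplabv3_r50-d8_512x1024_40k_, deeplabv3 is the network (the segmentation method) and r50 is the backbone, here ResNet-50. d8 refers to the backbone's output stride of 8, 512x1024 is the input (crop) size, and 40k is the number of training iterations (40,000), not epochs.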
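The naming scheme can be checked with a small parser. This is only a sketch: parse_config_name is a hypothetical helper, not part of mmsegmentation, and it assumes names follow exactly the method_backbone-variant_size_schedule_dataset pattern seen in this post, reading 40k as 40,000 iterations.

```python
import re

# Pattern for names like 'deeplabv3_r50-d8_512x1024_40k_cityscapes'.
_NAME_PATTERN = re.compile(
    r'^(?P<method>[^_]+)_(?P<backbone>[^_-]+)-(?P<variant>[^_]+)_'
    r'(?P<size_a>\d+)x(?P<size_b>\d+)_(?P<iters>\d+)k_(?P<dataset>.+)$')


def parse_config_name(name):
    """Split a config name into its parts (illustrative helper)."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError('unexpected config name: ' + name)
    return {
        'method': match.group('method'),
        'backbone': match.group('backbone'),
        'variant': match.group('variant'),
        # Kept in the order the two numbers appear in the name.
        'crop_size': (int(match.group('size_a')), int(match.group('size_b'))),
        'iters': int(match.group('iters')) * 1000,
        'dataset': match.group('dataset'),
    }


info = parse_config_name('deeplabv3_r50-d8_512x1024_40k_cityscapes')
print(info)
```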

Specify the working directory:

parser.add_argument('--work-dir', default='../tools/data/sample/results', help='the dir to save logs and models')

Log files and intermediate training checkpoints are saved here.

Specify the pretrained model to load:

parser.add_argument('--load-from', default='../checkpoints/deeplabv3_r50-d8_512x1024_40k_cityscapes_20200605_022449-acadc2f8.pth', help='the checkpoint file to load weights from')

Note that the checkpoint file deeplabv3_r50-d8_512x1024_40k_cityscapes must correspond to the model config file.
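One way to catch a mismatch early is to compare file names before training starts. The helper below is hypothetical and only compares the first four name fields (method, backbone, size, schedule); it assumes both files follow the naming used in this post and says nothing about whether the weights actually fit the model.

```python
import os


def checkpoint_matches_config(checkpoint_path, config_path):
    """Return True if the two file names share the same leading fields."""
    ckpt_name = os.path.basename(checkpoint_path)
    cfg_name = os.path.splitext(os.path.basename(config_path))[0]
    # Keep method, backbone, size and schedule; the dataset suffix may differ.
    prefix = '_'.join(cfg_name.split('_')[:4])
    return ckpt_name.startswith(prefix)


ckpt = ('../checkpoints/deeplabv3_r50-d8_512x1024_40k_cityscapes_'
        '20200605_022449-acadc2f8.pth')
same = checkpoint_matches_config(
    ckpt, '../configs/xxx/deeplabv3_r50-d8_512x1024_40k_xxx.py')
different = checkpoint_matches_config(
    ckpt, '../configs/xxx/pspnet_r50-d8_512x1024_40k_xxx.py')
print(same, different)
```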

Model configuration

Next, let's take a quick look at the config file.

_base_ = [
    '../_base_/models/deeplabv3_r50-d8_xxx.py',  # custom base model config
    '../_base_/datasets/cityscapes_xxx.py',  # custom dataset config
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_20k_xxx.py'  # schedule file
]

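The config file is built from these parts. The model config sets parameters related to the model structure, the dataset config sets data-related parameters, and the schedule config sets parameters of the training process.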
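The idea of stacking base files can be pictured as a recursive dictionary merge, where later values override earlier ones. The sketch below is not mmcv's implementation (its Config class does more than this); merge_config and the sample values are made up for illustration.

```python
def merge_config(base, override):
    """Recursively merge override into a copy of base (illustrative only)."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them field by field.
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the overriding value simply wins.
            merged[key] = value
    return merged


# Illustrative values, not taken from any real base file.
base_model = {'model': {'decode_head': {'type': 'ASPPHead', 'num_classes': 19}}}
custom = {'model': {'decode_head': {'num_classes': 11}}}

cfg = merge_config(base_model, custom)
print(cfg['model']['decode_head'])
```

Only the overridden field changes; everything else is inherited from the base dictionary.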

Next, let's look at each of these files in detail.

First, the model file:

deeplabv3_r50-d8_xxx.py

# model settings
norm_cfg = dict(type='BN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained='open-mmlab://resnet50_v1c',
    backbone=dict(
        type='ResNetV1c',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        dilations=(1, 1, 2, 4),
        strides=(1, 2, 1, 1),
        norm_cfg=norm_cfg,
        norm_eval=False,
        style='pytorch',
        contract_dilation=True),
    decode_head=dict(
        type='ASPPHead',
        in_channels=2048,
        in_index=3,
        channels=512,
        dilations=(1, 12, 24, 36),
        dropout_ratio=0.1,
        num_classes=11,  # change the number of classes
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=1024,
        in_index=2,
        channels=256,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=11,  # change the number of classes
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
    # model training and testing settings
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))

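This file is the network architecture config. type names a class that has been registered with a registry, so the class can be looked up from type. You can also define your own model and register it with the Registry. The entries that follow type are the parameters of that model. The num_classes parameter in decode_head and auxiliary_head must be changed to match the number of classes in the dataset.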
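The registry mechanism can be sketched in a few lines. This is a simplified toy, not mmcv's Registry class: the Registry below, the HEADS instance and ToyHead are all written for this example, and only the idea (look up a class by its type string, pass the remaining keys as arguments) carries over.

```python
class Registry:
    """Toy registry mapping class names to classes (not mmcv's Registry)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        """Return a decorator that records the class under its own name."""
        def decorator(cls):
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        """Instantiate the class named by cfg['type'] with the other keys."""
        args = dict(cfg)  # copy so the caller's config is left untouched
        cls = self._modules[args.pop('type')]
        return cls(**args)


HEADS = Registry('heads')


@HEADS.register_module()
class ToyHead:
    def __init__(self, in_channels, num_classes):
        self.in_channels = in_channels
        self.num_classes = num_classes


head = HEADS.build(dict(type='ToyHead', in_channels=2048, num_classes=11))
print(type(head).__name__, head.num_classes)
```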

If you train on a single GPU, change type='SyncBN' to type='BN', and also set num_classes to the number of classes in your custom dataset.
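In the config file this is a one-line edit to norm_cfg. If the config is already loaded as nested dictionaries, the same change can be applied recursively. The helper below is a hypothetical sketch on plain Python dicts, not a call into mmcv, and the sample model dict is a cut-down illustration.

```python
def replace_norm_type(cfg, old='SyncBN', new='BN'):
    """Return a copy of cfg with every type=old entry switched to type=new."""
    if isinstance(cfg, dict):
        return {
            key: new if key == 'type' and value == old
            else replace_norm_type(value, old, new)
            for key, value in cfg.items()
        }
    if isinstance(cfg, (list, tuple)):
        # Rebuild the same container type so tuples stay tuples.
        return type(cfg)(replace_norm_type(item, old, new) for item in cfg)
    return cfg


model = dict(
    type='EncoderDecoder',
    backbone=dict(type='ResNetV1c', dilations=(1, 1, 2, 4),
                  norm_cfg=dict(type='SyncBN', requires_grad=True)),
    decode_head=dict(type='ASPPHead',
                     norm_cfg=dict(type='SyncBN', requires_grad=True)))

single_gpu_model = replace_norm_type(model)
print(single_gpu_model['backbone']['norm_cfg'])
```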

Dataset file

# dataset settings
dataset_type = 'MyCustomDataset'
data_root = 'data/sample/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 1024)
train_pipeline = [
    dict(type='LoadImageFromMyCustomFile'),
    dict(type='LoadMyCustomAnnotations'),
    dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
test_pipeline = [
    dict(type='LoadImageFromMyCustomFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 1024),
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,  # setting this to 1 can cause problems
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='images/train',
        ann_dir='masks/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='images/train',
        ann_dir='masks/train',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='images/train',
        ann_dir='masks/train',
        pipeline=test_pipeline))

img_dir='images/train', ann_dir='masks/train',

These entries specify the paths from which the training, validation, and test sets are loaded, along with the pipeline for each.
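Before training, it can help to confirm that every image under img_dir has a mask under ann_dir. The sketch below uses only the standard library. pair_images_and_masks is a hypothetical helper, and the '.jpg' and '.png' suffixes are assumptions about how the files are named, so adjust them to your data.

```python
import tempfile
from pathlib import Path


def pair_images_and_masks(data_root, img_dir, ann_dir,
                          img_suffix='.jpg', seg_suffix='.png'):
    """Pair each image with the mask that shares its file stem."""
    root = Path(data_root)
    pairs, missing = [], []
    for img_path in sorted((root / img_dir).glob('*' + img_suffix)):
        mask_path = root / ann_dir / (img_path.stem + seg_suffix)
        if mask_path.exists():
            pairs.append((img_path.name, mask_path.name))
        else:
            missing.append(img_path.name)
    return pairs, missing


# Build a tiny throwaway layout that mirrors img_dir / ann_dir above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'images/train').mkdir(parents=True)
    (root / 'masks/train').mkdir(parents=True)
    for name in ('a.jpg', 'b.jpg'):
        (root / 'images/train' / name).touch()
    (root / 'masks/train' / 'a.png').touch()
    pairs, missing = pair_images_and_masks(root, 'images/train', 'masks/train')

print(pairs, missing)
```

Images listed in missing have no matching mask and would need fixing before they can be used for training.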

Training schedule configuration

# optimizer
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
# learning rate policy
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
# runtime settings
runner = dict(type='IterBasedRunner', max_iters=20000)
# save checkpoints
checkpoint_config = dict(by_epoch=False, interval=500)
evaluation = dict(interval=500, metric='mIoU', pre_eval=True)
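To get a feel for what these numbers imply, the sketch below evaluates the usual polynomial decay formula with the values from the config. It assumes the 'poly' policy decays from lr down to min_lr over max_iters; poly_lr is a standalone function written for this example, not a call into mmcv.

```python
def poly_lr(iteration, base_lr=0.01, min_lr=1e-4, power=0.9, max_iters=20000):
    """Polynomial decay from base_lr to min_lr (assumed form of 'poly')."""
    coeff = (1 - iteration / max_iters) ** power
    return (base_lr - min_lr) * coeff + min_lr


# Learning rate at the start, the midpoint and the end of training.
for it in (0, 10000, 20000):
    print(it, poly_lr(it))

# With interval=500 and max_iters=20000, checkpoints are written this often.
num_checkpoints = 20000 // 500
print(num_checkpoints)
```

The rate starts at lr, falls smoothly, and ends at min_lr on the final iteration.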
