
PyTorch


Preface

During the recent 618 sale I picked up a laptop with an NVIDIA MX250 GPU. I wanted to learn CUDA, so I deliberately chose a machine with a discrete card; the MX250 is only an entry-level GPU, but it is good enough for learning. GPUs compute in parallel and hold a large advantage over CPUs for deep-learning training and inference (even an entry-level GPU still outclasses a CPU here), so I benchmarked several common CNN models with PyTorch on both the Intel Core i7-10510U CPU and the NVIDIA MX250 GPU.
Result: the entry-level MX250 delivers roughly 2-3x the inference performance of the i7-10510U.

Requirement

- Ubuntu 18.04.4
- PyTorch 1.5.1 (with CUDA)
- torchvision
- CUDA 11.0

Device

- NVIDIA MX250 GPU


- Intel Core i7-10510U CPU (4 cores, 8 threads)
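Before benchmarking, it is worth confirming that PyTorch actually sees the GPU. A minimal sanity-check sketch (not part of the original post; any CUDA-enabled PyTorch build should behave the same):

```python
import torch

print(torch.__version__)          # e.g. 1.5.1
print(torch.version.cuda)         # CUDA version PyTorch was built against
print(torch.cuda.is_available())  # True if the GPU is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. 'GeForce MX250'
```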

CNN Model Benchmarks

Model List

- AlexNet
- ResNet-50
- ResNet-18
- ResNet-101
- MobileNet-v2
- SqueezeNet1-1

Test Method

- warm_up = 3: run a few warm-up iterations first, so the large first-run overhead does not skew the measurement
- loops = 10: run each model 10 times and report the average time
- Measure each model on both the CPU and the GPU. Note that CUDA executes asynchronously, so the GPU timing code must call torch.cuda.synchronize() before reading the clock.

Results (batch size 1, average inference time)

Model           CPU (ms)   MX250 (ms)   Speedup
AlexNet         30.02      10.46        2.9x
ResNet-50       80.41      28.37        2.8x
ResNet-18       31.62      11.45        2.8x
ResNet-101      124.81     51.12        2.4x
MobileNet-v2    18.62      6.74         2.8x
SqueezeNet1-1   15.98      3.64         4.4x

Test Code

```python
import time

import numpy as np
import torch
import torchvision.models as models


def test_on_device(model, dummy_inputs, warm_up, loops, device_type):
    # Measure the average inference time (ms) over `loops` runs,
    # after `warm_up` warm-up runs that are discarded.
    if device_type == 'cuda':
        assert torch.cuda.is_available()
    device = torch.device(device_type)
    model.to(device)
    model.eval()
    dummy_inputs = dummy_inputs.to(device)
    executions = []
    with torch.no_grad():
        for _ in range(warm_up + loops):
            if device_type == 'cuda':
                torch.cuda.synchronize()  # drain pending GPU work before starting the clock
            start = time.time()
            _ = model(dummy_inputs)
            if device_type == 'cuda':
                torch.cuda.synchronize()  # CUDA is asynchronous: wait for the kernels to finish
            end = time.time()
            executions.append((end - start) * 1000)  # ms
    return np.mean(executions[warm_up:])  # drop the warm-up runs from the average


if __name__ == "__main__":
    model_list = {
        'AlexNet': models.alexnet(),
        'ResNet-50': models.resnet50(),
        'ResNet-18': models.resnet18(),
        'ResNet-101': models.resnet101(),
        'MobileNet-v2': models.mobilenet_v2(),
        'SqueezeNet1-1': models.squeezenet1_1(),
    }
    batch_size = 1
    for device_type in ('cpu', 'cuda'):
        print('-' * 10 + device_type + '-' * 10)
        for name, model in model_list.items():
            print('=' * 10 + name + '=' * 10)
            avg_time = test_on_device(model=model,
                                      dummy_inputs=torch.rand(batch_size, 3, 224, 224),
                                      warm_up=3, loops=10,
                                      device_type=device_type)
            print(f'Avg time:{avg_time} ms')
```
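As an aside, GPU timing can also be done with torch.cuda.Event, which records timestamps on the CUDA stream itself rather than relying on host-side time.time() plus explicit synchronization. A minimal sketch (not part of the original benchmark; it assumes `model` and `inputs` are already on a CUDA device):

```python
import torch

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

start.record()                 # enqueue a timestamp on the CUDA stream
with torch.no_grad():
    _ = model(inputs)
end.record()                   # enqueue the second timestamp

torch.cuda.synchronize()       # wait until both events have completed
print(f'Elapsed: {start.elapsed_time(end)} ms')  # elapsed_time() returns milliseconds
```

Both approaches should agree closely for whole-model inference; events are mainly useful when timing individual stages without forcing a full device sync between them.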

