目标检测中NMS的GPU实现(来自于Faster R 您所在的位置:网站首页 nms是做啥的 目标检测中NMS的GPU实现(来自于Faster R

目标检测中NMS的GPU实现(来自于Faster R

2024-06-29 16:56| 来源: 网络整理| 查看: 265

最近要修改Faster R-CNN中实现的GPU版的NMS代码,于是小白的我就看起了CUDA编程,当然也只是浅显地阅读一些教程,快速入门而已,所以具体需要注意的以及一些思想,大家移步此博主的系列教程:

在了解了CUDA编程的核心思想后,我们便可以开始阅读nms_kernel.cu文件了,先直接上源码(部分简单的已经注释),如下:

// ------------------------------------------------------------------ // Faster R-CNN // Copyright (c) 2015 Microsoft // Licensed under The MIT License [see fast-rcnn/LICENSE for details] // Written by Shaoqing Ren // ------------------------------------------------------------------ #include "gpu_nms.hpp" #include #include //cudaError_t是cuda中的一个类,用于记录cuda错误(所有的cuda函数,几乎都会返回一个cudaError_t) #define CUDA_CHECK(condition) \ /* Code block avoids redefinition of cudaError_t error */ \ do { \ cudaError_t error = condition; \ if (error != cudaSuccess) { \ std::cout col_start) return; const int row_size = min(n_boxes - row_start * threadsPerBlock, threadsPerBlock); const int col_size = min(n_boxes - col_start * threadsPerBlock, threadsPerBlock); __shared__ float block_boxes[threadsPerBlock * 5]; //共享内存 if (threadIdx.x < col_size) { block_boxes[threadIdx.x * 5 + 0] = dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0]; block_boxes[threadIdx.x * 5 + 1] = dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 1]; block_boxes[threadIdx.x * 5 + 2] = dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2]; block_boxes[threadIdx.x * 5 + 3] = dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3]; block_boxes[threadIdx.x * 5 + 4] = dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4]; } __syncthreads(); //同步线程 if (threadIdx.x < row_size) { const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x; const float *cur_box = dev_boxes + cur_box_idx * 5; int i = 0; unsigned long long t = 0; int start = 0; if (row_start == col_start) { start = threadIdx.x + 1; } for (i = start; i < col_size; i++) { if (devIoU(cur_box, block_boxes + i * 5) > nms_overlap_thresh) { t |= 1ULL nms_overlap_thresh) { t |= 1ULL


【本文地址】

公司简介

联系我们

今日新闻

    推荐新闻

    专题文章
      CopyRight 2018-2019 实验室设备网 版权所有