改进参考魔鬼导师:YOLOV7改进-添加注意力机制_哔哩哔哩_bilibili

视频教程:YOLOV7改进-添加注意力机制_哔哩哔哩_bilibili

GitHub改进项目地址:其中的cv_attentionGitHub - z1069614715/objectdetection_script: 一些关于目标检测的脚本的改进思路代码,详细请看readme.md

学习内容:

问:

涉及:

一些基本概念:通道数,卷积核大小,卷积步长:

一些基本概念_To-的博客-CSDN博客  中进行解释

 根据视频--添加注意力机制的步骤:

1.打开Yolov7/cfg/training/yolov7.yaml 模型配置文件

   yolov7.yaml配置文件中存储这yolov7模型的一百多层卷积结构,通过代码注释和模型结构图搭配理解

# parameters
nc: 1  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov7 backbone
backbone:
  # [from, number, module, args]     分别表示: 输入,重复次数,名称,参数数组
  [[-1, 1, Conv, [32, 3, 1]],  # 0 ##CBS  
  
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2 ##CBS      
   [-1, 1, Conv, [64, 3, 1]],  # ##CBS   实例:第一个-1表示以上一层的输出作为本层的输入,-1表示向前退一层,-2表示向前退两层,以此类推
   
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4 ##CBS # 实例: 第二个1表示,该层操作一次
   [-1, 1, Conv, [64, 1, 1]],             # 实例: Conv表示该层操作的名称
   [-2, 1, Conv, [64, 1, 1]],   # [64, 1, 1]: 64表示输出特征图的通道数,来控制特征图的深度
   [-1, 1, Conv, [64, 3, 1]],   # [64, 3, 1]: 3表示卷积核大小,以3*3的卷积核对输入图像进行卷积,以提取图像特征
   [-1, 1, Conv, [64, 3, 1]],   # [64, 3, 1]: 1表示卷积步长,步长控制了每次卷积操作后特征图的大小变化
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 11  ##C7_1
         
   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 16-P3/8  ##MP-C3
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 24  ##C7_1 待插入点
         
   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 29-P4/16  ##MP-C3
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 37  ##C7_1  待插入点
         
   [-1, 1, MP, []],
   [-1, 1, Conv, [512, 1, 1]],
   [-3, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 42-P5/32  ##MP-C3
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 50  ##C7_1
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [512]], # 51  ##SPPCSPC  待插入点
  
   [-1, 1, Conv, [256, 1, 1]],   ##CBS
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  ##上采样
   [37, 1, Conv, [256, 1, 1]], # route backbone P4  ##CBS
   [[-1, -2], 1, Concat, [1]],
   
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]], # 63  ##C7_2
   
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [24, 1, Conv, [128, 1, 1]], # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]], # 75  ##C7_2
      
   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3, 63], 1, Concat, [1]],
   
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]], # 88   ##C7_2
      
   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3, 51], 1, Concat, [1]],
   
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]], # 101  ##C7_2
   
   [75, 1, RepConv, [256, 3, 1]],
   [88, 1, RepConv, [512, 3, 1]],
   [101, 1, RepConv, [1024, 3, 1]],

   [[102,103,104], 1, IDetect, [nc, anchors]],   # Detect(P3, P4, P5)
  ]

相应的yolov7网络模型结构图:

在yolov7中的三个位置,如图所示:C7_1,C7_1,SPPCSPC后的三个特征层输出的位置添加注意力机制 = 等于对这三个模块用带有注意力机制的模块进行替换

在魔鬼老师的GitHub上找到要添加的注意力机制模块代码:GitHub - z1069614715/objectdetection_script: 一些关于目标检测的脚本的改进思路代码,详细请看readme.md

 

Logo

旨在为数千万中国开发者提供一个无缝且高效的云端环境,以支持学习、使用和贡献开源项目。

更多推荐