Introduction

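An intuitive way to think about computation (FLOPs) and parameter count (Params): computation corresponds to time complexity, and parameter count corresponds to space complexity. Computation affects how long the network takes to run, and parameter count affects how much GPU memory it occupies.

Computation: FLOPs (lowercase s) stands for floating-point operations, i.e. the number of floating-point operations a model needs for one forward pass. It is the standard measure of a network's computational cost. It should not be confused with FLOPS (uppercase S), floating-point operations per second, which measures hardware speed. Lower is better.

Parameter count: Params is the total number of trainable parameters in the network. Lower is better.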
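To make the two definitions concrete, here is a minimal sketch that works out both numbers by hand for a single convolution layer. The helper names are made up for illustration, the formulas assume a square kernel with groups=1, and the computation is counted as multiply-accumulate operations (MACs) with the bias ignored. The layer settings are those of the first convolution in torchvision's AlexNet.

```python
def conv2d_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def conv2d_params(c_in, c_out, kernel, bias=True):
    """Trainable parameters of a square-kernel Conv2d (groups=1)."""
    weights = c_out * c_in * kernel * kernel
    return weights + (c_out if bias else 0)


def conv2d_macs(c_in, c_out, kernel, out_h, out_w):
    """Multiply-accumulate operations for one forward pass (batch size 1, bias ignored)."""
    return c_out * c_in * kernel * kernel * out_h * out_w


# First convolution of torchvision's AlexNet: 3 -> 64 channels, 11x11 kernel, stride 4, padding 2
out = conv2d_output_size(224, kernel=11, stride=4, padding=2)
params = conv2d_params(3, 64, 11)
macs = conv2d_macs(3, 64, 11, out, out)
print(out, params, macs)  # 55 23296 70276800
```

The parameter count depends only on the layer definition, while the computation also scales with the output resolution, which is why the input size passed to a profiling tool matters.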


With these concepts in place, the next step is to compute the two values.
A very common way to do this is with the ptflops package.

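# -*- coding: utf-8 -*-
import torchvision
from ptflops import get_model_complexity_info

# Build AlexNet without pretrained weights; only the architecture matters for counting
model = torchvision.models.alexnet(pretrained=False)
# (3, 224, 224) is the input shape without the batch dimension.
# Note: ptflops reports the first value as multiply-accumulate operations (MACs).
flops, params = get_model_complexity_info(model, (3, 224, 224), as_strings=True, print_per_layer_stat=True)
print('flops: ', flops, 'params: ', params)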

This snippet is essentially plug-and-play.

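DAB-DETR model

Taking DAB-DETR as an example, I hit an error at run time. It was caused by a mismatch between the checkpoint (weight file) and the model configuration file.

Checkpoint and model configuration mismatch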

RuntimeError: Error(s) in loading state_dict for DABDeformableDETR:
	size mismatch for input_proj.0.0.weight: copying a param with shape torch.Size([256, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 1, 1]).
	size mismatch for input_proj.1.0.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 256, 1, 1]).
	size mismatch for input_proj.2.0.weight: copying a param with shape torch.Size([256, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 512, 1, 1]).
	size mismatch for input_proj.3.0.weight: copying a param with shape torch.Size([256, 2048, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 512, 3, 3]).

Changing the value of num_channels fixes this. It was originally [128, 256, 512], while the checkpoint expects [512, 1024, 2048]:

if return_interm_layers:
    # return_layers = {"layer1": "0", "layer2": "1", "layer3": "2", "layer4": "3"}
    return_layers = {"layer2": "0", "layer3": "1", "layer4": "2"}
    self.strides = [8, 16, 32]
    self.num_channels = [512, 1024, 2048]
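Before calling load_state_dict, it can help to list exactly which tensors disagree. The following is a minimal sketch using plain dictionaries of shapes. The function name is hypothetical, and the two dictionaries are filled with shapes copied from the error message. With PyTorch, such dictionaries could be built as {k: tuple(v.shape) for k, v in state_dict.items()} from checkpoint['model'] and model.state_dict().

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for keys whose shapes differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Keys missing from the model are skipped; only shape conflicts are reported
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the error message (illustrative subset)
checkpoint_shapes = {
    "input_proj.0.0.weight": (256, 512, 1, 1),
    "input_proj.1.0.weight": (256, 1024, 1, 1),
}
model_shapes = {
    "input_proj.0.0.weight": (256, 128, 1, 1),
    "input_proj.1.0.weight": (256, 256, 1, 1),
}

for name, (ckpt, cur) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {cur}")
```

The second dimension of each weight is the number of input channels, so a report like this points directly at the backbone's num_channels setting.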

Inference code

The inference code is shown below. Nearly all DETR-family models can share the same inference code.

import json
import os, sys
import torch
import numpy as np

from models import build_DABDETR
from models.dab_deformable_detr import build_dab_deformable_detr
from util.slconfig import SLConfig
from datasets import build_dataset
from util.visualizer import COCOVisualizer
from util import box_ops
model_config_path = "D:/graduate/others/DAB-DETR/config.json" # change the path of the model config file
model_checkpoint_path = "D:/graduate/others/DAB-DETR/checkpoint.pth" # change the path of the model checkpoint
# See our Model Zoo section in README.md for more details about our pretrained models.

args = SLConfig.fromfile(model_config_path)
model, criterion, postprocessors = build_DABDETR(args)
checkpoint = torch.load(model_checkpoint_path, map_location='cpu')
model.load_state_dict(checkpoint['model'])
_ = model.eval()
with open('util/coco_id2name.json') as f:
    id2name = json.load(f)
    id2name = {int(k): v for k, v in id2name.items()}
from PIL import Image
import datasets.transforms as T
image = Image.open("./figure/4.jpg").convert("RGB") # load image
# transform images
transform = T.Compose([
    T.RandomResize([800], max_size=1333),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
image, _ = transform(image, None)
from ptflops import get_model_complexity_info
model = model.to(args.device)
# Complexity is measured for a 3x224x224 input; the count grows with input resolution
flops, params = get_model_complexity_info(model, (3, 224, 224), as_strings=True, print_per_layer_stat=True)
print('flops: ', flops, 'params: ', params)
# predict images
with torch.no_grad():
    output = model.cuda()(image[None].cuda())
# visualize outputs
output = postprocessors['bbox'](output, torch.Tensor([[1.0, 1.0]]).cuda())[0]
threshold = 0.5  # set a score threshold
vslzr = COCOVisualizer()
scores = output['scores']
print(len(scores))
labels = output['labels']
boxes = box_ops.box_xyxy_to_cxcywh(output['boxes'])
select_mask = scores > threshold

box_label = [id2name[int(item)] for item in labels[select_mask]]
pred_dict = {
      'boxes': boxes[select_mask],
      'size': torch.Tensor([image.shape[1], image.shape[2]]),
      'box_label': box_label
}

vslzr.visualize(image, pred_dict, savedir=None, dpi=120)

DN-DETR model

The inference code for DN-DETR is almost the same as for DAB-DETR, but the problems it runs into are different.

Null value problem

indicator0 = torch.zeros([num_queries * num_patterns, 1]).cuda()
TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'

num_patterns is None here; setting num_patterns to 1 fixes it.
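One way to apply this fix without editing the model source is to fill in the default on the config object before building the model. This is only a sketch: the helper name is hypothetical, argparse.Namespace stands in for the loaded config, and num_queries=300 is an illustrative value.

```python
from argparse import Namespace


def ensure_num_patterns(args, default=1):
    """Set args.num_patterns to `default` when it is missing or None."""
    if getattr(args, "num_patterns", None) is None:
        args.num_patterns = default
    return args


# Hypothetical stand-in for the loaded model config
args = ensure_num_patterns(Namespace(num_queries=300, num_patterns=None))
print(args.num_queries * args.num_patterns)  # 300
```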

CPU/GPU device mismatch

boxes = boxes * scale_fct[:, None, :]
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

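Some tensors are on the CPU and others are on the GPU. The fix is to add .cuda() on the line boxes = boxes * scale_fct[:, None, :]. It has to be applied to the operand that is still on the CPU (for example scale_fct.cuda()), because the multiplication itself is what fails.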
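A slightly more portable variant of this fix moves the scale factors to whatever device the boxes are on, instead of hard-coding .cuda(). The sketch below assumes boxes has shape (batch, queries, 4) and scale_fct has shape (batch, 4). The function name and the numbers are illustrative only.

```python
import torch


def scale_boxes(boxes, scale_fct):
    """Multiply boxes by per-image scale factors on the same device as `boxes`."""
    # Move scale_fct to the device that holds boxes (no-op if they already match)
    scale_fct = scale_fct.to(boxes.device)
    return boxes * scale_fct[:, None, :]


device = "cuda" if torch.cuda.is_available() else "cpu"
boxes = torch.rand(1, 5, 4, device=device)                 # (batch, queries, 4)
scale_fct = torch.tensor([[640.0, 480.0, 640.0, 480.0]])   # created on the CPU
scaled = scale_boxes(boxes, scale_fct)
print(scaled.shape, scaled.device)
```

Using .to(boxes.device) keeps the same code working on a CPU-only machine.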

Tuple indexing problem

In addition, an error about indexing a tuple is raised:

TypeError: tuple indices must be integers or slices, not str

Change the following code

out_logits, out_bbox = outputs['pred_logits'], outputs['pred_boxes']

to:

out_logits = outputs[0]['pred_logits']
out_bbox = outputs[0]['pred_boxes']
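If the same post-processing code has to serve models that return a plain dict as well as models that return a tuple whose first element is that dict, a small guard avoids editing it back and forth. This is a sketch with a hypothetical helper name, and strings stand in for the real tensors.

```python
def unwrap_outputs(outputs):
    """Return the prediction dict whether the model returned a dict or a tuple/list."""
    if isinstance(outputs, (tuple, list)):
        outputs = outputs[0]
    return outputs


# Hypothetical outputs; real values would be tensors
as_tuple = ({"pred_logits": "logits", "pred_boxes": "boxes"}, None)
as_dict = {"pred_logits": "logits", "pred_boxes": "boxes"}

for raw in (as_tuple, as_dict):
    out = unwrap_outputs(raw)
    out_logits, out_bbox = out["pred_logits"], out["pred_boxes"]
    print(out_logits, out_bbox)
```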

Parameter count problem

At this point the DN-DETR inference code runs correctly, but computing the parameter count fails:

File "D:\Anaconda\envs\deformable_detr\lib\site-packages\ptflops\pytorch_ops.py", line 162, in multihead_attention_counter_hook
    q, k, v = input
ValueError: not enough values to unpack (expected 3, got 2)

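The error shows that the number of inputs is wrong: the hook received two values instead of the three it expects. Find that line in the ptflops source and change q, k, v = input to:

q, k = input
v = k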

GPU/CPU device mismatch

The same device mismatch is reported here as well; fix it the same way.

 File "E:\graduate\papers\DN-DETR\DN-DETR-main\models\DN_DAB_DETR\DABDETR.py", line 458, in forward
    boxes = boxes * scale_fct[:, None, :]
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

DN-DAB-Deformable-DETR model

Parameter count problem

DN-DAB-Deformable-DETR shares its code with DN-DAB-DETR, so the change made above now causes a problem here.

    q, k= input
ValueError: too many values to unpack (expected 2)

Checking the length of input shows that it holds three values here, so the original statement was correct. Restore it:

q, k, v = input
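Rather than switching the unpacking line back and forth between models, the hook could accept both cases. The sketch below uses a hypothetical helper name and falls back to v = k when only two inputs arrive, which mirrors the workaround above. A likely reason for the two-input case is that the value tensor is passed as a keyword argument, which a standard forward hook does not receive, but that is an assumption, not something confirmed here.

```python
def unpack_attention_inputs(inputs):
    """Return (q, k, v) from a hook input of length 2 or 3; v falls back to k."""
    if len(inputs) == 3:
        q, k, v = inputs
    elif len(inputs) == 2:
        q, k = inputs
        v = k  # assumption: the value has the same shape as the key
    else:
        raise ValueError(f"expected 2 or 3 inputs, got {len(inputs)}")
    return q, k, v


print(unpack_attention_inputs(("q", "k")))       # ('q', 'k', 'k')
print(unpack_attention_inputs(("q", "k", "v")))  # ('q', 'k', 'v')
```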

If a batch-size error is reported, it is easy to fix: since we are only running inference on a single image, set the batch size to 1.

With that, inference and the computation and parameter counting for the DETR-family models are working.

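YOLO model computation

Next comes the YOLO model. The approach is similar, so I first reused the code above directly, but it did not work:
the parameter count was always 0, which puzzled me.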
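When a profiling tool reports 0 parameters, a quick cross-check is to count them directly from the module, independent of any profiler. This sketch uses a small stand-in network; the function name is made up, and the toy model should be replaced with the loaded YOLO model.

```python
import torch.nn as nn


def count_parameters(model):
    """Return (total, trainable) parameter counts of a torch.nn.Module."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable


# Small stand-in model used only for counting; replace with the loaded YOLO model
toy = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.ReLU(), nn.Linear(8, 2))
total, trainable = count_parameters(toy)
print(total, trainable)  # 242 242
```

If this direct count is non-zero while the tool reports 0, the problem lies in how the tool walks the model rather than in the model itself.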


I then switched to a different package, thop.

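import torch
from thop import profile

print('==> Building model..')
# `model` is the network already built by the inference code
dummy_input = torch.randn(1, 3, 224, 224).cuda()
# Note: thop reports the first value as multiply-accumulate operations (MACs)
flops, params = profile(model, (dummy_input,))
print('flops: %.2f M, params: %.2f M' % (flops / 1e6, params / 1e6))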

That works. As with the DETR models, this snippet can be dropped straight into the model's inference code.
