PyTorch runtime error: working through "CUDA out of memory"

王大渣

Published 2020-03-31 17:03:13

1 Initial error

CUDA out of memory. Tried to allocate 244.00 MiB (GPU 0; 2.00 GiB total capacity; 1.12 GiB already allocated; 25.96 MiB free; 1.33 GiB reserved in total by PyTorch)

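PyTorch needed to allocate 244 MiB, but only 25.96 MiB was free. 1.33 GiB was already reserved by PyTorch's caching allocator, of which 1.12 GiB held live tensors. (I wasn't sure whether that reserved memory could be handed back to CUDA.)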
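To make the arithmetic in the message explicit, here is a small sketch that parses the error text and works out the gaps. The helper name `parse_oom_message` and the key names are made up for illustration; only the message string comes from the error above.

```python
import re

MESSAGE = (
    "CUDA out of memory. Tried to allocate 244.00 MiB (GPU 0; 2.00 GiB total "
    "capacity; 1.12 GiB already allocated; 25.96 MiB free; 1.33 GiB reserved "
    "in total by PyTorch)"
)

MIB_PER_UNIT = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}
SIZE = r"(\d+(?:\.\d+)?) (KiB|MiB|GiB)"
LABELS = r" (total capacity|already allocated|free|reserved)"
KEYS = {
    "total capacity": "total",
    "already allocated": "allocated",
    "free": "free",
    "reserved": "reserved",
}


def parse_oom_message(message):
    """Return the sizes named in the message, all converted to MiB."""
    stats = {}
    requested = re.search(r"Tried to allocate " + SIZE, message)
    stats["requested"] = float(requested.group(1)) * MIB_PER_UNIT[requested.group(2)]
    for value, unit, label in re.findall(SIZE + LABELS, message):
        stats[KEYS[label]] = float(value) * MIB_PER_UNIT[unit]
    return stats


stats = parse_oom_message(MESSAGE)
# Cached by PyTorch but not backing any live tensor.
cached_unused = stats["reserved"] - stats["allocated"]
# How much more free memory the failed allocation would have needed.
shortfall = stats["requested"] - stats["free"]
# Memory that is neither reserved by PyTorch nor free.
outside_pytorch = stats["total"] - stats["reserved"] - stats["free"]

print(f"cached but unused: {cached_unused:.2f} MiB")
print(f"shortfall:         {shortfall:.2f} MiB")
print(f"outside PyTorch:   {outside_pytorch:.2f} MiB")
```

On these numbers, roughly 215 MiB is cached but unused, and about 660 MiB of the 2 GiB card is neither free nor reserved by PyTorch. My assumption is that the latter goes to the CUDA context, the display, and other processes, but the message itself does not say.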

2 Code involved in the error

result, result_bb = run_meta_tracker(seq, img_list, init_bbox, gt=gt, savefig_dir='', display=args.display)

bbreg_feats = forward_samples(init_net, image1, bbreg_examples, out_layer='features')

feat = net(regions, out_layer=out_layer)

result = self.forward(*input, **kwargs)

x = self.features(x)

result = self.forward(*input, **kwargs)

input = module(input)

x_sq = (x**2).unsqueeze(dim=1)

3 Solutions

3.1 Trying a method suggested online

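Link: "PyTorch raises RuntimeError: CUDA out of memory" (pursuit_zhangyu's blog, CSDN)

Explanation: inside the no_grad context manager, PyTorch only performs the computation and does not record the computation graph, so intermediate results are not kept around for a backward pass.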
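A minimal sketch of what wrapping a forward pass in `torch.no_grad()` looks like. The tiny network and the tensor shape are stand-ins I made up; the names `net` and `regions` only echo the traceback in section 2, and this is not the tracker's actual code.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the real feature extractor.
net = nn.Sequential(nn.Conv2d(3, 16, kernel_size=3), nn.ReLU()).to(device).eval()
regions = torch.randn(4, 3, 32, 32, device=device)

# Inside this block autograd records nothing, so the activations needed
# only for backward() are freed as soon as each layer finishes.
with torch.no_grad():
    feat = net(regions)

print(feat.shape, feat.requires_grad)
```

This only saves the memory autograd would have held. The weights and the activations of the layer currently running still have to fit on the GPU.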

After the change: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 2.00 GiB total capacity; 1.31 GiB already allocated; 7.96 MiB free; 1.34 GiB reserved in total by PyTorch)

Not solved (read: totally useless).

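3.2 Trying fewer input images

I switched to a dataset with fewer images, but memory still ran out. Peak GPU memory depends on how many samples go through the network in one forward pass, not on how many images the dataset holds, so my guess is that the model itself is simply large. The original post used a GTX 1070 and I have a 960M, so I am hopelessly outgunned.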
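Since the per-pass batch is what matters, one thing that can be tried is pushing the candidate regions through the network in smaller chunks. The sketch below assumes the samples can be processed independently; `forward_in_chunks`, the toy network, and the chunk size are all hypothetical and not taken from the tracker's code.

```python
import torch
import torch.nn as nn


def forward_in_chunks(net, regions, device, chunk_size=32):
    """Run net over regions a chunk at a time and gather results on the CPU."""
    feats = []
    with torch.no_grad():
        for start in range(0, regions.size(0), chunk_size):
            # Only this chunk and its activations live on the GPU at once.
            chunk = regions[start:start + chunk_size].to(device)
            feats.append(net(chunk).cpu())
    return torch.cat(feats, dim=0)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3), nn.ReLU(), nn.Flatten()
).to(device).eval()

regions = torch.randn(100, 3, 32, 32)  # kept on the CPU until needed
feats = forward_in_chunks(net, regions, device, chunk_size=32)
print(feats.shape)
```

A smaller `chunk_size` lowers peak memory at the cost of more kernel launches. It cannot help if a single sample already exceeds the free memory.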

3.3 Using CUDA_VISIBLE_DEVICES to restrict which GPU is used

Link: "Plenty of GPU memory, yet CUDA error: out of memory" (Jisongxie, cnblogs)

Not solved (read: totally useless).
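For reference, this is roughly how the variable is set from Python. It has to be set before the first CUDA call, and the safest place is before `torch` is imported at all. The device index "0" is an assumption for a single-GPU laptop.

```python
import os

# Must run before torch initializes CUDA; easiest is before "import torch".
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print(f"This process will see {len(visible)} GPU(s): {visible}")

# import torch  # import only after the variable is set
```

The variable only selects which physical GPUs the process can see. It does not add memory, which is probably why it made no difference on a machine whose only GPU has 2 GiB.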

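3.4 Optimizing the code

(I didn't write this code myself, so this one is tough.)

3.5 Shrinking the images

At this point I'm fairly sure it has little to do with the size of the dataset; the model is simply too big.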

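3.6 Lowering the numeric precision, e.g. float32 to float16

This should help, since a float16 value takes half the bytes of a float32 one, but I haven't found a convenient way to do it yet.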
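One way this could look is `nn.Module.half()`, which casts a model's floating-point parameters to float16. The sketch below uses a made-up two-layer network, not the tracker's model, and only runs the half-precision forward pass when a GPU is present, because float16 kernels on the CPU are not guaranteed in every PyTorch version.

```python
import torch
import torch.nn as nn


def parameter_bytes(model):
    """Total bytes taken by the model's parameters."""
    return sum(p.numel() * p.element_size() for p in model.parameters())


# Hypothetical stand-in network.
model = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=7, stride=2),
    nn.ReLU(),
    nn.Conv2d(96, 256, kernel_size=5, stride=2),
).eval()

fp32_bytes = parameter_bytes(model)
model = model.half()  # cast parameters to float16
fp16_bytes = parameter_bytes(model)
print(f"float32: {fp32_bytes} bytes, float16: {fp16_bytes} bytes")

if torch.cuda.is_available():
    model = model.cuda()
    # Inputs must match the parameter dtype.
    x = torch.randn(8, 3, 107, 107, device="cuda").half()
    with torch.no_grad():
        y = model(x)
    print(y.dtype, y.shape)
```

float16 has a much narrower range than float32, so results can differ or overflow, and whether a given model tolerates it has to be checked case by case. Newer PyTorch releases (1.6 and later) also ship automatic mixed precision in `torch.cuda.amp`, which may be the more convenient route.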

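3.7 Using torch.cuda.empty_cache() to release unused GPU memory

It helps, but not enough. Free memory rose from 25.96 MiB to 191.96 MiB, still short of the 244 MiB requested. Memory in use by live tensors stayed at 1.12 GiB, because empty_cache() only returns cached blocks that nothing is using: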

CUDA out of memory. Tried to allocate 244.00 MiB (GPU 0; 2.00 GiB total capacity; 1.12 GiB already allocated; 191.96 MiB free; 1.16 GiB reserved in total by PyTorch)
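A sketch of how the call is typically used: drop the references first, then empty the cache. The helper `release_cached_memory` and the scratch tensor are made up for illustration.

```python
import gc

import torch


def release_cached_memory():
    """Collect garbage, return cached GPU blocks, report (allocated, reserved) bytes."""
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        return torch.cuda.memory_allocated(), torch.cuda.memory_reserved()
    return 0, 0  # nothing to release without a GPU


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

scratch = torch.empty(1024, 1024, device=device)  # 4 MiB of float32
del scratch  # the tensor must be unreferenced before its block can be returned

allocated, reserved = release_cached_memory()
print(f"allocated: {allocated} bytes, reserved: {reserved} bytes")
```

Tensors that are still referenced are untouched by `empty_cache()`, which matches the unchanged "already allocated" figure in the message above.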

Finally!

The fix is to get a better graphics card. Code optimization is just circus tricks.