Deep learning notes: limiting GPU memory usage with TensorFlow on Windows (fixing GPU out-of-memory crashes)


Big_quant · published 2018-07-05 08:35:13

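Preface

I recently ran a large model for a deep learning project and the GPU ran out of memory. I searched online for a fix and am sharing my notes here.
By default, Keras (on the TensorFlow backend) claims nearly all of the GPU memory. If several models need to run on the GPU at the same time, this is a serious limitation, and it also wastes GPU resources. So when using Keras, it pays to set explicitly which GPU to run on and how much memory to use.

There are three cases:

Select a GPU
Limit GPU memory usage
Select a GPU and limit GPU memory usage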

Command for checking GPU usage (Linux)

# Refresh once per second
watch -n 1 nvidia-smi
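
The same information can be read from a script. The sketch below is only an illustration: the function names `parse_gpu_memory` and `query_gpu_memory` are made up for this example, and the sample CSV text is hypothetical. It relies on the `--query-gpu` and `--format=csv,noheader,nounits` options of `nvidia-smi`. On Windows, where `watch` is not available, `nvidia-smi -l 1` gives a similar view that refreshes every second.

```python
import subprocess

# nvidia-smi options that print one CSV line per GPU, with values in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse lines such as '0, 1024, 8192' into a list of dicts (values in MiB)."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        gpus.append({"index": int(index), "used_mib": int(used), "total_mib": int(total)})
    return gpus


def query_gpu_memory():
    """Run nvidia-smi once and parse its output (needs an NVIDIA driver installed)."""
    output = subprocess.check_output(QUERY, universal_newlines=True)
    return parse_gpu_memory(output)


# Hypothetical sample output, so the parser can be tried without a GPU.
sample = "0, 1024, 8192\n1, 0, 8192\n"
for gpu in parse_gpu_memory(sample):
    print(gpu["index"], gpu["used_mib"], gpu["total_mib"])
```

Calling `query_gpu_memory()` before and after building a model is a quick way to see how much memory the process actually took.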

I. Select a GPU

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2"

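This selects the GPU numbered 2; choose whichever GPU fits your needs and your machine. Set the variable before TensorFlow initializes the GPU, otherwise it has no effect.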
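
A minimal sketch of wrapping the environment variable in a helper, assuming it is called before TensorFlow or Keras is imported. The name `select_gpus` is hypothetical and not part of any library.

```python
import os


def select_gpus(*gpu_ids):
    """Expose only the given GPU indices to CUDA and return the value that was set."""
    value = ",".join(str(gpu_id) for gpu_id in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Make GPUs 0 and 2 visible to this process.
print(select_gpus(0, 2))  # 0,2
```

Inside the process the visible devices are renumbered from 0, so with the call above physical GPU 2 shows up as device 1. Calling `select_gpus()` with no arguments sets an empty string, which should hide all GPUs and force CPU execution.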

II. Limit GPU memory usage

1. Set the fraction of GPU memory to use

import tensorflow as tf
import keras.backend.tensorflow_backend as KTF

# Configure the session to use 30% of the GPU memory

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3
session = tf.Session(config=config)

# Register the session with Keras

KTF.set_session(session)

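Note that nvidia-smi may still report usage somewhat above this threshold, because the CUDA context and libraries take memory outside TensorFlow's own pool. The TensorFlow guide describes this option as a way to bound the memory available to the process, so a model that needs more than the cap will generally fail with an out-of-memory error rather than quietly grow past it. The cap is mainly useful for avoiding wasted GPU memory when running small models or small datasets.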
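
To pick a value for `per_process_gpu_memory_fraction`, it can help to start from a memory budget in MiB. The helper below is a sketch for illustration only (`fraction_for_budget` is a hypothetical name), and it assumes the fraction is applied to the total memory of each visible GPU.

```python
def fraction_for_budget(budget_mib, total_mib):
    """Return the memory fraction that corresponds to a budget on a card of given size."""
    if total_mib <= 0:
        raise ValueError("total_mib must be positive")
    if not 0 < budget_mib <= total_mib:
        raise ValueError("budget_mib must be in (0, total_mib]")
    return budget_mib / total_mib


# Example: give each process 2048 MiB on a card with 8192 MiB.
print(fraction_for_budget(2048, 8192))  # 0.25
```

The result would then be assigned to `config.gpu_options.per_process_gpu_memory_fraction`. Leaving some headroom is sensible, since the process also uses memory outside TensorFlow's pool.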

2. Allocate GPU memory on demand

import tensorflow as tf
import keras.backend.tensorflow_backend as KTF

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate on demand instead of claiming all memory
session = tf.Session(config=config)

# Register the session with Keras
KTF.set_session(session)
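
The snippets in this post use the TensorFlow 1.x session API. As an assumption beyond the original post, the sketch below shows how the same on-demand behavior is usually requested in TensorFlow 2.x through `tf.config.experimental.set_memory_growth`. The wrapper name `enable_memory_growth` is hypothetical, and returning 0 when TensorFlow is missing is a convenience of this sketch only.

```python
def enable_memory_growth():
    """Request on-demand GPU memory allocation; return the number of GPUs configured."""
    try:
        import tensorflow as tf
    except ImportError:
        return 0  # TensorFlow is not installed, so there is nothing to configure

    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # This has to run before any operation initializes the GPU,
        # otherwise TensorFlow raises a RuntimeError.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


print(enable_memory_growth())
```

Call it once at the top of the script, before building any model.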

III. Select a GPU and limit GPU memory usage

This one is simple: just combine the two approaches above.

import os
import tensorflow as tf
import keras.backend.tensorflow_backend as KTF

# Make only the first GPU visible
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate on demand instead of claiming all memory
sess = tf.Session(config=config)

# Register the session with Keras
KTF.set_session(sess)

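By default, Keras claims nearly all of the GPU memory. On a machine with a single GPU this hardly matters, but on a server with many powerful GPUs, filling all of them is wasteful.
When Keras keeps filling the GPU memory, the usage can be adjusted by reconfiguring the backend session: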

import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3
set_session(tf.Session(config=config))

The two settings above can simply be combined:

import os
import tensorflow as tf
os.environ["CUDA_VISIBLE_DEVICES"] = "2"
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3
set_session(tf.Session(config=config))

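The GPU can also be selected from the command line when launching the program (Linux shell syntax):

CUDA_VISIBLE_DEVICES=0 python -m nmt.nmt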

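This problem has piled up many reports on GitHub. It appears to be specific to Windows and related to the amount of GPU memory. Eventually one commenter, something of a savior, posted a summary along with a workaround: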

    Here is a bit more info on how I temporarily resolved it. I believe these issues are all related to GPU memory allocation and have nothing to do with the errors being reported. There were other errors before this indicating some sort of memory allocation problem but the program continued to progress, eventually giving the cudnn errors that everyone is getting. The reason I believe it works sometimes is that if you use the gpu for other things besides tensorflow such as your primary display, the available memory fluctuates. Sometimes you can allocate what you need and other times it can’t.

    From the API
    https://www.tensorflow.org/versions/r0.12/how_tos/using_gpu/

    “By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to CUDA_VISIBLE_DEVICES) visible to the process. This is done to more efficiently use the relatively precious GPU memory resources on the devices by reducing memory fragmentation.”

    I think this default allocation is broken in some way that causes this erratic behavior and certain situations to work and others to fail.

    I have resolved this issue by changing the default behavior of TF to allocate a minimum amount of memory and grow as needed as detailed in the webpage.

        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        session = tf.Session(config=config, …)

    I have also tried the alternate way and was able to get it to work and fail with experimentally choosing a percentage that worked. In my case it ended up being about .7.

        config = tf.ConfigProto()
        config.gpu_options.per_process_gpu_memory_fraction = 0.4
        session = tf.Session(config=config, …)

    Still no word from anyone on the TF team confirming this but it is worth a shot to see if others can confirm similar behavior.

Q&A

If you have further questions, follow the author's WeChat official account; the author will reply within 24 hours.