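[TensorFlow Study Notes-04] Convolution functions: tf.nn.conv2d
[Copyright notice]
These TensorFlow study notes draw on:
TensorFlow技术解析与实战, by 李嘉璇
TensorFlow实战, by 黄文坚 and 唐源
TensorFlow实战Google深度学习框架, by 郑泽宇 and 顾思宇
深度学习-Caffe之经典模型详解与实战, by 乐毅 and 王斌
The TensorFlow Chinese community http://www.tensorfly.cn/
TensorFlow官方文档中文版 (the Chinese edition of the official TensorFlow documentation), by 极客学院
The official TensorFlow documentation (English edition)
as well as many CSDN blogs, GitHub repositories and so on. I hope this series does not infringe anyone's copyright! (If it does, please contact me by email: 1511082629@nbu.edu.cn )
Feel free to repost and share; the series is updated from time to time. My own knowledge is limited, so if you find any problems, corrections and criticism are very welcome!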
1. The concept of convolution
2. The convolution function tf.nn.conv2d
def conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None,
           data_format=None, name=None):
    r"""Computes a 2-D convolution given 4-D `input` and `filter` tensors.

    Given an input tensor of shape `[batch, in_height, in_width, in_channels]`
    and a filter / kernel tensor of shape
    `[filter_height, filter_width, in_channels, out_channels]`, this op
    performs the following:

    1. Flattens the filter to a 2-D matrix with shape
       `[filter_height * filter_width * in_channels, output_channels]`.
    2. Extracts image patches from the input tensor to form a *virtual*
       tensor of shape `[batch, out_height, out_width,
       filter_height * filter_width * in_channels]`.
    3. For each patch, right-multiplies the filter matrix and the image patch
       vector.

    In detail, with the default NHWC format,

        output[b, i, j, k] =
            sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
                            filter[di, dj, q, k]

    Must have `strides[0] = strides[3] = 1`.  For the most common case of the same
    horizontal and vertices strides, `strides = [1, stride, stride, 1]`.

    Args:
      input: A `Tensor`. Must be one of the following types: `half`, `float32`.
        A 4-D tensor. The dimension order is interpreted according to the value
        of `data_format`, see below for details.
      filter: A `Tensor`. Must have the same type as `input`.
        A 4-D tensor of shape
        `[filter_height, filter_width, in_channels, out_channels]`
      strides: A list of `ints`.
        1-D tensor of length 4.  The stride of the sliding window for each
        dimension of `input`. The dimension order is determined by the value of
        `data_format`, see below for details.
      padding: A `string` from: `"SAME", "VALID"`.
        The type of padding algorithm to use.
      use_cudnn_on_gpu: An optional `bool`. Defaults to `True`.
      data_format: An optional `string` from: `"NHWC", "NCHW"`. Defaults to `"NHWC"`.
        Specify the data format of the input and output data. With the
        default format "NHWC", the data is stored in the order of:
            [batch, height, width, channels].
        Alternatively, the format could be "NCHW", the data storage order of:
            [batch, channels, height, width].
      name: A name for the operation (optional).

    Returns:
      A `Tensor`. Has the same type as `input`.
      A 4-D tensor. The dimension order is determined by the value of
      `data_format`, see below for details.
    """
    result = _op_def_lib.apply_op("Conv2D", input=input, filter=filter,
                                  strides=strides, padding=padding,
                                  use_cudnn_on_gpu=use_cudnn_on_gpu,
                                  data_format=data_format, name=name)
    return result
_conv2d_backprop_filter_outputs = ["output"]
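To make the formula in the docstring concrete, here is a minimal sketch that evaluates it directly with nested Python lists. It is not how TensorFlow implements the op: the function name conv2d_valid_nhwc is made up for this note, and the sketch assumes the NHWC layout and no padding (the "VALID" behaviour).

```python
def conv2d_valid_nhwc(inp, filt, strides):
    """Naive evaluation of the NHWC formula from the docstring, without padding.

    inp:  nested list of shape [batch, in_height, in_width, in_channels]
    filt: nested list of shape [filter_height, filter_width, in_channels, out_channels]
    (Hypothetical helper for illustration, not part of TensorFlow.)
    """
    if strides[0] != 1 or strides[3] != 1:
        raise ValueError("strides[0] and strides[3] must be 1")
    batch, in_h, in_w, in_c = len(inp), len(inp[0]), len(inp[0][0]), len(inp[0][0][0])
    f_h, f_w, f_in_c, out_c = len(filt), len(filt[0]), len(filt[0][0]), len(filt[0][0][0])
    if f_in_c != in_c:
        raise ValueError("filter in_channels must equal input in_channels")
    s_h, s_w = strides[1], strides[2]
    # Number of window positions that fit entirely inside the image.
    out_h = max((in_h - f_h) // s_h + 1, 0)
    out_w = max((in_w - f_w) // s_w + 1, 0)
    output = [[[[0.0] * out_c for _ in range(out_w)]
               for _ in range(out_h)] for _ in range(batch)]
    for b in range(batch):
        for i in range(out_h):
            for j in range(out_w):
                for k in range(out_c):
                    total = 0.0
                    # sum over di, dj, q exactly as in the docstring
                    for di in range(f_h):
                        for dj in range(f_w):
                            for q in range(in_c):
                                total += (inp[b][s_h * i + di][s_w * j + dj][q]
                                          * filt[di][dj][q][k])
                    output[b][i][j][k] = total
    return output


# One single-channel 5*5 image holding the values 0..24, shape [1, 5, 5, 1].
image = [[[[float(r * 5 + c)] for c in range(5)] for r in range(5)]]
# One 3*3 kernel of ones, shape [3, 3, 1, 1].
kernel = [[[[1.0]] for _ in range(3)] for _ in range(3)]
result = conv2d_valid_nhwc(image, kernel, [1, 1, 1, 1])
print(len(result[0]), len(result[0][0]))  # 3 3
print(result[0][0][0][0])                 # 54.0, the sum of the top-left 3*3 patch
```

With a kernel of ones, each output value is simply the sum of the 3*3 patch under the kernel, which makes the result easy to check by hand.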
The conv2d source shows that the function takes 7 parameters. Each one is analyzed in detail below:
First parameter: input
input: A `Tensor`. Must be one of the following types: `half`, `float32`.
A 4-D tensor. The dimension order is interpreted according to the value
of `data_format`, see below for details.
input tensor of shape `[batch, in_height, in_width, in_channels]`
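From the source description above, input is the image to be convolved. It must be given as a 4-D Tensor of type half (half-precision floating point, i.e. float16) or float32, with shape [batch, in_height, in_width, in_channels], meaning [number of images in one training batch, image height, image width, number of image channels].
Second parameter: filter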
filter: A `Tensor`. Must have the same type as `input`.
A 4-D tensor of shape
`[filter_height, filter_width, in_channels, out_channels]`
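From the source description above, filter is the convolution kernel. It must be a 4-D Tensor of the same type as input, with shape [filter_height, filter_width, in_channels, out_channels], meaning [kernel height, kernel width, number of image channels, number of kernels]. The in_channels here is the number of channels of the image in input; the two values must be equal.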
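The relationship between the two shapes can be sketched as a small check. The helpers nested_shape and check_conv2d_shapes are hypothetical names used only for this note; real code would read the shape from the Tensor itself.

```python
def nested_shape(x):
    """Shape of a regular (non-ragged) nested list, e.g. [2, 5, 5, 3]."""
    shape = []
    while isinstance(x, list) and len(x) > 0:
        shape.append(len(x))
        x = x[0]
    return shape


def check_conv2d_shapes(input_shape, filter_shape):
    """Check that a filter fits an NHWC input; return out_channels."""
    if len(input_shape) != 4 or len(filter_shape) != 4:
        raise ValueError("input and filter must both be 4-D")
    # input is [batch, in_height, in_width, in_channels],
    # filter is [filter_height, filter_width, in_channels, out_channels].
    if input_shape[3] != filter_shape[2]:
        raise ValueError("in_channels of input and filter differ")
    return filter_shape[3]


# A batch of 2 three-channel 5*5 images and 8 kernels of size 3*3.
batch_images = [[[[0.0] * 3 for _ in range(5)] for _ in range(5)] for _ in range(2)]
kernels = [[[[0.0] * 8 for _ in range(3)] for _ in range(3)] for _ in range(3)]
print(nested_shape(batch_images))  # [2, 5, 5, 3]
print(nested_shape(kernels))       # [3, 3, 3, 8]
print(check_conv2d_shapes(nested_shape(batch_images), nested_shape(kernels)))  # 8
```

The returned out_channels is the number of kernels, which is also the number of channels of the resulting feature map.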
Third parameter: strides
strides: A list of `ints`.
1-D tensor of length 4. The stride of the sliding window for each
dimension of `input`. The dimension order is determined by the value of
`data_format`, see below for details.
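From the source description above, strides is the step size of the convolution along each dimension of the image. It is a 1-D vector of length 4. As the docstring notes, strides[0] and strides[3] must be 1, so the common form is [1, stride, stride, 1].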
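As a rough illustration of what the stride does, the sketch below lists where the sliding window starts along one dimension when no padding is applied. Both helper names are made up for this note, and the NHWC order is assumed.

```python
def make_strides(stride_h, stride_w):
    """Build the length-4 strides list for NHWC: [1, stride_h, stride_w, 1]."""
    return [1, stride_h, stride_w, 1]


def window_starts(in_size, filter_size, stride):
    """Start offsets of the sliding window along one dimension (no padding)."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    count = (in_size - filter_size) // stride + 1
    return [stride * i for i in range(max(count, 0))]


print(make_strides(2, 2))      # [1, 2, 2, 1]
print(window_starts(5, 3, 1))  # [0, 1, 2]
print(window_starts(5, 3, 2))  # [0, 2]
```

A larger stride visits fewer positions, so the output feature map gets smaller.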
Fourth parameter: padding
padding: A `string` from: `"SAME", "VALID"`.
The type of padding algorithm to use.
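From the source description above, padding is a string that must be either "SAME" or "VALID"; it selects between two convolution modes. Both are introduced below using a single-channel 5*5 image and a 3*3 kernel.
"VALID" convolution mode
Consider the positions visited by the center of the kernel (the kernel here is 3*3). The kernel must lie entirely inside the image, so its center never reaches the outermost ring of pixels.
In the grid below, # marks the positions the kernel center slides over and . marks the pixels it never reaches. The result is a 3*3 image.
.....
.###.
.###.
.###.
.....
"SAME" convolution mode
For the same image, every pixel serves as the kernel center, so the result is 5*5 (with a stride of 1).
Put simply: first pad one ring of zeros around the original image so that its first pixel can be the kernel center; if one ring of zeros is not enough, pad another ring.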
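The output sizes of the two modes can be sketched with a little arithmetic. The formulas below follow the padding rules as described in the TensorFlow documentation, to the best of my understanding; the function names are made up for this note, and the sketch works on one spatial dimension at a time.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def conv_output_size(in_size, filter_size, stride, padding):
    """Output length along one spatial dimension."""
    if padding == "VALID":
        return max(ceil_div(in_size - filter_size + 1, stride), 0)
    if padding == "SAME":
        return ceil_div(in_size, stride)
    raise ValueError('padding must be "SAME" or "VALID"')


def same_padding(in_size, filter_size, stride):
    """Zeros added (before, after) along one dimension for "SAME"."""
    out_size = ceil_div(in_size, stride)
    total = max((out_size - 1) * stride + filter_size - in_size, 0)
    before = total // 2
    return before, total - before  # any odd extra zero goes at the end


print(conv_output_size(5, 3, 1, "VALID"))  # 3
print(conv_output_size(5, 3, 1, "SAME"))   # 5
print(same_padding(5, 3, 1))               # (1, 1): one ring of zeros
print(same_padding(5, 5, 1))               # (2, 2): two rings of zeros
```

For the 5*5 image and 3*3 kernel used above, this gives 3 for "VALID" and 5 for "SAME", matching the two cases. A 5*5 kernel needs two rings of zeros, which is the "pad another ring" situation.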
Fifth parameter: use_cudnn_on_gpu
use_cudnn_on_gpu: An optional `bool`. Defaults to `True`.
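From the source description above, use_cudnn_on_gpu selects whether the cuDNN library is used to accelerate the computation when the op runs on a GPU. It defaults to True.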
Sixth parameter: data_format
data_format: An optional `string` from: `"NHWC", "NCHW"`. Defaults to `"NHWC"`.
Specify the data format of the input and output data. With the
default format "NHWC", the data is stored in the order of:
[batch, height, width, channels].
Alternatively, the format could be "NCHW", the data storage order of:
[batch, channels, height, width].
From the source description above, data_format is the dimension order of the input and output Tensors. The default, NHWC, is usually fine.
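The difference between the two layouts is only the order of the dimensions. A minimal sketch with nested lists (the helper name nhwc_to_nchw is made up for this note):

```python
def nhwc_to_nchw(x):
    """Reorder [batch, height, width, channels] into [batch, channels, height, width]."""
    batch, height, width, channels = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    return [[[[x[b][h][w][c] for w in range(width)]
              for h in range(height)]
             for c in range(channels)]
            for b in range(batch)]


# One 2*2 image with 3 channels, shape [1, 2, 2, 3].
nhwc = [[[[1, 2, 3], [4, 5, 6]],
         [[7, 8, 9], [10, 11, 12]]]]
nchw = nhwc_to_nchw(nhwc)
print(nchw[0][0])  # [[1, 4], [7, 10]], the first channel as a 2*2 plane
```

In NHWC the channel values of one pixel sit next to each other; in NCHW each channel is stored as a complete height*width plane.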
Seventh parameter: name
name: A name for the operation (optional).
It simply specifies a name for the operation, nothing more.
Return value:
Returns:
A `Tensor`. Has the same type as `input`.
A 4-D tensor. The dimension order is determined by the value of
`data_format`, see below for details.
Returns the feature map produced by the convolution.