InceptionV3 Network Architecture Explained (Implemented in TensorFlow 2.6.0)
1. Paper download link
https://arxiv.org/abs/1512.00567
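2. Architecture table
3. Three improved Inception modules
The original InceptionV1 module:
For the parameter counts and computational cost of the module below, see this article on InceptionV1: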
https://mydreamambitious.blog.csdn.net/article/details/124237000
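(1) Improved Inception module 1:
Note: the 5x5 convolution in the module above can be replaced by two stacked 3x3 convolutions, which substantially reduces the parameter count.
Why can a 5x5 convolution be replaced by two 3x3 convolutions?
Mainly because the receptive field is the same while the parameter count is lower. In the same way, a 7x7 convolution can be replaced by three 3x3 convolutions.
Parameter and compute comparison for a 5x5 receptive field, assuming a W x H feature map with C input channels and C output channels (bias ignored).
(1) Using a single 5x5 kernel directly:
Parameters: (5x5xC)xC = 25C^2;
Compute: (WxHxC)x(5x5xC) = 25WHC^2
(2) Using two 3x3 convolutions in place of the 5x5 convolution:
Parameters: 2x(3x3xC)xC = 18C^2;
Compute: 2x(WxHxC)x(3x3xC) = 18WHC^2
The comparison shows that the two 3x3 convolutions need 28% fewer parameters and operations than the single 5x5 convolution (18 versus 25).
Parameter and compute comparison for a 7x7 receptive field, under the same assumptions.
(3) Using a single 7x7 kernel directly:
Parameters: (7x7xC)xC = 49C^2;
Compute: (WxHxC)x(7x7xC) = 49WHC^2
(4) Using three 3x3 convolutions in place of the 7x7 convolution:
Parameters: 3x(3x3xC)xC = 27C^2;
Compute: 3x(WxHxC)x(3x3xC) = 27WHC^2
The comparison shows that the three 3x3 convolutions cut the parameters and operations by nearly half relative to the single 7x7 convolution (27 versus 49, about 45% fewer).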
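The counts above can be checked with a few lines of arithmetic. This is only a minimal sketch: the helper names `conv_params` and `stacked_3x3_params` and the channel count C = 64 are illustrative assumptions, and bias terms are ignored as in the text.

```python
def conv_params(kernel_h, kernel_w, c_in, c_out):
    """Number of weights in one convolution layer (bias ignored)."""
    return kernel_h * kernel_w * c_in * c_out


def stacked_3x3_params(num_layers, channels):
    """Weights in a stack of 3x3 convolutions that keep the channel count fixed."""
    return num_layers * conv_params(3, 3, channels, channels)


C = 64  # hypothetical channel count, chosen only for illustration

# (large kernel size, number of 3x3 layers with the same receptive field)
for kernel, num_layers in [(5, 2), (7, 3)]:
    direct = conv_params(kernel, kernel, C, C)
    stacked = stacked_3x3_params(num_layers, C)
    print(f"{kernel}x{kernel}: {direct} weights; "
          f"{num_layers} stacked 3x3: {stacked} weights; "
          f"ratio {stacked / direct:.2f}")
```

Multiplying each count by W x H gives the corresponding compute figures, so the ratios are the same for parameters and operations.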
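Note: in a convolutional neural network, the receptive field of a layer is the size of the region in the input image that one pixel of that layer's output feature map maps back to.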
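The receptive-field equivalence can be verified with the standard recurrence for stacked layers. A minimal sketch, assuming each layer is described by a hypothetical (kernel_size, stride) tuple; the function name `receptive_field` is not from the original post.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of conv/pool layers.

    layers: list of (kernel_size, stride) tuples, first layer first.
    """
    rf = 1    # one output pixel sees one input pixel before any layer
    jump = 1  # distance in input pixels between neighbouring units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


print(receptive_field([(5, 1)]))                  # one 5x5 convolution
print(receptive_field([(3, 1), (3, 1)]))          # two stacked 3x3 convolutions
print(receptive_field([(7, 1)]))                  # one 7x7 convolution
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # three stacked 3x3 convolutions
```

With stride 1 each 3x3 layer adds 2 pixels to the receptive field, which is why two layers match a 5x5 kernel and three layers match a 7x7 kernel.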
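(2) Improved Inception module 2:
Each 3x3 convolution is replaced by a 1x3 convolution followed by a 3x1 convolution; a 5x5 convolution is therefore replaced by two such 1x3, 3x1 pairs.
(5) Using two 3x3 convolutions in place of the 5x5 convolution:
Parameters: 2x(3x3xC)xC = 18C^2;
Compute: 2x(WxHxC)x(3x3xC) = 18WHC^2
(6) As in the figure above, using two 1x3, 3x1 pairs in place of the 5x5 convolution:
Parameters: (1x3xC)xC+(3x1xC)xC+(1x3xC)xC+(3x1xC)xC = 12C^2;
Compute: 2x[(WxHxC)x(1x3xC)+(WxHxC)x(3x1xC)] = 12WHC^2
(7) Using one 1x3, 3x1 pair in place of a 3x3 convolution:
Parameters: (1x3xC)xC+(3x1xC)xC = 6C^2;
Compute: [(WxHxC)x(1x3xC)+(WxHxC)x(3x1xC)] = 6WHC^2
The comparison shows that the asymmetric factorization needs a third fewer parameters and operations than the 3x3 version: 12 versus 18 for the 5x5 replacement, and 6 versus 9 for a single 3x3.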
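The same arithmetic generalizes to any n x n kernel split into a 1 x n convolution followed by an n x 1 convolution. The sketch below is illustrative only: the function names and C = 64 are assumptions. It includes n = 7 because the implementation later in this post uses 1x7 and 7x1 kernels.

```python
def square_conv_weights(n, channels):
    """Weights of one n x n convolution with equal input/output channels."""
    return n * n * channels * channels


def asymmetric_pair_weights(n, channels):
    """Weights of a 1 x n convolution followed by an n x 1 convolution."""
    return (1 * n + n * 1) * channels * channels


C = 64  # hypothetical channel count
for n in (3, 7):
    full = square_conv_weights(n, C)
    pair = asymmetric_pair_weights(n, C)
    print(f"{n}x{n}: {full} weights; 1x{n} + {n}x1: {pair} weights; "
          f"saving {1 - pair / full:.0%}")
```

The pair costs 2n C^2 against n^2 C^2, so the relative saving grows with n.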
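(3) Improved Inception module 3:
This structure is mainly used to expand the number of channels, so it is placed last, after all the other Inception modules.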
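4. Ways to shrink the feature map while increasing the channel count
Approach 1: pool first, then expand the channels. Pooling discards a lot of information, so the feature maps produced afterwards carry fewer features from the image. This violates Principle 1.
Approach 2: expand the channels first, then pool. The expensive convolution runs on the full-size grid, so the computation is roughly four times that of Approach 1 (an increase of three times), which makes training costly.
After the improvement:
Improved scheme: downsample and expand the channels at the same time, by running stride-2 convolution branches and a pooling branch in parallel and concatenating their outputs. This keeps the computation efficient.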
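The cost gap between the two naive approaches can be made concrete. This is a rough sketch under stated assumptions: the channel expansion is modelled as a 1x1 convolution from k to 2k channels, cost is counted as multiply-accumulates (positions x input channels x output channels), and d = 32, k = 64 are hypothetical values chosen so the grid halves exactly.

```python
def conv1x1_cost(grid, c_in, c_out):
    """Multiply-accumulates of a 1x1 convolution on a grid x grid feature map."""
    return grid * grid * c_in * c_out


d, k = 32, 64  # hypothetical grid size and channel count

# Approach 2: expand channels on the d x d grid, then pool down to d/2 x d/2.
expand_then_pool = conv1x1_cost(d, k, 2 * k)

# Approach 1: pool down to d/2 x d/2 first, then expand channels.
pool_then_expand = conv1x1_cost(d // 2, k, 2 * k)

print(expand_then_pool, pool_then_expand, expand_then_pool / pool_then_expand)
```

Halving the grid side cuts the number of positions by four, which is where the factor between the two approaches comes from. Pooling itself is treated as free in this count.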
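5. Four principles
(1) Principle 1:
Avoid excessive dimensionality reduction and representational bottlenecks, especially in the shallow layers, because reducing too aggressively there discards too much information. Dimensionality reduction loses the correlation information between channels and keeps only a dense, compressed embedding.
(2) Principle 2:
The more features a layer has, the faster the network converges; more mutually independent features mean the input information is decomposed more thoroughly.
(3) Principle 3:
A 1x1 convolution can be used to reduce dimensionality before large kernels such as 3x3 and 5x5. Large kernels aggregate spatial information over a large receptive field. Placing a 1x1 convolution in front of them lowers both the computation and the parameter count, and because neighboring units are strongly correlated, very little information is lost in the reduction.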
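A quick count shows how much a 1x1 reduction saves in front of a large kernel. The sketch below is illustrative: it assumes 192 input channels, a reduction to 48 channels and 64 output channels (the values the first module in the implementation section happens to use for its 5x5 branch), and the helper name `conv_weights` is hypothetical.

```python
def conv_weights(kernel, c_in, c_out):
    """Weights of one kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


c_in, c_mid, c_out = 192, 48, 64  # assumed channel counts for illustration

# 5x5 convolution applied directly to all input channels.
direct = conv_weights(5, c_in, c_out)

# 1x1 reduction to c_mid channels, followed by the 5x5 convolution.
bottleneck = conv_weights(1, c_in, c_mid) + conv_weights(5, c_mid, c_out)

print(direct, bottleneck, f"{bottleneck / direct:.2f}")
```

Under these assumptions the reduced branch needs a little over a quarter of the weights of the direct 5x5 convolution.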
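(4) Principle 4:
Balance the width and depth of the network. Increasing both together improves accuracy and computational efficiency. In VGG16, by contrast, most of the parameters are concentrated in the fully connected layers, which helps neither. Inception spreads its parameters evenly across the layers so that width and depth are better balanced, and both efficiency and accuracy benefit.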
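The remark about VGG16 can be checked by counting parameters layer by layer. The sketch assumes the standard VGG16 configuration (thirteen 3x3 convolutions, a 224x224 input that reaches the classifier as a 7x7x512 map, and three fully connected layers ending in 1000 classes); the layer table is written out here by hand and is not taken from the original post.

```python
# (input channels, output channels) of the 13 conv layers, all 3x3 kernels.
VGG16_CONVS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]
# (input units, output units) of the three fully connected layers.
VGG16_DENSE = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

# Weights plus one bias per output channel/unit.
conv_total = sum(3 * 3 * c_in * c_out + c_out for c_in, c_out in VGG16_CONVS)
dense_total = sum(n_in * n_out + n_out for n_in, n_out in VGG16_DENSE)
dense_share = dense_total / (conv_total + dense_total)

print(conv_total, dense_total, f"{dense_share:.1%}")
```

Under these assumptions close to nine tenths of the parameters sit in the fully connected layers, which is the imbalance Principle 4 argues against.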
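6. Summary
(1) GoogLeNet succeeds largely because it makes heavy use of 1x1 convolutions for dimensionality reduction, which lowers the computation and the parameter count (a 1x1 convolution can be viewed as a special case of convolution factorization that improves computational efficiency).
(2) Convolutions over neighboring receptive fields are highly correlated, and using a 1x1 convolution helps preserve the correlation between neighboring units.
(3) InceptionV1 has two auxiliary classifiers. A model with auxiliary classifier heads reaches higher accuracy only near the end of training; the heads do not help the model converge faster. InceptionV3 therefore removes the shallower auxiliary classifier, and removing it has no noticeable effect.
7. Network implementation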
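import os
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Model

# GPU memory options. Note: they only take effect if `config` is passed to a
# tf.compat.v1.Session; nothing in this listing does so.
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True  # let TensorFlow allocate GPU memory on demand
config.gpu_options.per_process_gpu_memory_fraction = 0.5  # cap the fraction of GPU memory used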
# Filter counts for the three groups of Inception modules.
# The meaning of each list position is documented inside InceptionV3().
inceptionV3_One = {
    '1a': [64, 48, 64, 96, 96, 32],
    '2a': [64, 48, 64, 96, 96, 64],
    '3a': [64, 48, 64, 96, 96, 64],
}
inceptionV3_Two = {
    '1b': [192, 128, 128, 192, 128, 128, 128, 128, 192, 192],
    '2b': [192, 160, 160, 192, 160, 160, 160, 160, 192, 192],
    '3b': [192, 160, 160, 192, 160, 160, 160, 160, 192, 192],
    '4b': [192, 192, 192, 192, 192, 192, 192, 192, 192, 192],
}
keys_two = list(inceptionV3_Two.keys())
inceptionV3_Three = {
    '1c': [320, 384, 384, 384, 448, 384, 384, 384, 192],
    '2c': [320, 384, 384, 384, 448, 384, 384, 384, 192],
}
keys_three = list(inceptionV3_Three.keys())
def InceptionV3(inceptionV3_One, inceptionV3_Two, inceptionV3_Three):
    keys_one = list(inceptionV3_One.keys())
    keys_two = list(inceptionV3_Two.keys())
    keys_three = list(inceptionV3_Three.keys())

    inputs = layers.Input(shape=[299, 299, 3])

    # Stem: 299x299x3 -> 35x35x192
    conv1_one = layers.Conv2D(32, kernel_size=[3, 3], strides=[2, 2], padding='valid')(inputs)
    conv1_batch = layers.BatchNormalization()(conv1_one)
    conv1relu = layers.Activation('relu')(conv1_batch)
    conv2_one = layers.Conv2D(32, kernel_size=[3, 3], strides=[1, 1], padding='valid')(conv1relu)
    conv2_batch = layers.BatchNormalization()(conv2_one)
    conv2relu = layers.Activation('relu')(conv2_batch)
    conv3_padded = layers.Conv2D(64, kernel_size=[3, 3], strides=[1, 1], padding='same')(conv2relu)
    conv3_batch = layers.BatchNormalization()(conv3_padded)
    con3relu = layers.Activation('relu')(conv3_batch)
    pool1_one = layers.MaxPool2D(pool_size=[3, 3], strides=[2, 2])(con3relu)
    conv4_one = layers.Conv2D(80, kernel_size=[3, 3], strides=[1, 1], padding='valid')(pool1_one)
    conv4_batch = layers.BatchNormalization()(conv4_one)
    conv4relu = layers.Activation('relu')(conv4_batch)
    conv5_one = layers.Conv2D(192, kernel_size=[3, 3], strides=[2, 2], padding='valid')(conv4relu)
    conv5_batch = layers.BatchNormalization()(conv5_one)
    x = layers.Activation('relu')(conv5_batch)

    # Inception module 1 (35x35 grid), repeated 3 times. Filter list f:
    #   f[0]: 1x1 branch
    #   f[1]: 1x1 reduction before the 5x5 convolution
    #   f[2]: 5x5 convolution
    #   f[3]: 1x1 reduction before the two 3x3 convolutions (the 5x5 replacement)
    #   f[4]: each of the two 3x3 convolutions
    #   f[5]: 1x1 convolution after the pooling layer (average pooling in this code)
    for i in range(3):
        f = inceptionV3_One[keys_one[i]]
        conv11 = layers.Conv2D(f[0], kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
        batchnormaliztion11 = layers.BatchNormalization()(conv11)
        conv11relu = layers.Activation('relu')(batchnormaliztion11)

        conv13 = layers.Conv2D(f[1], kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
        batchnormaliztion13 = layers.BatchNormalization()(conv13)
        conv13relu = layers.Activation('relu')(batchnormaliztion13)
        conv33 = layers.Conv2D(f[2], kernel_size=[5, 5], strides=[1, 1], padding='same')(conv13relu)
        batchnormaliztion33 = layers.BatchNormalization()(conv33)
        conv33relu = layers.Activation('relu')(batchnormaliztion33)

        conv1533 = layers.Conv2D(f[3], kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
        batchnormaliztion1533 = layers.BatchNormalization()(conv1533)
        conv1522relu = layers.Activation('relu')(batchnormaliztion1533)
        conv5533first = layers.Conv2D(f[4], kernel_size=[3, 3], strides=[1, 1], padding='same')(conv1522relu)
        batchnormaliztion5533first = layers.BatchNormalization()(conv5533first)
        conv5533firstrelu = layers.Activation('relu')(batchnormaliztion5533first)
        conv5533last = layers.Conv2D(f[4], kernel_size=[3, 3], strides=[1, 1], padding='same')(conv5533firstrelu)
        batchnormaliztion5533last = layers.BatchNormalization()(conv5533last)
        conv5533lastrelu = layers.Activation('relu')(batchnormaliztion5533last)

        maxpool = layers.AveragePooling2D(pool_size=[3, 3], strides=[1, 1], padding='same')(x)
        maxconv11 = layers.Conv2D(f[5], kernel_size=[1, 1], strides=[1, 1], padding='same')(maxpool)
        batchnormaliztionpool = layers.BatchNormalization()(maxconv11)
        convmaxrelu = layers.Activation('relu')(batchnormaliztionpool)

        x = layers.concatenate([
            conv11relu, conv33relu, conv5533lastrelu, convmaxrelu
        ], axis=3)

    # Grid reduction 35x35 -> 17x17: two stride-2 conv branches and a pooling branch in parallel
    conv1_two = layers.Conv2D(384, kernel_size=[3, 3], strides=[2, 2], padding='valid')(x)
    conv1batch = layers.BatchNormalization()(conv1_two)
    conv1_tworelu = layers.Activation('relu')(conv1batch)

    conv2_two = layers.Conv2D(64, kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
    conv2batch = layers.BatchNormalization()(conv2_two)
    conv2_tworelu = layers.Activation('relu')(conv2batch)
    conv3_two = layers.Conv2D(96, kernel_size=[3, 3], strides=[1, 1], padding='same')(conv2_tworelu)
    conv3batch = layers.BatchNormalization()(conv3_two)
    conv3_tworelu = layers.Activation('relu')(conv3batch)
    conv4_two = layers.Conv2D(96, kernel_size=[3, 3], strides=[2, 2], padding='valid')(conv3_tworelu)
    conv4batch = layers.BatchNormalization()(conv4_two)
    conv4_tworelu = layers.Activation('relu')(conv4batch)

    maxpool = layers.MaxPool2D(pool_size=[3, 3], strides=[2, 2])(x)
    x = layers.concatenate([
        conv1_tworelu, conv4_tworelu, maxpool
    ], axis=3)

    # Inception module 2 (17x17 grid), repeated 4 times. Filter list f:
    #   f[0]: 1x1 branch
    #   f[1]: 1x1 reduction before the 1x7, 7x1 pair
    #   f[2], f[3]: 1x7 and 7x1 convolutions
    #   f[4]: 1x1 reduction before the two 1x7, 7x1 pairs
    #   f[5]..f[8]: 1x7, 7x1, 1x7, 7x1 convolutions
    #   f[9]: 1x1 convolution after the pooling layer (average pooling in this code)
    for i in range(4):
        f = inceptionV3_Two[keys_two[i]]
        conv11 = layers.Conv2D(f[0], kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
        batchnormaliztion11 = layers.BatchNormalization()(conv11)
        conv11relu = layers.Activation('relu')(batchnormaliztion11)

        conv13 = layers.Conv2D(f[1], kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
        batchnormaliztion13 = layers.BatchNormalization()(conv13)
        conv13relu = layers.Activation('relu')(batchnormaliztion13)
        conv3313 = layers.Conv2D(f[2], kernel_size=[1, 7], strides=[1, 1], padding='same')(conv13relu)
        batchnormaliztion3313 = layers.BatchNormalization()(conv3313)
        conv3313relu = layers.Activation('relu')(batchnormaliztion3313)
        conv3331 = layers.Conv2D(f[3], kernel_size=[7, 1], strides=[1, 1], padding='same')(conv3313relu)
        batchnormaliztion3331 = layers.BatchNormalization()(conv3331)
        conv3331relu = layers.Activation('relu')(batchnormaliztion3331)

        conv15 = layers.Conv2D(f[4], kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
        batchnormaliztion15 = layers.BatchNormalization()(conv15)
        conv15relu = layers.Activation('relu')(batchnormaliztion15)
        conv1513first = layers.Conv2D(f[5], kernel_size=[1, 7], strides=[1, 1], padding='same')(conv15relu)
        batchnormaliztion1513first = layers.BatchNormalization()(conv1513first)
        conv1513firstrelu = layers.Activation('relu')(batchnormaliztion1513first)
        conv1531second = layers.Conv2D(f[6], kernel_size=[7, 1], strides=[1, 1], padding='same')(conv1513firstrelu)
        batchnormaliztion1531second = layers.BatchNormalization()(conv1531second)
        conv1531secondrelu = layers.Activation('relu')(batchnormaliztion1531second)
        conv1513third = layers.Conv2D(f[7], kernel_size=[1, 7], strides=[1, 1], padding='same')(conv1531secondrelu)
        batchnormaliztion1513third = layers.BatchNormalization()(conv1513third)
        conv1513thirdrelu = layers.Activation('relu')(batchnormaliztion1513third)
        conv1531last = layers.Conv2D(f[8], kernel_size=[7, 1], strides=[1, 1], padding='same')(conv1513thirdrelu)
        batchnormaliztion1531last = layers.BatchNormalization()(conv1531last)
        conv1531lastrelu = layers.Activation('relu')(batchnormaliztion1531last)

        maxpool = layers.AveragePooling2D(pool_size=[3, 3], strides=[1, 1], padding='same')(x)
        maxconv11 = layers.Conv2D(f[9], kernel_size=[1, 1], strides=[1, 1], padding='same')(maxpool)
        maxconv11relu = layers.BatchNormalization()(maxconv11)
        maxconv11relu = layers.Activation('relu')(maxconv11relu)

        x = layers.concatenate([
            conv11relu, conv3331relu, conv1531lastrelu, maxconv11relu
        ], axis=3)

    # Grid reduction 17x17 -> 8x8
    conv11_three = layers.Conv2D(192, kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
    conv11batch = layers.BatchNormalization()(conv11_three)
    conv11relu = layers.Activation('relu')(conv11batch)
    conv33_three = layers.Conv2D(320, kernel_size=[3, 3], strides=[2, 2], padding='valid')(conv11relu)
    conv33batch = layers.BatchNormalization()(conv33_three)
    conv33relu = layers.Activation('relu')(conv33batch)

    conv7711_three = layers.Conv2D(192, kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
    conv77batch = layers.BatchNormalization()(conv7711_three)
    conv77relu = layers.Activation('relu')(conv77batch)
    conv7717_three = layers.Conv2D(192, kernel_size=[1, 7], strides=[1, 1], padding='same')(conv77relu)
    conv7717batch = layers.BatchNormalization()(conv7717_three)
    conv7717relu = layers.Activation('relu')(conv7717batch)
    conv7771_three = layers.Conv2D(192, kernel_size=[7, 1], strides=[1, 1], padding='same')(conv7717relu)
    conv7771batch = layers.BatchNormalization()(conv7771_three)
    conv7771relu = layers.Activation('relu')(conv7771batch)
    conv33_three = layers.Conv2D(192, kernel_size=[3, 3], strides=[2, 2], padding='valid')(conv7771relu)
    conv3377batch = layers.BatchNormalization()(conv33_three)
    conv3377relu = layers.Activation('relu')(conv3377batch)

    convmax_three = layers.MaxPool2D(pool_size=[3, 3], strides=[2, 2])(x)
    x = layers.concatenate([
        conv33relu, conv3377relu, convmax_three
    ], axis=3)

    # Inception module 3 (8x8 grid), repeated 2 times. Filter list f:
    #   f[0]: 1x1 branch
    #   f[1]: 1x1 reduction before the parallel 1x3 and 3x1 convolutions
    #   f[2], f[3]: parallel 1x3 and 3x1 convolutions
    #   f[4]: 1x1 reduction before the 3x3 convolution
    #   f[5]: 3x3 convolution
    #   f[6], f[7]: parallel 1x3 and 3x1 convolutions after the 3x3 convolution
    #   f[8]: 1x1 convolution after the pooling layer (average pooling in this code)
    for i in range(2):
        f = inceptionV3_Three[keys_three[i]]
        conv11 = layers.Conv2D(f[0], kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
        batchnormaliztion11 = layers.BatchNormalization()(conv11)
        conv11relu = layers.Activation('relu')(batchnormaliztion11)

        conv13 = layers.Conv2D(f[1], kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
        batchnormaliztion13 = layers.BatchNormalization()(conv13)
        conv13relu = layers.Activation('relu')(batchnormaliztion13)
        conv33left = layers.Conv2D(f[2], kernel_size=[1, 3], strides=[1, 1], padding='same')(conv13relu)
        batchnormaliztion33left = layers.BatchNormalization()(conv33left)
        conv33leftrelu = layers.Activation('relu')(batchnormaliztion33left)
        # Fix: the 3x1 conv runs in parallel with the 1x3 conv, so it also reads conv13relu
        conv33right = layers.Conv2D(f[3], kernel_size=[3, 1], strides=[1, 1], padding='same')(conv13relu)
        batchnormaliztion33right = layers.BatchNormalization()(conv33right)
        conv33rightrelu = layers.Activation('relu')(batchnormaliztion33right)
        conv33rightleft = layers.concatenate([
            conv33leftrelu, conv33rightrelu
        ], axis=3)

        conv15 = layers.Conv2D(f[4], kernel_size=[1, 1], strides=[1, 1], padding='same')(x)
        batchnormaliztion15 = layers.BatchNormalization()(conv15)
        conv15relu = layers.Activation('relu')(batchnormaliztion15)
        conv1533 = layers.Conv2D(f[5], kernel_size=[3, 3], strides=[1, 1], padding='same')(conv15relu)
        batchnormaliztion1533 = layers.BatchNormalization()(conv1533)
        conv1533relu = layers.Activation('relu')(batchnormaliztion1533)
        conv1533left = layers.Conv2D(f[6], kernel_size=[1, 3], strides=[1, 1], padding='same')(conv1533relu)
        batchnormaliztion1533left = layers.BatchNormalization()(conv1533left)
        conv1533leftrelu = layers.Activation('relu')(batchnormaliztion1533left)
        # Fix: parallel branch again, reading conv1533relu and using f[7]
        conv1533right = layers.Conv2D(f[7], kernel_size=[3, 1], strides=[1, 1], padding='same')(conv1533relu)
        batchnormaliztion1533right = layers.BatchNormalization()(conv1533right)
        conv1533rightrelu = layers.Activation('relu')(batchnormaliztion1533right)
        # Fix: concatenate the two activated branches, not a conv output with its own ReLU
        conv1533leftright = layers.concatenate([
            conv1533leftrelu, conv1533rightrelu
        ], axis=3)

        maxpool = layers.AveragePooling2D(pool_size=[3, 3], strides=[1, 1], padding='same')(x)
        maxconv11 = layers.Conv2D(f[8], kernel_size=[1, 1], strides=[1, 1], padding='same')(maxpool)
        batchnormaliztionpool = layers.BatchNormalization()(maxconv11)
        maxrelu = layers.Activation('relu')(batchnormaliztionpool)

        x = layers.concatenate([
            conv11relu, conv33rightleft, conv1533leftright, maxrelu
        ], axis=3)

    # Classifier head: global average pooling, 1000-way dense layer, softmax
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(1000)(x)
    softmax = layers.Activation('softmax')(x)
    model_inceptionV3 = Model(inputs=inputs, outputs=softmax, name='InceptionV3')
    return model_inceptionV3

model_inceptionV3=InceptionV3(inceptionV3_One,inceptionV3_Two,inceptionV3_Three)
model_inceptionV3.summary()
Model output:
(The middle part of the summary was not captured in the screenshots.)
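Since the summary screenshots are incomplete, the key shapes can be traced by hand instead. The sketch below uses plain Python only. It assumes the Keras output-size rules for 'valid' and 'same' padding, reads the stem layers and filter counts off the listing above, and introduces the helper `out_size` purely for illustration.

```python
def out_size(n, kernel, stride, padding):
    """Spatial output size of a conv or pooling layer on an n x n input."""
    if padding == "same":
        return -(-n // stride)          # ceil(n / stride)
    return (n - kernel) // stride + 1   # 'valid' padding


# (kernel, stride, padding) of the six spatial layers in the stem.
STEM = [(3, 2, "valid"), (3, 1, "valid"), (3, 1, "same"),
        (3, 2, "valid"), (3, 1, "valid"), (3, 2, "valid")]

size = 299
for kernel, stride, padding in STEM:
    size = out_size(size, kernel, stride, padding)
stem_size = size

grid_a = stem_size                         # grid of Inception module 1
grid_b = out_size(grid_a, 3, 2, "valid")   # after the first grid reduction
grid_c = out_size(grid_b, 3, 2, "valid")   # after the second grid reduction

# Channels after each stage: sum of the concatenated branch widths.
channels_a = 64 + 64 + 96 + 64             # last module-1 block ('3a')
channels_red_a = 384 + 96 + channels_a     # two conv branches + max pooling
channels_b = 192 + 192 + 192 + 192         # every module-2 block
channels_red_b = 320 + 192 + channels_b
channels_c = 320 + 2 * 384 + 2 * 384 + 192  # every module-3 block

print(grid_a, grid_b, grid_c)
print(channels_a, channels_red_a, channels_b, channels_red_b, channels_c)
```

These values can be compared against the output shapes that model_inceptionV3.summary() reports for the concatenation layers.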