About the tf.train.batch function in TensorFlow

silence1214

Published 2017-08-13 12:06:56 · 37980 views

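I have spent the last two days studying the data-reading queues in TensorFlow and, honestly, they are hard to understand. Maybe that is because I have no prior experience with this kind of thing: I started out with Theano, where I wrote everything myself. After two days with the documentation and related material, plus asking a junior labmate back in China, I finally have a bit of a feel for it. Put simply, the computation graph reads its data from a pipeline (a queue). Feeding the pipeline is done with ready-made methods, and so is reading from it. To keep reads from a single pipeline from getting scrambled when several threads are involved, reading requires some thread-management operations. Today I ran a simple experiment: feed in ordered data and check whether it comes out in order. It does, so here is the code: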

import tensorflow as tf
import numpy as np

def generate_data():
    num = 25
    label = np.asarray(range(0, num))
    images = np.random.random([num, 5, 5, 3])
    print('label size :{}, image size {}'.format(label.shape, images.shape))
    return label, images

def get_batch_data():
    label, images = generate_data()
    images = tf.cast(images, tf.float32)
    label = tf.cast(label, tf.int32)
    # Yields one (image, label) pair at a time; shuffle=False keeps the original order.
    input_queue = tf.train.slice_input_producer([images, label], shuffle=False)
    # Groups the single examples into batches of 10 using one enqueue thread.
    image_batch, label_batch = tf.train.batch(input_queue, batch_size=10, num_threads=1, capacity=64)
    return image_batch, label_batch

image_batch, label_batch = get_batch_data()
with tf.Session() as sess:
    # The coordinator manages the queue-runner threads that fill the input queues.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        # num_epochs is not set, so the producer cycles through the data
        # indefinitely and this loop runs until it is interrupted.
        while not coord.should_stop():
            image_batch_v, label_batch_v = sess.run([image_batch, label_batch])
            for j in range(10):
                print(image_batch_v.shape, label_batch_v[j])
    except tf.errors.OutOfRangeError:
        print("done")
    finally:
        coord.request_stop()
    coord.join(threads)

Remember that slice_input_producer shuffles by default.
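To make the "pipeline" idea concrete, here is a minimal plain-Python analogy of what the experiment above checks. It is only a sketch, not how TensorFlow implements its queues: `slice_producer` and `read_batches` are hypothetical helpers, and a standard-library `queue.Queue` filled by one background thread stands in for the input queue.

```python
import queue
import random
import threading


def slice_producer(data, q, stop, shuffle=False, seed=0):
    """Hypothetical stand-in for an input producer: enqueue one example
    at a time, cycling over the data until asked to stop."""
    rng = random.Random(seed)
    while not stop.is_set():
        order = list(range(len(data)))
        if shuffle:
            rng.shuffle(order)  # a fresh permutation for every epoch
        for idx in order:
            # Retry with a timeout so the thread can notice the stop request
            # even while the queue is full.
            while not stop.is_set():
                try:
                    q.put(data[idx], timeout=0.05)
                    break
                except queue.Full:
                    continue
            if stop.is_set():
                return


def read_batches(data, batch_size, num_batches, shuffle=False):
    """Start one producer thread, then dequeue num_batches batches."""
    q = queue.Queue(maxsize=64)   # plays the role of capacity=64
    stop = threading.Event()      # plays the role of the coordinator
    worker = threading.Thread(
        target=slice_producer, args=(data, q, stop, shuffle), daemon=True)
    worker.start()
    try:
        return [[q.get() for _ in range(batch_size)]
                for _ in range(num_batches)]
    finally:
        stop.set()
        worker.join()


ordered = read_batches(list(range(25)), batch_size=10, num_batches=3)
print(ordered[2])  # [20, 21, 22, 23, 24, 0, 1, 2, 3, 4]

shuffled = read_batches(list(range(25)), batch_size=5, num_batches=5,
                        shuffle=True)
```

With a single producer thread and no shuffling, the labels come out in their original order, and the third batch wraps around into the next pass over the data. With `shuffle=True`, each pass is a permutation of the data, so the order inside a batch is no longer predictable.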

Besides, I would like to add two comments on this code.
1: slice_input_producer has a parameter 'num_epochs' that controls how many epochs the producer runs. Once the specified number of epochs has been produced, reading from the queue raises OutOfRangeError. I think this is useful for controlling the number of training epochs.
2: The output of this method is a single image. We can process that single image with the TensorFlow API (normalization, cropping, and so on) and then feed it to the batch method, which returns a batch of images for training or testing.
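The two points above can be sketched in plain Python. This is only an analogy under stated assumptions, not TensorFlow code: `OutOfRange`, `slice_producer`, `normalize`, and `batch` are hypothetical names, the "images" are two-element lists, and the batching step is assumed to drop an incomplete final batch.

```python
import itertools


class OutOfRange(Exception):
    """Hypothetical stand-in for tf.errors.OutOfRangeError."""


def slice_producer(examples, num_epochs):
    """Yield one (image, label) pair at a time for num_epochs passes."""
    for _ in range(num_epochs):
        for example in examples:
            yield example


def normalize(image):
    """Per-example preprocessing: shift one image to zero mean."""
    mean = sum(image) / len(image)
    return [pixel - mean for pixel in image]


def batch(stream, batch_size):
    """Group single examples into full batches; signal the end of the data
    once a full batch can no longer be assembled."""
    it = iter(stream)
    while True:
        items = list(itertools.islice(it, batch_size))
        if len(items) < batch_size:
            raise OutOfRange()
        yield items


# 25 toy examples: a two-pixel "image" and an integer label.
examples = [([float(i), float(i) + 2.0], i) for i in range(25)]

# Preprocess each single example before it reaches the batching step.
stream = ((normalize(image), label)
          for image, label in slice_producer(examples, num_epochs=2))

seen_labels = []
first_image = None
finished = False
try:
    for items in batch(stream, batch_size=10):
        if first_image is None:
            first_image = items[0][0]
        seen_labels.append([label for _, label in items])
except OutOfRange:
    finished = True  # the epoch limit was reached

print(len(seen_labels), finished)  # 5 True
```

Two epochs of 25 examples give exactly five batches of 10 before the end-of-data signal. In TensorFlow 1.x itself, setting num_epochs creates a local epoch counter, so tf.local_variables_initializer() has to be run in the session before the queue runners are started.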
