This article shows how to use a recurrent neural network, specifically an LSTM network, to estimate a sequence of vectors. Most LSTM examples I found online deal with natural language processing problems, and I could not find one that predicts sequences of continuous values, so I wrote this article.


The task, then, is to predict a series of continuous real values from historical observations. Traditional feed-forward neural networks cannot do this, but recurrent neural networks can, because they store historical information that helps predict future events.


In the examples below we will try to predict a set of functions: sin, cos, and x*sin(x).

First, let's build a model, lstm_model. The model is a stack of LSTM cells unrolled over the time steps, followed by an optional dense network.

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.metrics import mean_squared_error
from tensorflow.contrib import learn
from tensorflow.python.framework import dtypes


def lstm_model(time_steps, rnn_layers, dense_layers=None):
    """
    Creates a deep model based on:
        * stacked lstm cells
        * optional dense layers
    :param time_steps: the number of time steps the model will be looking at.
    :param rnn_layers: list of int or dict
                       * list of int: the steps used to instantiate the `BasicLSTMCell` cell
                       * list of dict: [{steps: int, keep_prob: float}, ...]
    :param dense_layers: list of nodes for each layer
    :return: the model definition
    """

    def lstm_cells(layers):
        # wrap a cell in a DropoutWrapper only when a keep_prob is given
        if isinstance(layers[0], dict):
            return [tf.nn.rnn_cell.DropoutWrapper(tf.nn.rnn_cell.BasicLSTMCell(layer['steps'],
                                                                               state_is_tuple=True),
                                                  layer['keep_prob'])
                    if layer.get('keep_prob') else tf.nn.rnn_cell.BasicLSTMCell(layer['steps'],
                                                                                state_is_tuple=True)
                    for layer in layers]
        return [tf.nn.rnn_cell.BasicLSTMCell(steps, state_is_tuple=True) for steps in layers]

    def dnn_layers(input_layers, layers):
        if layers and isinstance(layers, dict):
            return learn.ops.dnn(input_layers,
                                 layers['layers'],
                                 activation=layers.get('activation'),
                                 dropout=layers.get('dropout'))
        elif layers:
            return learn.ops.dnn(input_layers, layers)
        else:
            return input_layers

    def _lstm_model(X, y):
        stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(lstm_cells(rnn_layers), state_is_tuple=True)
        # split the (batch, time_steps, features) input into a list of time_steps tensors
        x_ = learn.ops.split_squeeze(1, time_steps, X)
        output, state = tf.nn.rnn(stacked_lstm, x_, dtype=dtypes.float32)
        output = dnn_layers(output[-1], dense_layers)
        return learn.models.linear_regression(output, y)

    return _lstm_model
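For instance, a hypothetical call (the sizes here are just for illustration, following the parameter formats documented in the docstring) would build a two-layer LSTM with 10 units each, the second layer with dropout, followed by two dense layers:

 # hypothetical example: 10 time steps, two 10-unit LSTM layers, two dense layers
 model_fn = lstm_model(10, [{'steps': 10}, {'steps': 10, 'keep_prob': 0.5}], [10, 10])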

The model therefore expects its input data to have the shape (batch_size, time_steps of the first lstm cell, num_features).
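For illustration, a batch of inputs might look like this (the sizes below are hypothetical):

 # hypothetical batch: 100 windows, 10 time steps each, 1 feature per step
 example_batch = np.zeros((100, 10, 1))  # (batch_size, time_steps, num_features)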

The next step is to reshape our data into the format the model accepts.

def rnn_data(data, time_steps, labels=False):
    """
    creates new data frame based on previous observation
      * example:
        l = [1, 2, 3, 4, 5]
        time_steps = 2
        -> labels == False [[1, 2], [2, 3], [3, 4]]
        -> labels == True [3, 4, 5]
    """
    rnn_df = []
    for i in range(len(data) - time_steps):
        if labels:
            try:
                rnn_df.append(data.iloc[i + time_steps].as_matrix())
            except AttributeError:
                rnn_df.append(data.iloc[i + time_steps])
        else:
            data_ = data.iloc[i: i + time_steps].as_matrix()
            # make sure every window is 2-d: (time_steps, num_features)
            rnn_df.append(data_ if len(data_.shape) > 1 else [[i] for i in data_])
    return np.array(rnn_df)


def split_data(data, val_size=0.1, test_size=0.1):
    """
    splits data to training, validation and testing parts
    """
    ntest = int(round(len(data) * (1 - test_size)))
    nval = int(round(len(data.iloc[:ntest]) * (1 - val_size)))

    df_train, df_val, df_test = data.iloc[:nval], data.iloc[nval:ntest], data.iloc[ntest:]

    return df_train, df_val, df_test


def prepare_data(data, time_steps, labels=False, val_size=0.1, test_size=0.1):
    """
    Given the number of `time_steps` and some data,
    prepares training, validation and test data for an lstm cell.
    """
    df_train, df_val, df_test = split_data(data, val_size, test_size)
    return (rnn_data(df_train, time_steps, labels=labels),
            rnn_data(df_val, time_steps, labels=labels),
            rnn_data(df_test, time_steps, labels=labels))


def generate_data(fct, x, time_steps, separate=False):
    """generates data based on a function fct"""
    data = fct(x)
    if not isinstance(data, pd.DataFrame):
        data = pd.DataFrame(data)
    train_x, val_x, test_x = prepare_data(data['a'] if separate else data, time_steps)
    train_y, val_y, test_y = prepare_data(data['b'] if separate else data, time_steps, labels=True)
    return dict(train=train_x, val=val_x, test=test_x), dict(train=train_y, val=val_y, test=test_y)
This produces data that lets the model look back time_steps steps in the sequence to predict the next value. For example, if the first cell is a 10-time-step cell, then for each prediction we need to feed in 10 historical data points, and the y value to predict is the data point that follows those 10 inputs.
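As a quick sanity check of the windowing, here is what rnn_data produces on a toy series (a hypothetical example, not part of the original code):

 toy = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
 X_toy = rnn_data(toy, time_steps=2)               # [[[1.],[2.]], [[2.],[3.]], [[3.],[4.]]]
 y_toy = rnn_data(toy, time_steps=2, labels=True)  # [3., 4., 5.]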


First, we define the hyperparameters:

LOG_DIR = './ops_logs'
TIMESTEPS = 5
# two stacked LSTM layers; the second one applies dropout with keep_prob=0.5
RNN_LAYERS = [{'steps': TIMESTEPS}, {'steps': TIMESTEPS, 'keep_prob': 0.5}]
DENSE_LAYERS = [2]
TRAINING_STEPS = 130000
BATCH_SIZE = 100
PRINT_STEPS = TRAINING_STEPS / 100

Now we can create a regression model:

# n_classes=0 tells this (old) tf.contrib.learn estimator to do regression
regressor = learn.TensorFlowEstimator(model_fn=lstm_model(TIMESTEPS, RNN_LAYERS, DENSE_LAYERS),
                                      n_classes=0,
                                      verbose=1,
                                      steps=TRAINING_STEPS,
                                      optimizer='Adagrad',
                                      learning_rate=0.03,
                                      batch_size=BATCH_SIZE)

Predicting the sin function:

X, y = generate_data(np.sin, np.linspace(0, 100, 10000), TIMESTEPS, separate=False)
# create a lstm instance and validation monitor
validation_monitor = learn.monitors.ValidationMonitor(X['val'], y['val'],
                                                      every_n_steps=PRINT_STEPS,
                                                      early_stopping_rounds=1000)
regressor.fit(X['train'], y['train'], validation_monitor, logdir=LOG_DIR)
  
 # > last training steps
 # Step #9700, epoch #119, avg. train loss: 0.00082, avg. val loss: 0.00084
 # Step #9800, epoch #120, avg. train loss: 0.00083, avg. val loss: 0.00082
 # Step #9900, epoch #122, avg. train loss: 0.00082, avg. val loss: 0.00082
 # Step #10000, epoch #123, avg. train loss: 0.00081, avg. val loss: 0.00081
Predicting on the test data:

 mse = mean_squared_error(regressor.predict(X['test']), y['test'])
 print ("Error: {}".format(mse))
 # 0.000776
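To eyeball the fit, you could also plot the predictions against the ground truth (a minimal sketch assuming matplotlib; this is not part of the original snippet):

 import matplotlib.pyplot as plt

 predicted = regressor.predict(X['test'])
 plt.plot(y['test'], label='actual')
 plt.plot(predicted, label='predicted')
 plt.legend()
 plt.show()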
Predicting sin and cos simultaneously:
def sin_cos(x):
    return pd.DataFrame(dict(a=np.sin(x), b=np.cos(x)), index=x)


X, y = generate_data(sin_cos, np.linspace(0, 100, 10000), TIMESTEPS, separate=False)
# create a lstm instance and validation monitor
validation_monitor = learn.monitors.ValidationMonitor(X['val'], y['val'],
                                                      every_n_steps=PRINT_STEPS,
                                                      early_stopping_rounds=1000)
regressor.fit(X['train'], y['train'], validation_monitor, logdir=LOG_DIR)
 
# > last training steps
# Step #9500, epoch #117, avg. train loss: 0.00120, avg. val loss: 0.00118
# Step #9600, epoch #118, avg. train loss: 0.00121, avg. val loss: 0.00118
# Step #9700, epoch #119, avg. train loss: 0.00118, avg. val loss: 0.00118
# Step #9800, epoch #120, avg. train loss: 0.00118, avg. val loss: 0.00116
# Step #9900, epoch #122, avg. train loss: 0.00118, avg. val loss: 0.00115
# Step #10000, epoch #123, avg. train loss: 0.00117, avg. val loss: 0.00115
Predicting on the test data:
mse = mean_squared_error(regressor.predict(X['test']), y['test'])
print ("Error: {}".format(mse))
# 0.001144
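Note that generate_data also has a separate flag: looking at its body above, separate=True uses column 'a' (here sin) as the input and column 'b' (here cos) as the target. A hypothetical call would look like this:

 # hypothetical: learn to predict cos (column 'b') from sin (column 'a')
 X, y = generate_data(sin_cos, np.linspace(0, 100, 10000), TIMESTEPS, separate=True)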

I will skip the x*sin(x) example here; the original English article is at:

http://mourafiq.com/2016/05/15/predicting-sequences-using-rnn-in-tensorflow.html





更多推荐