Long Short-Term Memory Neural Network Python Code: LSTM (Long Short-Term Memory Network) and Its TensorFlow Code Application -- 688IT编程网

This article covers:

1. What is LSTM

2. Curve fitting with LSTM

3. Classification with LSTM

4. Why LSTM helps mitigate vanishing gradients

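1. What is LSTM

Long Short-Term Memory (LSTM) networks are a type of recurrent neural network (RNN) that can learn long-term dependencies. Every RNN takes the form of a chain of repeating neural network modules. In a standard RNN, this repeating module has a very simple structure, such as a single tanh layer.

The figure above shows the structure of a standard RNN. An LSTM differs from it; its network structure is shown in the figure:

The icons for the elements in the network are:

An LSTM uses carefully designed structures called "gates" to remove information from, or add information to, the cell state. A gate is a way of letting information through selectively. Each gate consists of a sigmoid neural network layer and a pointwise multiplication. An LSTM has three gates that protect and control the cell state.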

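First is the forget gate:

As shown above, note that the forget gate trains a single weight matrix Wf, and that the previous time step's output and the current time step's input are concatenated before being multiplied by it. The forget gate decides which information to discard from the cell state: because the sigmoid outputs a value between 0 and 1, multiplying by it attenuates each dimension of the cell state.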
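The forget gate described above can be sketched in a few lines of NumPy. This is only an illustration under assumed sizes (4 hidden units, 3 input features) and random weights; the bias name b_f and the helper names are not from the original article.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forget_gate(h_prev, x_t, W_f, b_f):
    """Compute f_t = sigmoid(W_f . [h_prev, x_t] + b_f)."""
    concat = np.concatenate([h_prev, x_t])   # concat of previous output and current input
    return sigmoid(W_f @ concat + b_f)

rng = np.random.default_rng(0)
hidden, inputs = 4, 3                        # illustrative sizes (assumed)
h_prev = rng.standard_normal(hidden)
x_t = rng.standard_normal(inputs)
W_f = rng.standard_normal((hidden, hidden + inputs))
b_f = np.zeros(hidden)

f_t = forget_gate(h_prev, x_t, W_f, b_f)
c_prev = rng.standard_normal(hidden)         # previous cell state
c_kept = f_t * c_prev                        # pointwise attenuation of each dimension
print(f_t, c_kept)
```

Every entry of f_t lies strictly between 0 and 1, so each dimension of c_kept is no larger in magnitude than the corresponding dimension of c_prev.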

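Next is the input gate, which decides what new information is added to the cell state:

Here the sigmoid decides which values to update, and tanh creates a candidate vector Ct of new cell-state values. This step trains two weight matrices, Wi and Wc. Once the first and second gates have been computed, what to delete and what to add is settled, and the cell state can be updated.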
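As a rough sketch of this update, the new cell state is the old state scaled by the forget gate plus the candidate scaled by the input gate. The sizes, the random weights, the bias names, and the fixed forget-gate value of 0.5 below are assumptions made only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def update_cell_state(c_prev, f_t, h_prev, x_t, W_i, b_i, W_c, b_c):
    """Return the new cell state together with the input gate and the candidate."""
    concat = np.concatenate([h_prev, x_t])
    i_t = sigmoid(W_i @ concat + b_i)        # which values to update
    c_cand = np.tanh(W_c @ concat + b_c)     # candidate values in [-1, 1]
    c_t = f_t * c_prev + i_t * c_cand        # drop old information, add new
    return c_t, i_t, c_cand

rng = np.random.default_rng(1)
hidden, inputs = 4, 3                        # illustrative sizes (assumed)
h_prev = rng.standard_normal(hidden)
x_t = rng.standard_normal(inputs)
c_prev = rng.standard_normal(hidden)
W_i = rng.standard_normal((hidden, hidden + inputs))
W_c = rng.standard_normal((hidden, hidden + inputs))
b_i, b_c = np.zeros(hidden), np.zeros(hidden)
f_t = np.full(hidden, 0.5)                   # pretend the forget gate keeps half of each dimension

c_t, i_t, c_cand = update_cell_state(c_prev, f_t, h_prev, x_t, W_i, b_i, W_c, b_c)
print(c_t)
```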

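The third gate is the output gate:

A sigmoid decides which parts of the cell state will be output. The cell state is passed through tanh, giving values between -1 and 1, and the result is multiplied by the output of the sigmoid gate, so that only the selected parts are output.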
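Putting the three gates together, one time step of an LSTM cell might look like the following NumPy sketch. It is not TensorFlow's implementation; the output-gate names W_o and b_o, the parameter dictionary, and the sizes are assumptions introduced for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step: forget gate, input gate, then output gate."""
    concat = np.concatenate([h_prev, x_t])
    f_t = sigmoid(params["W_f"] @ concat + params["b_f"])     # forget gate
    i_t = sigmoid(params["W_i"] @ concat + params["b_i"])     # input gate
    c_cand = np.tanh(params["W_c"] @ concat + params["b_c"])  # candidate values
    c_t = f_t * c_prev + i_t * c_cand                         # new cell state
    o_t = sigmoid(params["W_o"] @ concat + params["b_o"])     # output gate
    h_t = o_t * np.tanh(c_t)                                  # selected part of the cell state
    return h_t, c_t

rng = np.random.default_rng(0)
hidden, inputs, steps = 4, 3, 5              # illustrative sizes (assumed)
params = {}
for gate in "fico":
    params["W_" + gate] = rng.standard_normal((hidden, hidden + inputs)) * 0.1
    params["b_" + gate] = np.zeros(hidden)

h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.standard_normal((steps, inputs)):
    h, c = lstm_step(x_t, h, c, params)      # the same weights are reused at every step
print(h)
```

Because h is a sigmoid output multiplied by a tanh output, every entry of h stays strictly inside (-1, 1), whatever the cell state grows to.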

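2. Curve fitting with LSTM

2.1 Stock price prediction

The following is a regression example, widely circulated online, that uses an LSTM to predict stock prices. The data:

As shown above, each record contains the fields index_code, date, open, close, low, high, volume, money, and change. The seven features from open through change are used as the network input, and the output is the label. The complete code is as follows: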

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf  # uses the TensorFlow 1.x API (tf.placeholder, tf.Session)

# Constants
rnn_unit = 10     # number of hidden-layer units
input_size = 7
output_size = 1
lr = 0.0006       # learning rate

# ---------- Load the data ----------
df = pd.read_csv('dataset_2.csv')   # read the stock data
data = df.iloc[:, 2:10].values      # take columns 3 to 10

# Build the training set
def get_train_data(batch_size=60, time_step=20, train_begin=0, train_end=5800):
    batch_index = []
    data_train = data[train_begin:train_end]
    # Standardize each column to zero mean and unit variance
    normalized_train_data = (data_train - np.mean(data_train, axis=0)) / np.std(data_train, axis=0)
    train_x, train_y = [], []   # training samples
    for i in range(len(normalized_train_data) - time_step):
        if i % batch_size == 0:
            batch_index.append(i)   # start index of each batch
        x = normalized_train_data[i:i + time_step, :7]
        y = normalized_train_data[i:i + time_step, 7, np.newaxis]
        train_x.append(x.tolist())
        train_y.append(y.tolist())
    batch_index.append(len(normalized_train_data) - time_step)
    return batch_index, train_x, train_y

# Build the test set
def get_test_data(time_step=20, test_begin=5800):
    data_test = data[test_begin:]
    mean = np.mean(data_test, axis=0)
    std = np.std(data_test, axis=0)
    normalized_test_data = (data_test - mean) / std   # standardize
    size = (len(normalized_test_data) + time_step - 1) // time_step   # number of samples
    test_x, test_y = [], []
    for i in range(size - 1):
        x = normalized_test_data[i * time_step:(i + 1) * time_step, :7]
        y = normalized_test_data[i * time_step:(i + 1) * time_step, 7]
        test_x.append(x.tolist())
        test_y.extend(y)
    # The last, possibly shorter, chunk
    test_x.append((normalized_test_data[(i + 1) * time_step:, :7]).tolist())
    test_y.extend((normalized_test_data[(i + 1) * time_step:, 7]).tolist())
    return mean, std, test_x, test_y

# ---------- Define the network variables ----------
# Weights and biases of the input and output layers
weights = {
    'in': tf.Variable(tf.random_normal([input_size, rnn_unit])),
    'out': tf.Variable(tf.random_normal([rnn_unit, 1]))
}
biases = {
    'in': tf.Variable(tf.constant(0.1, shape=[rnn_unit, ])),
    'out': tf.Variable(tf.constant(0.1, shape=[1, ]))
}

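# ---------- Define the network ----------
def lstm(X):
    batch_size = tf.shape(X)[0]
    time_step = tf.shape(X)[1]
    w_in = weights['in']
    b_in = biases['in']
    # Flatten to 2-D for the matmul; the result feeds the hidden layer
    inputs_2d = tf.reshape(X, [-1, input_size])
    input_rnn = tf.matmul(inputs_2d, w_in) + b_in
    # Back to 3-D as the input of the LSTM cell
    input_rnn = tf.reshape(input_rnn, [-1, time_step, rnn_unit])
    # AUTO_REUSE lets train_lstm() and prediction() build on the same LSTM weights in one graph
    with tf.variable_scope('lstm', reuse=tf.AUTO_REUSE):
        cell = tf.nn.rnn_cell.BasicLSTMCell(rnn_unit)
        init_state = cell.zero_state(batch_size, dtype=tf.float32)
        # output_rnn holds the output at every time step; final_states is the state after the last step
        output_rnn, final_states = tf.nn.dynamic_rnn(cell, input_rnn, initial_state=init_state, dtype=tf.float32)
    output = tf.reshape(output_rnn, [-1, rnn_unit])   # input of the output layer
    w_out = weights['out']
    b_out = biases['out']
    pred = tf.matmul(output, w_out) + b_out
    return pred, final_states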

# ---------- Train the model ----------
def train_lstm(batch_size=80, time_step=15, train_begin=2000, train_end=5800):
    X = tf.placeholder(tf.float32, shape=[None, time_step, input_size])
    Y = tf.placeholder(tf.float32, shape=[None, time_step, output_size])
    # Windows start at training samples 2001 to 5785, each covering 15 time steps
    batch_index, train_x, train_y = get_train_data(batch_size, time_step, train_begin, train_end)
    print(np.array(train_x).shape)   # (3785, 15, 7)
    print(batch_index)
    # By analogy with text: 3785 sentences of 15 words each, every word a 7-feature
    # embedding, trained 80 sentences at a time
    pred, _ = lstm(X)
    # Loss function: mean squared error
    loss = tf.reduce_mean(tf.square(tf.reshape(pred, [-1]) - tf.reshape(Y, [-1])))
    train_op = tf.train.AdamOptimizer(lr).minimize(loss)
    saver = tf.train.Saver(tf.global_variables(), max_to_keep=15)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Train for 200 epochs
        for i in range(200):
            # Each step feeds one batch of batch_size samples
            for step in range(len(batch_index) - 1):
                _, loss_ = sess.run([train_op, loss], feed_dict={
                    X: train_x[batch_index[step]:batch_index[step + 1]],
                    Y: train_y[batch_index[step]:batch_index[step + 1]]})
            print(i, loss_)
            if i % 200 == 0 or i == 199:   # also keep the final epoch
                # Checkpoint file name is assumed; it goes in the 'model' directory read by prediction()
                print("Model saved:", saver.save(sess, 'model/model', global_step=i))

train_lstm()

# ---------- Predict ----------
def prediction(time_step=20):
    X = tf.placeholder(tf.float32, shape=[None, time_step, input_size])
    mean, std, test_x, test_y = get_test_data(time_step)
    pred, _ = lstm(X)
    saver = tf.train.Saver(tf.global_variables())
    with tf.Session() as sess:
        # Restore the trained parameters
        module_file = tf.train.latest_checkpoint('model')
        saver.restore(sess, module_file)
        test_predict = []
        for step in range(len(test_x) - 1):
            prob = sess.run(pred, feed_dict={X: [test_x[step]]})
            predict = prob.reshape((-1))
            test_predict.extend(predict)
        # Undo the standardization of the label column
        test_y = np.array(test_y) * std[7] + mean[7]
        test_predict = np.array(test_predict) * std[7] + mean[7]
        # Mean relative deviation
        acc = np.average(np.abs(test_predict - test_y[:len(test_predict)]) / test_y[:len(test_predict)])
        # Plot the result as line charts
        plt.figure()
        plt.plot(list(range(len(test_predict))), test_predict, color='b')
        plt.plot(list(range(len(test_y))), test_y, color='r')
        plt.show()

prediction()
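To see how the training windows and batch boundaries come out, the windowing logic of get_train_data can be tried on synthetic data without TensorFlow. The sketch below assumes a random array of 3800 rows and 8 columns standing in for data[2000:5800]; the function name make_windows is introduced here only for illustration.

```python
import numpy as np

def make_windows(data, time_step, batch_size, n_features=7):
    """Standardize the columns, then cut overlapping windows of time_step rows."""
    normalized = (data - data.mean(axis=0)) / data.std(axis=0)
    n_windows = len(normalized) - time_step
    xs = np.stack([normalized[i:i + time_step, :n_features]
                   for i in range(n_windows)])
    ys = np.stack([normalized[i:i + time_step, n_features, np.newaxis]
                   for i in range(n_windows)])
    # Start index of every batch, plus the end of the last batch
    batch_index = list(range(0, n_windows, batch_size)) + [n_windows]
    return batch_index, xs, ys

rng = np.random.default_rng(0)
fake_data = rng.standard_normal((3800, 8))   # synthetic stand-in for data[2000:5800]
batch_index, xs, ys = make_windows(fake_data, time_step=15, batch_size=80)

print(xs.shape)          # (3785, 15, 7)
print(ys.shape)          # (3785, 15, 1)
print(batch_index[:3])   # [0, 80, 160]
print(batch_index[-1])   # 3785
```

Consecutive windows overlap by time_step - 1 rows, which is why 3800 rows yield 3785 samples rather than about 253.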

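The process is not hard to follow. Below we trace the dimension changes in it to build a better understanding of the LSTM.

How the RNN is built can be understood from the dimensions of the input tensor. Here we use dynamic_rnn (note how its usage differs from static_rnn):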
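As a quick check of those dimension changes, the reshapes in lstm() can be replayed with NumPy arrays of the same sizes. The batch size of 80 and time step of 15 are the train_lstm defaults; the zero-filled arrays are placeholders, and the LSTM layer itself is replaced by a stand-in that only preserves the shape.

```python
import numpy as np

batch_size, time_step, input_size, rnn_unit = 80, 15, 7, 10

X = np.zeros((batch_size, time_step, input_size), dtype=np.float32)
w_in = np.zeros((input_size, rnn_unit), dtype=np.float32)
w_out = np.zeros((rnn_unit, 1), dtype=np.float32)

flat_in = X.reshape(-1, input_size)                      # (1200, 7): 2-D for the matmul
hidden_in = flat_in @ w_in                               # (1200, 10)
input_rnn = hidden_in.reshape(-1, time_step, rnn_unit)   # (80, 15, 10): 3-D for the LSTM

# Stand-in for the LSTM layer: its per-step outputs have the same shape as its input here
output_rnn = input_rnn                                   # (80, 15, 10)

flat_out = output_rnn.reshape(-1, rnn_unit)              # (1200, 10)
pred = flat_out @ w_out                                  # (1200, 1): one value per time step

for name, arr in [("flat_in", flat_in), ("input_rnn", input_rnn),
                  ("flat_out", flat_out), ("pred", pred)]:
    print(name, arr.shape)
```

The prediction therefore has one row per sample per time step, which is why the loss flattens both pred and Y before comparing them.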

tf.nn.dynamic_rnn(
    cell,
    inputs,
    sequence_length=None,
    initial_state=None,
    dtype=None,
    parallel_iterations=None,
    swap_memory=False,
    time_major=False,
    scope=None
)

Where:

cell: an RNNCell instance.

inputs: the input to the RNN. If time_major == False (the default), the input has shape [batch_size, max_time, embedding_size]; if time_major == True, it has shape [max_time, batch_size, embedding_size].

initial_state: the initial state of the RNN. The network needs an initial state; for a plain RNN its shape is [batch_size, cell.state_size].
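The two input layouts and the shape of the initial state can be illustrated with a tiny tanh RNN unrolled in NumPy. This is not how dynamic_rnn is implemented; simple_rnn, W_x, W_h, and the sizes are hypothetical names and values chosen only to mirror the shape conventions listed above.

```python
import numpy as np

def simple_rnn(inputs, initial_state, W_x, W_h, time_major=False):
    """Unroll a plain tanh RNN over time, accepting either input layout."""
    if not time_major:
        inputs = np.transpose(inputs, (1, 0, 2))     # -> [max_time, batch_size, embedding_size]
    state = initial_state                            # [batch_size, state_size]
    outputs = []
    for x_t in inputs:                               # one time step per iteration
        state = np.tanh(x_t @ W_x + state @ W_h)
        outputs.append(state)
    outputs = np.stack(outputs)                      # [max_time, batch_size, state_size]
    if not time_major:
        outputs = np.transpose(outputs, (1, 0, 2))   # back to batch-major
    return outputs, state

rng = np.random.default_rng(0)
batch_size, max_time, embedding_size, state_size = 2, 5, 3, 4
W_x = rng.standard_normal((embedding_size, state_size))
W_h = rng.standard_normal((state_size, state_size))
init = np.zeros((batch_size, state_size))

inputs_bm = rng.standard_normal((batch_size, max_time, embedding_size))
outputs_bm, final_bm = simple_rnn(inputs_bm, init, W_x, W_h)

inputs_tm = np.transpose(inputs_bm, (1, 0, 2))       # same data, time-major layout
outputs_tm, final_tm = simple_rnn(inputs_tm, init, W_x, W_h, time_major=True)

print(outputs_bm.shape, outputs_tm.shape, final_bm.shape)
```

Both calls compute the same values; only the order of the first two axes of the inputs and outputs differs, and the final state equals the output at the last time step.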
