Neural network example: AND, OR, and XOR with a (3-1) network
The code below comes from Chapter 10 of Deep Learning for Computer Vision with Python.
The example needs four files created in the same folder: 1. perceptron.py; 2. perceptron_or.py; 3. perceptron_and.py; 4. perceptron_xor.py.
1. perceptron.py
# import the necessary packages
import numpy as np

class Perceptron:
    def __init__(self, N, alpha=0.1):
        # initialize the weight matrix and store the learning rate
        self.W = np.random.randn(N + 1) / np.sqrt(N)
        self.alpha = alpha

    def step(self, x):
        # apply the step function
        return 1 if x > 0 else 0

    def fit(self, X, y, epochs=10):
        # insert a column of 1's as the last entry in the feature
        # matrix -- this little trick allows us to treat the bias
        # as a trainable parameter within the weight matrix
        X = np.c_[X, np.ones((X.shape[0]))]
        # loop over the desired number of epochs
        for epoch in np.arange(0, epochs):
            # loop over each individual data point
            for (x, target) in zip(X, y):
                # take the dot product between the input features
                # and the weight matrix, then pass this value
                # through the step function to obtain the prediction
                p = self.step(np.dot(x, self.W))
                #print("[training] self.W={}, x={}, target={}".format(self.W, x, target))
                # only perform a weight update if our prediction
                # does not match the target
                if p != target:
                    # determine the error
                    error = p - target
                    # update the weight matrix
                    self.W += -self.alpha * error * x

    def predict(self, X, addBias=True):
        # ensure our input is a matrix
        X = np.atleast_2d(X)
        # check to see if the bias column should be added
        if addBias:
            # insert a column of 1's as the last entry in the feature
            # matrix (bias)
            X = np.c_[X, np.ones((X.shape[0]))]
        # take the dot product between the input features and the
        # weight matrix, then pass the value through the step
        # function
        return self.step(np.dot(X, self.W))
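Analysis: the Perceptron class is a neural network with a (3-1) structure. (3-1) means the input layer has 3 neurons (two of them take the input parameters x1 and x2, and the third has a fixed input of 1) and the output layer has 1 neuron. The original post illustrates this with a schematic figure.
Besides the weights (w1 and w2), the weight matrix also holds the bias (b). There are only two real input parameters, with weights w1 and w2; the input layer has 3 neurons so that the bias b can be folded into the weight matrix. When the weight matrix is trained, b is updated along with it. The output is y = step(x1*w1 + x2*w2 + b).
The activation function of the neuron is the step function, which takes one input and returns one output. When the input is greater than 0 the output is 1; otherwise (input less than or equal to 0) the output is 0.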
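The formula above can be checked by hand. The sketch below is only an illustration: the values of w1, w2, and b are hypothetical, picked by inspection so that the unit behaves like OR, and are not weights produced by training. It evaluates y = step(x1*w1 + x2*w2 + b) directly and again with the bias folded into a 3-element weight vector, and confirms that both forms agree.

```python
import numpy as np

def step(z):
    # Same rule as Perceptron.step: 1 for strictly positive input, else 0
    return 1 if z > 0 else 0

# Hypothetical hand-picked parameters (not trained values) that realise OR
w1, w2, b = 1.0, 1.0, -0.5

# Bias trick: append b to the weights and a constant 1 to every input
W = np.array([w1, w2, b])

outputs = []
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    explicit = step(x1 * w1 + x2 * w2 + b)             # bias written out
    folded = step(np.dot(np.array([x1, x2, 1.0]), W))  # bias inside W
    assert explicit == folded
    outputs.append(folded)
    print(x1, x2, folded)
```

Because the two forms always agree, training the 3-element vector W is equivalent to training w1, w2, and b together.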
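fit trains the network. It applies the perceptron learning rule one sample at a time, in the manner of stochastic gradient descent: the weights change only when a prediction is wrong. predict tests a sample by passing it through the network and returning the predicted result.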
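To make the update inside fit concrete, here is one update step worked out numerically. It is a sketch under stated assumptions: the weights start at zero (the class itself starts from random values), and the sample is the OR pair (0, 1) with target 1. The names W_new and p_after are introduced here for illustration only.

```python
import numpy as np

alpha = 0.1
W = np.zeros(3)                  # assumed start: all-zero weights [w1, w2, b]
x = np.array([0.0, 1.0, 1.0])    # sample (x1=0, x2=1) plus the constant bias input 1
target = 1                       # OR(0, 1)

p = 1 if np.dot(x, W) > 0 else 0   # prediction before the update
error = p - target                 # negative when the output was too low
W_new = W - alpha * error * x      # same rule as `self.W += -self.alpha * error * x`

p_after = 1 if np.dot(x, W_new) > 0 else 0
print(W_new, p_after)
```

Only the components of W whose input was non-zero move, and they move in the direction that raises the weighted sum for this sample, so the same sample is classified correctly right after the update.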
2. perceptron_or.py
# import the necessary packages
from perceptron import Perceptron
import numpy as np
# construct the OR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [1]])
# define our perceptron and train it
print("[INFO] ")
p = Perceptron(X.shape[1], alpha=0.1)
p.fit(X, y, epochs=20)
# now that our perceptron is trained we can evaluate it
print("[INFO] ")
# now that our network is trained, loop over the data points
for (x, target) in zip(X, y):
    # make a prediction on the data point and display the result
    # to our console
    pred = p.predict(x)
    print("[INFO] data={}, ground-truth={}, pred={}".format(
        x, target[0], pred))
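For the OR operation, the two inputs and the output are related as shown in the table below.
Logical operation: OR
Input 1 (x1)    Input 2 (x2)    Output (y)
0               0               0
0               1               1
1               0               1
1               1               1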
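Perceptron(...) creates a 2-layer neural network. Its first argument, X.shape[1], is the number of features in each sample of X. alpha is the learning rate, which sets how large each weight update is. The closer it is to 1, the faster the weights change, but a larger value also makes it easier to overshoot the optimum.
The first argument of fit is the matrix of sample inputs, the second is the matrix of target outputs for those samples (the ground truth), and the third is the number of epochs. In the printed output, data is the sample input, ground-truth is the true value, and pred is the prediction.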
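The shapes involved are easy to lose track of, so the short sketch below prints them for the OR dataset. It repeats the initialisation expression from Perceptron.__init__ and the bias-column expression from fit. The names N, W, and X_bias are local to this illustration.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [1]])

N = X.shape[1]                              # 2 features per sample
W = np.random.randn(N + 1) / np.sqrt(N)     # same expression as Perceptron.__init__
X_bias = np.c_[X, np.ones((X.shape[0]))]    # what fit() does before training

print(X.shape, y.shape, W.shape, X_bias.shape)
```

The weight vector has N + 1 entries because the last one is the bias, which matches the extra column of ones appended to X.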
Running perceptron_or.py with Python gives the following output:
============= RESTART: E:\FENG\workspace_python\perceptron_or.py =============
[INFO]
[INFO]
[INFO] data=[0 0], ground-truth=0, pred=0
[INFO] data=[0 1], ground-truth=1, pred=1
[INFO] data=[1 0], ground-truth=1, pred=1
[INFO] data=[1 1], ground-truth=1, pred=1
3. perceptron_and.py
# import the necessary packages
from perceptron import Perceptron
import numpy as np
# construct the AND dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [0], [0], [1]])
# define our perceptron and train it
print("[INFO] ")
p = Perceptron(X.shape[1], alpha=0.1)
p.fit(X, y, epochs=20)
# now that our perceptron is trained we can evaluate it
print("[INFO] ")
# now that our network is trained, loop over the data points
for (x, target) in zip(X, y):
    # make a prediction on the data point and display the result
    # to our console
    pred = p.predict(x)
    print("[INFO] data={}, ground-truth={}, pred={}".format(
        x, target[0], pred))
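This file differs little from the previous one (only the target vector y changes, to the AND truth table), so it is not analysed separately.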
4. perceptron_xor.py
# import the necessary packages
from perceptron import Perceptron
import numpy as np
# construct the XOR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])
# define our perceptron and train it
print("[INFO] ")
p = Perceptron(X.shape[1], alpha=0.1)
p.fit(X, y, epochs=20)
# now that our perceptron is trained we can evaluate it
print("[INFO] ")
# now that our network is trained, loop over the data points
for (x, target) in zip(X, y):
    # make a prediction on the data point and display the result
    # to our console
    pred = p.predict(x)
    print("[INFO] data={}, ground-truth={}, pred={}".format(
        x, target[0], pred))
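The result is shown below. Because the weights are initialised randomly, the exact predictions can differ from run to run.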
============ RESTART: E:\FENG\workspace_python\perceptron_xor.py ============
[INFO]
[INFO]
[INFO] data=[0 0], ground-truth=0, pred=1
[INFO] data=[0 1], ground-truth=1, pred=1
[INFO] data=[1 0], ground-truth=1, pred=0
[INFO] data=[1 1], ground-truth=0, pred=0
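As can be seen, the XOR predictions are not accurate. The main reason is that a network with only 2 layers of neurons and no hidden layer can only draw a linear decision boundary, so it cannot classify samples that are not linearly separable.
The figure in the original post shows how the AND and OR samples are distributed in the input space. Both are simple cases because a single straight line separates the samples with output 0 from those with output 1, and in practice the 2-layer classifier does produce the expected results for them. The XOR samples are harder: no single straight line separates the two classes.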
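The separability argument can be probed numerically. The sketch below is an illustration, not part of the book's code: it tries every (w1, w2, b) on a small grid, chosen arbitrarily here, and reports the first combination that reproduces each truth table under the same step rule. The helper find_weights is hypothetical. A grid search cannot prove impossibility on its own; the proof is algebraic. XOR needs b <= 0, w1 + b > 0, and w2 + b > 0, which together force w1 + w2 + b > 0, contradicting the requirement w1 + w2 + b <= 0 for the input (1, 1).

```python
import itertools
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = {
    "OR": np.array([0, 1, 1, 1]),
    "AND": np.array([0, 0, 0, 1]),
    "XOR": np.array([0, 1, 1, 0]),
}

# Assumed search grid: w1, w2 and b each take the values -2.0, -1.5, ..., 2.0
grid = np.arange(-2.0, 2.01, 0.5)

def find_weights(y):
    """Return the first (w1, w2, b) on the grid that reproduces y, or None."""
    for w1, w2, b in itertools.product(grid, repeat=3):
        pred = (X @ np.array([w1, w2]) + b > 0).astype(int)
        if np.array_equal(pred, y):
            return (float(w1), float(w2), float(b))
    return None

solutions = {name: find_weights(y) for name, y in targets.items()}
for name, sol in solutions.items():
    print(name, sol)
```

OR and AND each yield a weight triple, while XOR yields None, in line with the algebraic argument.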
To classify the XOR samples correctly, the network structure has to change: add a hidden layer, then retrain and test again.
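To show why one hidden layer is enough in principle, the sketch below wires a 2-2-1 network by hand. All weights are hypothetical values chosen by inspection rather than learned: one hidden unit computes OR, the other NAND, and the output unit ANDs them together, which is exactly XOR. It is not the trained network the text proposes, only evidence that such a structure can represent the function.

```python
import numpy as np

def step(z):
    # Element-wise step: 1 where z > 0, else 0
    return (z > 0).astype(int)

# Hypothetical hand-set weights (chosen by inspection, not learned)
W_hidden = np.array([[1.0, 1.0],      # hidden unit h1 = OR(x1, x2)
                     [-1.0, -1.0]])   # hidden unit h2 = NAND(x1, x2)
b_hidden = np.array([-0.5, 1.5])
W_out = np.array([1.0, 1.0])          # output unit = AND(h1, h2)
b_out = -1.5

def xor_net(X):
    H = step(X @ W_hidden.T + b_hidden)   # hidden activations, shape (n, 2)
    return step(H @ W_out + b_out)        # network output, shape (n,)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
pred = xor_net(X)
print(pred)
```

Learning such weights automatically requires a different training procedure from the single-layer update used in perceptron.py, because the step function gives no gradient to pass back through the hidden layer.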