12. Implementing Backpropagation for the Affine and Softmax Layers

By 进击的码农设计师

May 6, 2019

1. Implementing the Affine layer:

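In a neural network's forward pass, the weighted sum of the input signals is computed with a matrix product (np.dot() in NumPy). In geometry, a matrix product followed by the addition of a constant vector is called an affine transformation, so the layer that performs this computation is called the Affine layer.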

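Suppose we have three arrays X, W, and B with shapes (2,), (2,3), and (3,), respectively. The weighted sum of the network can then be computed as Y=np.dot(X,W)+B, and Y has shape (3,).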
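As a quick sanity check of these shapes, here is a minimal sketch. The concrete numbers in X, W, and B are made up for illustration; only the shapes come from the text.

```python
import numpy as np

# Illustrative values (assumptions); only the shapes come from the text.
X = np.array([1.0, 2.0])                 # shape (2,)
W = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])          # shape (2, 3)
B = np.array([0.1, 0.2, 0.3])            # shape (3,)

# The inner dimensions (2 and 2) must match; the result has shape (3,).
Y = np.dot(X, W) + B

print(Y.shape)
print(Y)
```

If the first dimension of W did not match the length of X, np.dot would raise a ValueError, which is why the dimension bookkeeping matters in the backward pass as well.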

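In the computational graph of this forward pass, a dot node takes X and W, and a + node then adds B to produce Y.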

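Now consider the backward pass. Backpropagation through matrices works the same way as in the scalar case covered earlier and gives the following expressions: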

$$
\frac{\partial L}{\partial X}=\frac{\partial L}{\partial Y}\cdot W^T
$$

$$
\frac{\partial L}{\partial W}=X^T\cdot\frac{\partial L}{\partial Y}
$$

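Here W^T is the transpose of W (transposition moves the element at position (i,j) to position (j,i)).
In the backward computational graph, the upstream gradient ∂L/∂Y passes through the + node unchanged and is then multiplied by W^T to give ∂L/∂X and by X^T to give ∂L/∂W.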

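As this shows, the backward pass of a matrix product can be derived by arranging the product so that the element counts of the corresponding dimensions match. Each gradient must have the same shape as the variable it belongs to: ∂L/∂X has the shape of X, and ∂L/∂W has the shape of W.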
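The shape-matching rule can be checked directly. The sketch below assumes a batch of N inputs (N = 4 is an arbitrary choice) and random values; it only demonstrates that the two products are well defined and return gradients with the expected shapes.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 4                                  # hypothetical batch size
X = rng.standard_normal((N, 2))        # input,  shape (N, 2)
W = rng.standard_normal((2, 3))        # weight, shape (2, 3)
dY = rng.standard_normal((N, 3))       # upstream gradient, same shape as Y

dX = np.dot(dY, W.T)                   # (N, 3) x (3, 2) -> (N, 2)
dW = np.dot(X.T, dY)                   # (2, N) x (N, 3) -> (2, 3)

# Each gradient has the same shape as the variable it belongs to.
print(dX.shape == X.shape, dW.shape == W.shape)
```

Swapping the operands or dropping a transpose makes the inner dimensions disagree, so a shape check is a cheap way to catch a wrong formula.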

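import numpy as np


class Affine:
    def __init__(self, W, b):
        self.W = W          # weights, shape (in_features, out_features)
        self.b = b          # bias, shape (out_features,)
        self.x = None       # input cached by forward() for use in backward()
        self.dW = None      # gradient of the loss with respect to W
        self.db = None      # gradient of the loss with respect to b

    def forward(self, x):
        # x is a batch of inputs with shape (N, in_features)
        self.x = x
        out = np.dot(self.x, self.W) + self.b
        return out

    def backward(self, dout):
        # dout is dL/dY with shape (N, out_features)
        dx = np.dot(dout, self.W.T)
        self.dW = np.dot(self.x.T, dout)
        # the bias is broadcast over the batch, so its gradient sums over axis 0
        self.db = np.sum(dout, axis=0)
        return dx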
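One way to gain confidence in these formulas is to compare them with numerical gradients. The following is a minimal sketch, not part of the original article: the helper numerical_grad, the fixed upstream gradient G, and the loss sum(Y * G) are all assumptions chosen so that ∂L/∂Y equals G exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 2))    # batch of 4 inputs
W = rng.standard_normal((2, 3))
b = rng.standard_normal(3)
G = rng.standard_normal((4, 3))    # fixed upstream gradient dL/dY

def loss(W, b):
    # Scalar loss chosen so that dL/dY is exactly G.
    return np.sum((np.dot(X, W) + b) * G)

def numerical_grad(f, p, h=1e-5):
    # Central differences, perturbing one element of p at a time (in place).
    grad = np.zeros_like(p)
    for idx in np.ndindex(p.shape):
        old = p[idx]
        p[idx] = old + h
        f_plus = f()
        p[idx] = old - h
        f_minus = f()
        p[idx] = old               # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

# Analytic gradients, using the same formulas as Affine.backward
dW = np.dot(X.T, G)
db = np.sum(G, axis=0)

dW_num = numerical_grad(lambda: loss(W, b), W)
db_num = numerical_grad(lambda: loss(W, b), b)

print(np.allclose(dW, dW_num), np.allclose(db, db_num))
```

The check on db also illustrates why the bias gradient sums over axis 0: the same bias is added to every sample in the batch, so each sample contributes to its gradient.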

2. Implementing the Softmax layer:

Because the Softmax layer is usually combined with the cross-entropy error that serves as the loss function, it is called the Softmax-with-Loss layer.

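Consider the computational graph of the Softmax-with-Loss layer for a 3-class classification problem. The Softmax layer normalizes the input into the output, and the Cross Entropy Error layer compares that output with the teacher labels to produce the loss.

Here the input is (a1,a2,a3), the output is (y1,y2,y3), the teacher labels are (t1,t2,t3), and the loss is L.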

In the end, the backward pass through the Softmax layer yields the result (y1-t1,y2-t2,y3-t3).
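A short derivation sketch of this result, assuming the usual definitions of softmax and cross-entropy error and a one-hot teacher label (so its components sum to 1). Here $\delta_{ki}$ is the Kronecker delta, equal to 1 when k = i and 0 otherwise.

$$
y_k=\frac{e^{a_k}}{\sum_j e^{a_j}},\qquad L=-\sum_k t_k\log y_k,\qquad \frac{\partial \log y_k}{\partial a_i}=\delta_{ki}-y_i
$$

$$
\frac{\partial L}{\partial a_i}=-\sum_k t_k\left(\delta_{ki}-y_i\right)=y_i\sum_k t_k-t_i=y_i-t_i
$$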

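The goal of training a neural network is to adjust the weight parameters so that the network's output approaches the teacher labels. To do that, the error between the output and the teacher labels must be passed efficiently to the earlier layers. The backward result (y1-t1,y2-t2,y3-t3) is exactly that error: the difference between the network's current output and the teacher labels.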
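To make this concrete, the sketch below compares a poor prediction with a good one for the same teacher label. The two softmax outputs are hypothetical values chosen for illustration.

```python
import numpy as np

t = np.array([0.0, 1.0, 0.0])          # teacher label: class 1 is correct

y_bad = np.array([0.3, 0.2, 0.5])      # hypothetical softmax output, poor prediction
y_good = np.array([0.01, 0.99, 0.0])   # hypothetical softmax output, good prediction

# The gradient passed to the previous layer is simply y - t.
print(y_bad - t)    # large error, so the earlier layers receive a strong signal
print(y_good - t)   # small error, so the earlier layers receive a weak signal
```

The worse the prediction, the larger the values that flow backward, and the more the earlier layers are pushed to change.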

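class SoftmaxWithLoss:
    # Assumes softmax() and cross_entropy_error() are defined elsewhere and
    # that cross_entropy_error() returns the mean loss over the batch.
    def __init__(self):
        self.loss = None
        self.y = None  # output of softmax
        self.t = None  # teacher labels (one-hot vectors)

    def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        self.loss = cross_entropy_error(self.y, self.t)
        return self.loss

    def backward(self, dout=1):
        batch_size = self.t.shape[0]
        # Dividing by batch_size matches a loss averaged over the batch;
        # dout scales the result (it is 1 when this is the last layer).
        dx = dout * (self.y - self.t) / batch_size
        return dx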
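The class relies on softmax and cross_entropy_error, which this article does not define. The sketch below supplies assumed, typical versions of both (they may differ from the author's) and uses central differences to check that (y - t) / batch_size is the gradient of the mean loss. The batch size, labels, and random inputs are arbitrary.

```python
import numpy as np

def softmax(x):
    # Row-wise softmax; subtracting the row max avoids overflow in exp.
    x = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=-1, keepdims=True)

def cross_entropy_error(y, t):
    # Mean cross-entropy over the batch; t is one-hot, y has shape (N, C).
    # The small constant guards against log(0).
    batch_size = y.shape[0]
    return -np.sum(t * np.log(y + 1e-7)) / batch_size

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))          # 5 samples, 3 classes
t = np.eye(3)[[0, 2, 1, 1, 0]]           # one-hot labels, shape (5, 3)

# Analytic gradient, same formula as SoftmaxWithLoss.backward with dout=1
dx = (softmax(x) - t) / x.shape[0]

# Central-difference gradient of the mean loss with respect to x
h = 1e-5
dx_num = np.zeros_like(x)
for idx in np.ndindex(x.shape):
    old = x[idx]
    x[idx] = old + h
    f_plus = cross_entropy_error(softmax(x), t)
    x[idx] = old - h
    f_minus = cross_entropy_error(softmax(x), t)
    x[idx] = old                         # restore the original value
    dx_num[idx] = (f_plus - f_minus) / (2 * h)

print(np.max(np.abs(dx - dx_num)))       # should be very small
```

Without the division by batch_size the analytic gradient would be N times too large, because the loss is an average over the N samples.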

Reference:
"Deep Learning from Scratch"
