nn_testChess.m
Reading in the data (one-hot labels for the two-class case):
if string(13) == 100          % ASCII 100 is 'd': the label field starts with "draw"
    yapp = [yapp,[1,0]'];     % class 1: draw
else
    yapp = [yapp,[0,1]'];     % class 2: not a draw
end
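Only the label test is quoted above. The surrounding read loop looks roughly like the sketch below; the file name krkopt.data, the variable tline, and the character-to-number feature mapping are assumptions about the standard chess endgame data file, while the test on character 13 is the one quoted from the script.

% Minimal sketch of the read loop (file name, parsing, and variable names are assumptions)
fid = fopen('krkopt.data');
xapp = []; yapp = [];
tline = fgetl(fid);
while ischar(tline)
    % 6 features: file (a-h) and rank (1-8) of white king, white rook, black king
    xapp = [xapp, [tline(1)-'a'+1; tline(3)-'0'; tline(5)-'a'+1; ...
                   tline(7)-'0'; tline(9)-'a'+1; tline(11)-'0']];
    if tline(13) == 100       % ASCII 100 = 'd', the first letter of "draw"
        yapp = [yapp,[1,0]'];
    else
        yapp = [yapp,[0,1]'];
    end
    tline = fgetl(fid);
end
fclose(fid);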
Splitting the dataset:
ratioTraining = 0.15;   % 15% for training
ratioValidation = 0.05; % 5% for validation
ratioTesting = 0.8;     % 80% for testing
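A sketch of how such a 15% / 5% / 80% split can be made with a random permutation; nTotal, idx, and the transposes are assumptions, and the original script may differ in detail.

% Minimal sketch of the random split (assumed variable names)
nTotal = size(xapp, 2);                 % one sample per column of xapp
idx = randperm(nTotal);                 % shuffle the sample order before splitting
nTrain = floor(nTotal * ratioTraining);
nVal = floor(nTotal * ratioValidation);
% transpose so that rows correspond to samples, matching the normalization code below
xTraining   = xapp(:, idx(1:nTrain))';
xValidation = xapp(:, idx(nTrain+1:nTrain+nVal))';
xTesting    = xapp(:, idx(nTrain+nVal+1:end))';   % the remaining ~80%
% the one-hot labels in yapp are split with the same index sets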
Normalize the training, validation, and test sets separately:
xTraining = (xTraining - repmat(avgX,U,1))./repmat(sigma,U,1);     % U: number of rows (samples) of the matrix being normalized
xValidation = (xValidation - repmat(avgX,U,1))./repmat(sigma,U,1); % avgX, sigma: per-feature mean and standard deviation
xTesting = (xTesting - repmat(avgX,U,1))./repmat(sigma,U,1);
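avgX and sigma are not defined in the excerpt. A common choice, assumed here, is to compute the per-feature mean and standard deviation on the training set only and reuse them for the validation and test sets; U must then be the row count of whichever matrix is being normalized.

% Minimal sketch (assumption: statistics come from the training set only)
avgX = mean(xTraining, 1);      % 1-by-6 per-feature mean
sigma = std(xTraining, 0, 1);   % 1-by-6 per-feature standard deviation
U = size(xTraining, 1);
xTraining = (xTraining - repmat(avgX,U,1))./repmat(sigma,U,1);
U = size(xValidation, 1);
xValidation = (xValidation - repmat(avgX,U,1))./repmat(sigma,U,1);
U = size(xTesting, 1);
xTesting = (xTesting - repmat(avgX,U,1))./repmat(sigma,U,1);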
Building, training, and testing the whole network:
nn = nn_create([6,10,10,10,10,10,10,10,10,10,10,2],'active function','relu','learning rate',0.005, 'batch normalization',1,'optimization method','Adam', 'objective function', 'Cross Entropy');
% The vector gives the number of neurons in each layer: 6 input dimensions, 10 hidden layers of 10 neurons each, and 2 output dimensions. 'learning rate' is the learning rate α, 'objective function' is the training objective (cross entropy here), and 'active function' is the hidden-layer activation.
Other settings in the script:
option.batch_size = 100; % each mini-batch contains 100 training samples
maxIteration = 10000; % train for at most 10000 iterations
nn = nn_train(nn,option,xTraining,yTraining); % one round of mini-batch training
totalCost(iteration) = sum(nn.cost)/length(nn.cost); % record the average loss of this iteration
[wrongs,accuracy] = nn_test(nn,xValidation,yValidation); % measure the recognition accuracy on the validation set
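Putting these pieces together, the outer training loop has roughly the following shape; the early-stopping criterion and the loop structure are assumptions, while nn_train, nn_test, and the variables are those quoted above.

% Minimal sketch of the outer training loop (assumed structure)
option.batch_size = 100;
maxIteration = 10000;
totalCost = zeros(maxIteration, 1);
for iteration = 1:maxIteration
    nn = nn_train(nn, option, xTraining, yTraining);       % one pass of mini-batch training
    totalCost(iteration) = sum(nn.cost)/length(nn.cost);   % average loss of this pass
    [wrongs, accuracy] = nn_test(nn, xValidation, yValidation);
    if accuracy >= 0.98                                     % assumed early-stopping criterion
        break;
    end
end
[wrongs, accuracy] = nn_test(nn, xTesting, yTesting);       % final accuracy on the test set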
Looking inside the implementation:
nn_forward.m
Forward pass:
From layer k-1 to layer k: the layer-(k-1) output is multiplied by the weight matrix W, and the bias b is added.
y = nn.W{k-1} * nn.a{k-1} + repmat(nn.b{k-1},1,m);
% repmat(A,m,n) tiles copies of A into an m-by-n block arrangement
% the mini-batch of m samples is stored as m columns of one matrix and processed together;
% the bias is the same for every sample, so b is replicated m times to match
% this y is the z in the derivation given earlier
The layer-k output is obtained by applying the nonlinear activation:
switch nn.active_function   % choice of hidden-layer activation function
    case 'sigmoid'
        nn.a{k} = sigmoid(y);
    case 'tanh'
        nn.a{k} = tanh(y);
    case 'relu'
        nn.a{k} = max(y,0); % ReLU: element-wise max(0, y)
end
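The two steps of one forward layer (affine map, then activation) can be reproduced in isolation; the sizes and variable names below are hypothetical and only illustrate the shapes involved.

% Standalone check of one forward step (hypothetical sizes)
m = 4;                            % mini-batch size
W = randn(10, 6);                 % weights from a 6-unit layer to a 10-unit layer
b = randn(10, 1);                 % one bias per output neuron
a_prev = randn(6, m);             % layer k-1 outputs, one column per sample
z = W * a_prev + repmat(b, 1, m); % the "y" (i.e. z) computed in nn_forward.m
a = max(z, 0);                    % ReLU, as selected by 'active function','relu'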
Backpropagation (using the sigmoid, tanh, and softmax + cross-entropy derivatives covered earlier):
nn_backpropagation.m
switch nn.output_function
    case 'sigmoid'
        % squared-error loss through a sigmoid output: delta = -(y - a) .* a .* (1 - a)
        nn.theta{nn.depth} = -(batch_y-nn.a{nn.depth}) .* nn.a{nn.depth} .* (1 - nn.a{nn.depth});
    case 'tanh'
        % squared-error loss through a tanh output: delta = -(y - a) .* (1 - a.^2)
        nn.theta{nn.depth} = -(batch_y-nn.a{nn.depth}) .* (1 - nn.a{nn.depth}.^2);
    case 'softmax'
        % softmax output with cross entropy: the delta simplifies to a - y
        nn.theta{nn.depth} = nn.a{nn.depth} - batch_y;
end
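These are only the output-layer deltas; nn_backpropagation.m then propagates them backwards through the hidden layers. A simplified sketch of that recursion for ReLU hidden layers, ignoring batch normalization and using field names consistent with the excerpt, is:

% Sketch of the backward recursion (ReLU hidden layers, no batch normalization)
m = size(batch_y, 2);   % mini-batch size
for k = nn.depth-1 : -1 : 2
    % delta_k = (W_k' * delta_{k+1}) .* relu'(z_k), and relu'(z) is 1 where a > 0, else 0
    nn.theta{k} = (nn.W{k}' * nn.theta{k+1}) .* (nn.a{k} > 0);
end
for k = 1 : nn.depth-1
    % gradients averaged over the m samples of the mini-batch
    nn.W_grad{k} = nn.theta{k+1} * nn.a{k}' / m;
    nn.b_grad{k} = sum(nn.theta{k+1}, 2) / m;
end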
Updating the parameters of the multi-layer network:
nn_applygradient.m
if strcmp(nn.optimization_method, 'normal')                % plain gradient descent
    nn.W{k} = nn.W{k} - nn.learning_rate*nn.W_grad{k};     % W <- W - alpha * dE/dW
    nn.b{k} = nn.b{k} - nn.learning_rate*nn.b_grad{k};     % b <- b - alpha * dE/db
This is the same iteration formula as equation (4) summarized on page 2 of Section 3.7.
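The script above actually requests 'Adam' in nn_create, and that branch is not quoted here. For comparison, a generic Adam step for the weights of layer k looks like the sketch below; the moment-estimate fields mW and vW and the step counter t are hypothetical names, not necessarily those used by the library.

% Sketch of an Adam update for layer k (hypothetical field names)
beta1 = 0.9; beta2 = 0.999; epsilon = 1e-8;
nn.mW{k} = beta1 * nn.mW{k} + (1 - beta1) * nn.W_grad{k};      % 1st-moment estimate
nn.vW{k} = beta2 * nn.vW{k} + (1 - beta2) * nn.W_grad{k}.^2;   % 2nd-moment estimate
mHat = nn.mW{k} / (1 - beta1^t);                               % bias correction; t counts update steps so far
vHat = nn.vW{k} / (1 - beta2^t);
nn.W{k} = nn.W{k} - nn.learning_rate * mHat ./ (sqrt(vHat) + epsilon);
% nn.b{k} is updated in exactly the same way from nn.b_grad{k}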
Forgot it? Take a look 👇
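As a reminder, the update being referred to is the plain gradient-descent iteration W ← W − α·∂E/∂W and b ← b − α·∂E/∂b with learning rate α, which is exactly what the 'normal' branch above computes.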