nn_testChess.m
Reading in the data (one-hot labels for the two-class case):
if string(13) == 100          % ASCII 100 is 'd': the label field starts with "draw"
    yapp = [yapp,[1,0]'];     % class 1: draw
else
    yapp = [yapp,[0,1]'];     % class 2: not a draw
end
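Only the label test is quoted above. The surrounding read loop looks roughly like the sketch below; the file name krkopt.data, the variable tline, and the character-to-number feature mapping are assumptions about the standard chess endgame data file, while the test on character 13 is the one quoted from the script.

% Minimal sketch of the read loop (file name, parsing, and variable names are assumptions)
fid = fopen('krkopt.data');
xapp = []; yapp = [];
tline = fgetl(fid);
while ischar(tline)
    % 6 features: file (a-h) and rank (1-8) of white king, white rook, black king
    xapp = [xapp, [tline(1)-'a'+1; tline(3)-'0'; tline(5)-'a'+1; ...
                   tline(7)-'0'; tline(9)-'a'+1; tline(11)-'0']];
    if tline(13) == 100       % ASCII 100 = 'd', the first letter of "draw"
        yapp = [yapp,[1,0]'];
    else
        yapp = [yapp,[0,1]'];
    end
    tline = fgetl(fid);
end
fclose(fid);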
Splitting the dataset:
ratioTraining = 0.15;   % 15% for training
ratioValidation = 0.05; % 5% for validation
ratioTesting = 0.8;     % 80% for testing
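A sketch of how such a 15% / 5% / 80% split can be made with a random permutation; nTotal, idx, and the transposes are assumptions, and the original script may differ in detail.

% Minimal sketch of the random split (assumed variable names)
nTotal = size(xapp, 2);                 % one sample per column of xapp
idx = randperm(nTotal);                 % shuffle the sample order before splitting
nTrain = floor(nTotal * ratioTraining);
nVal = floor(nTotal * ratioValidation);
% transpose so that rows correspond to samples, matching the normalization code below
xTraining   = xapp(:, idx(1:nTrain))';
xValidation = xapp(:, idx(nTrain+1:nTrain+nVal))';
xTesting    = xapp(:, idx(nTrain+nVal+1:end))';   % the remaining ~80%
% the one-hot labels in yapp are split with the same index sets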
Normalize the training, validation, and test sets separately:
xTraining = (xTraining - repmat(avgX,U,1))./repmat(sigma,U,1);     % U: number of rows (samples) of the matrix being normalized
xValidation = (xValidation - repmat(avgX,U,1))./repmat(sigma,U,1); % avgX, sigma: per-feature mean and standard deviation
xTesting = (xTesting - repmat(avgX,U,1))./repmat(sigma,U,1);
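avgX and sigma are not defined in the excerpt. A common choice, assumed here, is to compute the per-feature mean and standard deviation on the training set only and reuse them for the validation and test sets; U must then be the row count of whichever matrix is being normalized.

% Minimal sketch (assumption: statistics come from the training set only)
avgX = mean(xTraining, 1);      % 1-by-6 per-feature mean
sigma = std(xTraining, 0, 1);   % 1-by-6 per-feature standard deviation
U = size(xTraining, 1);
xTraining = (xTraining - repmat(avgX,U,1))./repmat(sigma,U,1);
U = size(xValidation, 1);
xValidation = (xValidation - repmat(avgX,U,1))./repmat(sigma,U,1);
U = size(xTesting, 1);
xTesting = (xTesting - repmat(avgX,U,1))./repmat(sigma,U,1);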
Building, training, and testing the whole network:
nn = nn_create([6,10,10,10,10,10,10,10,10,10,10,2],'active function','relu','learning rate',0.005, 'batch normalization',1,'optimization method','Adam', 'objective function', 'Cross Entropy');
% The vector gives the number of neurons in each layer: 6 input dimensions, 10 hidden layers of 10 neurons each, and 2 output dimensions. 'learning rate' is the learning rate α, 'objective function' is the training objective (cross entropy here), and 'active function' is the hidden-layer activation.
Other settings in the script:
option.batch_size = 100; % each mini-batch contains 100 training samples
maxIteration = 10000; % train for at most 10000 iterations
nn = nn_train(nn,option,xTraining,yTraining); % one round of mini-batch training
totalCost(iteration) = sum(nn.cost)/length(nn.cost); % record the average loss of this iteration
[wrongs,accuracy] = nn_test(nn,xValidation,yValidation); % measure the recognition accuracy on the validation set
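Putting these pieces together, the outer training loop has roughly the following shape; the early-stopping criterion and the loop structure are assumptions, while nn_train, nn_test, and the variables are those quoted above.

% Minimal sketch of the outer training loop (assumed structure)
option.batch_size = 100;
maxIteration = 10000;
totalCost = zeros(maxIteration, 1);
for iteration = 1:maxIteration
    nn = nn_train(nn, option, xTraining, yTraining);       % one pass of mini-batch training
    totalCost(iteration) = sum(nn.cost)/length(nn.cost);   % average loss of this pass
    [wrongs, accuracy] = nn_test(nn, xValidation, yValidation);
    if accuracy >= 0.98                                     % assumed early-stopping criterion
        break;
    end
end
[wrongs, accuracy] = nn_test(nn, xTesting, yTesting);       % final accuracy on the test set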
Looking inside the implementation:
nn_forward.m
Forward pass:
From layer k-1 to layer k: the layer-(k-1) output is multiplied by the weight matrix W, and the bias b is added.
y = nn.W{k-1} * nn.a{k-1} + repmat(nn.b{k-1},1,m);
% repmat(A,m,n) tiles copies of A into an m-by-n block arrangement
% the mini-batch of m samples is stored as m columns of one matrix and processed together;
% the bias is the same for every sample, so b is replicated m times to match
% this y is the z in the derivation given earlier
The layer-k output is obtained by applying the nonlinear activation:
switch nn.active_function   % choice of hidden-layer activation function
    case 'sigmoid'
        nn.a{k} = sigmoid(y);
    case 'tanh'
        nn.a{k} = tanh(y);
    case 'relu'
        nn.a{k} = max(y,0); % ReLU: element-wise max(0, y)
end
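The two steps of one forward layer (affine map, then activation) can be reproduced in isolation; the sizes and variable names below are hypothetical and only illustrate the shapes involved.

% Standalone check of one forward step (hypothetical sizes)
m = 4;                            % mini-batch size
W = randn(10, 6);                 % weights from a 6-unit layer to a 10-unit layer
b = randn(10, 1);                 % one bias per output neuron
a_prev = randn(6, m);             % layer k-1 outputs, one column per sample
z = W * a_prev + repmat(b, 1, m); % the "y" (i.e. z) computed in nn_forward.m
a = max(z, 0);                    % ReLU, as selected by 'active function','relu'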
Backpropagation (using the sigmoid, tanh, and softmax + cross-entropy derivatives covered earlier):
nn_backpropagation.m
switch nn.output_function
    case 'sigmoid'
        % squared-error loss through a sigmoid output: delta = -(y - a) .* a .* (1 - a)
        nn.theta{nn.depth} = -(batch_y-nn.a{nn.depth}) .* nn.a{nn.depth} .* (1 - nn.a{nn.depth});
    case 'tanh'
        % squared-error loss through a tanh output: delta = -(y - a) .* (1 - a.^2)
        nn.theta{nn.depth} = -(batch_y-nn.a{nn.depth}) .* (1 - nn.a{nn.depth}.^2);
    case 'softmax'
        % softmax output with cross entropy: the delta simplifies to a - y
        nn.theta{nn.depth} = nn.a{nn.depth} - batch_y;
end
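These are only the output-layer deltas; nn_backpropagation.m then propagates them backwards through the hidden layers. A simplified sketch of that recursion for ReLU hidden layers, ignoring batch normalization and using field names consistent with the excerpt, is:

% Sketch of the backward recursion (ReLU hidden layers, no batch normalization)
m = size(batch_y, 2);   % mini-batch size
for k = nn.depth-1 : -1 : 2
    % delta_k = (W_k' * delta_{k+1}) .* relu'(z_k), and relu'(z) is 1 where a > 0, else 0
    nn.theta{k} = (nn.W{k}' * nn.theta{k+1}) .* (nn.a{k} > 0);
end
for k = 1 : nn.depth-1
    % gradients averaged over the m samples of the mini-batch
    nn.W_grad{k} = nn.theta{k+1} * nn.a{k}' / m;
    nn.b_grad{k} = sum(nn.theta{k+1}, 2) / m;
end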
Updating the parameters of the multi-layer network:
nn_applygradient.m
if strcmp(nn.optimization_method, 'normal')                % plain gradient descent
    nn.W{k} = nn.W{k} - nn.learning_rate*nn.W_grad{k};     % W <- W - alpha * dE/dW
    nn.b{k} = nn.b{k} - nn.learning_rate*nn.b_grad{k};     % b <- b - alpha * dE/db
This is the same iteration formula as equation (4) summarized on page 2 of Section 3.7.
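The script above actually requests 'Adam' in nn_create, and that branch is not quoted here. For comparison, a generic Adam step for the weights of layer k looks like the sketch below; the moment-estimate fields mW and vW and the step counter t are hypothetical names, not necessarily those used by the library.

% Sketch of an Adam update for layer k (hypothetical field names)
beta1 = 0.9; beta2 = 0.999; epsilon = 1e-8;
nn.mW{k} = beta1 * nn.mW{k} + (1 - beta1) * nn.W_grad{k};      % 1st-moment estimate
nn.vW{k} = beta2 * nn.vW{k} + (1 - beta2) * nn.W_grad{k}.^2;   % 2nd-moment estimate
mHat = nn.mW{k} / (1 - beta1^t);                               % bias correction; t counts update steps so far
vHat = nn.vW{k} / (1 - beta2^t);
nn.W{k} = nn.W{k} - nn.learning_rate * mHat ./ (sqrt(vHat) + epsilon);
% nn.b{k} is updated in exactly the same way from nn.b_grad{k}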
Forgot it? Take a look 👇
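As a reminder, the update being referred to is the plain gradient-descent iteration W ← W − α·∂E/∂W and b ← b − α·∂E/∂b with learning rate α, which is exactly what the 'normal' branch above computes.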