[Posted]: 2015-08-25 17:32:59
[Question]:
I am trying to get my first neural network working, but no matter what I do it never seems to arrive at the correct answers.
This is the output after the network reached an MSE of 0.0001:
0 XOR 0 = 0.0118003716248665
1 XOR 1 = 0.994320073237859
1 XOR 0 = 0.818618888320916
0 XOR 1 = 0.985995457430471
Problem: these answers are not correct.
I created a network with 2 inputs, 2 hidden neurons and 1 output. The XOR problem has been solved with exactly those numbers before, so that rules the topology out (I guess).
As a side note, I converted this code from a C# example I found on another site, and the C# code runs perfectly, so this is most likely a logic or calculation error somewhere :/
Unfortunately I cannot pin down which part of the code causes the error, so I will have to post the entire network code here (sorry).
Edit: the UpdateWeights() function is the backpropagation; I just thought I would point that out in case anyone missed it. The rest of the names should be self-explanatory.
unit NeuralNetwork_u;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls, ComCtrls, Math;

type
  TDoubleArray = array of Double;
  TDouble2DArray = array of TDoubleArray;

  TNeuralNetwork = class(TObject)
  private
    numInput, numHidden, numOutput : Integer;
    inputs, hBiases, hSums, hOutputs, oBiases, oSums, Outputs, oGrads, hGrads,
      hPrevBiasesDelta, oPrevBiasesDelta : TDoubleArray;
    ihWeights, hoWeights, ihPrevWeightsDelta, hoPrevWeightsDelta : TDouble2DArray;
  public
    constructor Create(NumInputs, NumHiddens, NumOutputs : Integer);
    procedure SetWeights(weights : TDoubleArray);
    function GetWeights : TDoubleArray;
    function GetOutputs : TDoubleArray;
    function ComputeOutputs(xValues : TDoubleArray) : TDoubleArray;
    function SigmoidFunction(X : Double) : Double;
    function HyperTanFunction(X : Double) : Double;
    procedure UpdateWeights(tValues : TDoubleArray; learn, mom : Double);
    function Train(TrainData : TDouble2DArray; MaxEpochs : Integer; LearningRate, Momentum, DesiredError : Double) : Double;
    function WeightCount : Integer;
    procedure Shuffle(var Seq : array of Integer);
    function MeanSquaredError(TrainData : TDouble2DArray) : Double;
  end;

  THelper = class(TObject)
  public
    function MakeMatrix(Rows, Cols : Integer) : TDouble2DArray;
    function Error(tValues, yValues : array of Double) : Double;
  end;

implementation

uses NetworkInterface_u;

constructor TNeuralNetwork.Create(NumInputs, NumHiddens, NumOutputs : Integer);
var
  Helper : THelper;
begin
  Helper := THelper.Create;
  numInput := NumInputs;
  numHidden := NumHiddens;
  numOutput := NumOutputs;
  SetLength(inputs, numInput);
  ihWeights := Helper.MakeMatrix(numInput, numHidden);
  SetLength(hBiases, numHidden);
  SetLength(hSums, numHidden);
  SetLength(hOutputs, numHidden);
  hoWeights := Helper.MakeMatrix(numHidden, numOutput);
  SetLength(oBiases, numOutput);
  SetLength(oSums, numOutput);
  SetLength(Outputs, numOutput);
  SetLength(oGrads, numOutput);
  SetLength(hGrads, numHidden);
  ihPrevWeightsDelta := Helper.MakeMatrix(numInput, numHidden);
  SetLength(hPrevBiasesDelta, numHidden);
  hoPrevWeightsDelta := Helper.MakeMatrix(numHidden, numOutput);
  SetLength(oPrevBiasesDelta, numOutput);
end;
procedure TNeuralNetwork.SetWeights(weights : TDoubleArray);
var
  numWeights : Integer;
  i, j, k : Integer;
begin
  numWeights := (numInput * numHidden) + (numHidden * numOutput) + numHidden + numOutput;
  if High(weights) <> numWeights then
    raise Exception.Create('The Weights Array Length Does Not Match The Total Number Of Weights And Biases - ' + IntToStr(numWeights));
  k := 0;
  for i := 0 to numInput - 1 do
    for j := 0 to numHidden - 1 do
    begin
      ihWeights[i][j] := weights[k];
      Inc(k);
    end;
  for i := 0 to numHidden - 1 do
  begin
    hBiases[i] := weights[k];
    Inc(k);
  end;
  for i := 0 to numHidden - 1 do
    for j := 0 to numOutput - 1 do
    begin
      hoWeights[i][j] := weights[k];
      Inc(k);
    end;
  for i := 0 to numOutput - 1 do
  begin
    oBiases[i] := weights[k];
    Inc(k);
  end;
end;
function TNeuralNetwork.GetWeights : TDoubleArray;
var
  numWeights : Integer;
  i, j, k : Integer;
begin
  numWeights := (numInput * numHidden) + (numHidden * numOutput) + numHidden + numOutput;
  SetLength(Result, numWeights);
  k := 0;
  for i := 0 to Length(ihWeights) - 1 do
    for j := 0 to Length(ihWeights[0]) - 1 do
    begin
      Result[k] := ihWeights[i][j];
      Inc(k);
    end;
  for i := 0 to Length(hBiases) - 1 do
  begin
    Result[k] := hBiases[i];
    Inc(k);
  end;
  for i := 0 to Length(hoWeights) - 1 do
    for j := 0 to Length(hoWeights[0]) - 1 do
    begin
      Result[k] := hoWeights[i][j];
      Inc(k);
    end;
  for i := 0 to Length(oBiases) - 1 do
  begin
    Result[k] := oBiases[i];
    Inc(k);
  end;
end;

function TNeuralNetwork.GetOutputs : TDoubleArray;
begin
  SetLength(Result, numOutput - 1);
  Result := Outputs;
end;
function TNeuralNetwork.ComputeOutputs(xValues : TDoubleArray) : TDoubleArray;
var
  i, j : Integer;
begin
  if Length(xValues) <> numInput then
    raise Exception.Create('Inputs Array Does Not Match Neural Network Inputs Count = Array ' + IntToStr(Length(xValues)) + ' Input Count ' + IntToStr(numInput));
  for i := 0 to numHidden - 1 do
    hSums[i] := 0.0;
  for i := 0 to numOutput - 1 do
    oSums[i] := 0.0;
  for i := 0 to Length(xValues) - 1 do
    inputs[i] := xValues[i];
  for j := 0 to numHidden - 1 do
    for i := 0 to numInput - 1 do
      hSums[j] := hSums[j] + (inputs[i] * ihWeights[i][j]);
  for i := 0 to numHidden - 1 do
    hSums[i] := hSums[i] + hBiases[i];
  for i := 0 to numHidden - 1 do
    hOutputs[i] := HyperTanFunction(hSums[i]);
  for j := 0 to numOutput - 1 do
    for i := 0 to numHidden - 1 do
      oSums[j] := oSums[j] + (hOutputs[i] * hoWeights[i][j]);
  for i := 0 to numOutput - 1 do
    oSums[i] := oSums[i] + oBiases[i];
  for i := 0 to numOutput - 1 do
    Outputs[i] := HyperTanFunction(oSums[i]);
  Result := Outputs;
end;
function TNeuralNetwork.SigmoidFunction(X : Double) : Double;
begin
  if X < -45.0 then
    Result := 0
  else if X > 45.0 then
    Result := 1
  else
    Result := 1.0 / (1.0 + Exp(-X));
end;

function TNeuralNetwork.HyperTanFunction(X : Double) : Double;
begin
  if X < -45.0 then
    Result := -1
  else if X > 45.0 then
    Result := 1
  else
    Result := Tanh(X);
end;
procedure TNeuralNetwork.UpdateWeights(tValues : TDoubleArray; learn, mom : Double);
var
  i, j : Integer;
  derivative, sum, delta, X : Double;
begin
  if Length(tValues) <> numOutput then
    raise Exception.Create('Target Values Not Same Length As Output = ' + IntToStr(Length(tValues)) + ' - Outputcount = ' + IntToStr(numOutput));
  for i := 0 to Length(oGrads) - 1 do
  begin
    derivative := (1 - Outputs[i]) * Outputs[i];
    oGrads[i] := derivative * (tValues[i] - Outputs[i]);
  end;
  for i := 0 to Length(hGrads) - 1 do
  begin
    derivative := (1 - hOutputs[i]) * (1 + hOutputs[i]);
    sum := 0;
    for j := 0 to numOutput - 1 do
    begin
      X := oGrads[j] * hoWeights[i][j];
      sum := sum + X;
    end;
    hGrads[i] := derivative * sum;
  end;
  for i := 0 to Length(ihWeights) - 1 do
    for j := 0 to Length(ihWeights[0]) - 1 do
    begin
      delta := learn * hGrads[j] * inputs[i];
      ihWeights[i][j] := ihWeights[i][j] + delta;
      ihWeights[i][j] := ihWeights[i][j] + (mom * ihPrevWeightsDelta[i][j]);
      ihPrevWeightsDelta[i][j] := delta;
    end;
  for i := 0 to Length(hBiases) - 1 do
  begin
    delta := learn * hGrads[i] * 1.0;
    hBiases[i] := hBiases[i] + delta;
    hBiases[i] := hBiases[i] + (mom * hPrevBiasesDelta[i]);
    hPrevBiasesDelta[i] := delta;
  end;
  for i := 0 to Length(hoWeights) - 1 do
    for j := 0 to Length(hoWeights[0]) - 1 do
    begin
      delta := learn * oGrads[j] * hOutputs[i];
      hoWeights[i][j] := hoWeights[i][j] + delta;
      hoWeights[i][j] := hoWeights[i][j] + (mom * hoPrevWeightsDelta[i][j]);
      hoPrevWeightsDelta[i][j] := delta;
    end;
  for i := 0 to Length(oBiases) - 1 do
  begin
    delta := learn * oGrads[i] * 1.0;
    oBiases[i] := oBiases[i] + delta;
    oBiases[i] := oBiases[i] + (mom * oPrevBiasesDelta[i]);
    oPrevBiasesDelta[i] := delta;
  end;
end;
function TNeuralNetwork.Train(TrainData : TDouble2DArray; MaxEpochs : Integer; LearningRate, Momentum, DesiredError : Double) : Double;
var
  Epoch, I, Idx : Integer;
  xValues, tValues : TDoubleArray;
  Sequence : array of Integer;
  MeanSquaredErrorr : Double;
begin
  Epoch := 0;
  SetLength(xValues, numInput);
  SetLength(tValues, numOutput + 1);
  SetLength(Sequence, Length(TrainData));
  for I := 0 to Length(Sequence) - 1 do
    Sequence[I] := I;
  Shuffle(Sequence);
  while Epoch < MaxEpochs do
  begin
    frmNetworkInterface.redTraining.Lines.Add('Current Epoch - ' + IntToStr(Epoch) + ' : error = ' + FloatToStr(MeanSquaredErrorr) + ' and Desired Error is = ' + FloatToStr(DesiredError));
    Application.ProcessMessages;
    MeanSquaredErrorr := MeanSquaredError(TrainData);
    if MeanSquaredErrorr < DesiredError then
      Break;
    for I := 0 to Length(TrainData) - 1 do
    begin
      Idx := Sequence[I];
      xValues := Copy(TrainData[Idx], 0, numInput);
      tValues := Copy(TrainData[Idx], numInput, numOutput);
      ComputeOutputs(xValues);
      UpdateWeights(tValues, LearningRate, Momentum);
    end;
    Inc(Epoch);
    Result := MeanSquaredErrorr;
  end;
end;

procedure TNeuralNetwork.Shuffle(var Seq : array of Integer);
var
  I, R, Tmp : Integer;
begin
  for I := 0 to Length(Seq) - 1 do
  begin
    R := RandomRange(I, Length(Seq));
    Tmp := Seq[I];
    Seq[R] := Seq[I];
    Seq[I] := Tmp;
  end;
end;
function TNeuralNetwork.MeanSquaredError(TrainData : TDouble2DArray) : Double;
var
  sumSquaredError, err : Double;
  xValues, tValues, yValues : TDoubleArray;
  I, J : Integer;
begin
  sumSquaredError := 0;
  SetLength(xValues, numInput);
  SetLength(tValues, numOutput);
  for I := 0 to Length(TrainData) - 1 do
  begin
    xValues := Copy(TrainData[I], 0, numInput);
    tValues := Copy(TrainData[I], numInput, numOutput);
    yValues := ComputeOutputs(xValues);
    for J := 0 to numOutput - 1 do
    begin
      err := tValues[J] - yValues[J];
      sumSquaredError := sumSquaredError + (err * err);
    end;
  end;
  Result := sumSquaredError / Length(TrainData);
end;

function TNeuralNetwork.WeightCount : Integer;
begin
  Result := (numInput * numHidden) + (numHidden * numOutput) + numHidden + numOutput;
end;

function THelper.MakeMatrix(Rows, Cols : Integer) : TDouble2DArray;
begin
  SetLength(Result, Rows, Cols);
end;
function THelper.Error(tValues : array of Double; yValues : array of Double) : Double;
var
  sum : Double;
  i : Integer;
begin
  sum := 0.0;
  for i := 0 to High(tValues) - 1 do
    sum := sum + ((tValues[i] - yValues[i]) * (tValues[i] - yValues[i]));
  Result := Sqrt(sum);
end;

end.
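To cross-check the forward pass against the C# original, it can help to have a tiny reference implementation in a third language. Below is a minimal sketch in Python (all names are mine, not from the post); it mirrors ComputeOutputs above, including tanh at the output layer as in the posted code:

```python
import math

def compute_outputs(x, ih_w, h_b, ho_w, o_b):
    """Mirror of ComputeOutputs: hidden = tanh(x . ihW + hB), out = tanh(hidden . hoW + oB)."""
    hidden = [math.tanh(sum(x[i] * ih_w[i][j] for i in range(len(x))) + h_b[j])
              for j in range(len(h_b))]
    return [math.tanh(sum(hidden[i] * ho_w[i][j] for i in range(len(hidden))) + o_b[j])
            for j in range(len(o_b))]

# 2-2-1 example with hand-picked illustrative weights (not taken from the post)
ih_w = [[0.1, 0.2], [0.3, 0.4]]
h_b  = [0.0, 0.0]
ho_w = [[0.5], [0.6]]
o_b  = [0.0]
print(compute_outputs([1.0, 0.0], ih_w, h_b, ho_w, o_b))
```

Feeding the same fixed weights to the Delphi code, the C# code, and a sketch like this should produce identical outputs; any divergence localizes the bug to the forward pass rather than the training.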
I have been through this code close to a hundred times now with no answer, and found no logic or calculation errors; as far as I can tell the C# example works, so this should too.
Edit: An observation: it looks to me as if, whenever the second input I pass in is 1, the network automatically pushes the output too high (the weights attached to the second input seem too large to me?), so 1 XOR 1 comes out wrong because its second value is 1 (see the data above).
Edit: Here are the initial weights of a network I just ran (2 inputs, 2 hidden, 1 output):
Initial Weight0 - 0.0372207039175555
Initial Weight1 - 0.01092082898831
Initial Weight2 - 0.0755334409791976
Initial Weight3 - 0.0866588755254634
Initial Weight4 - 0.0626101282471791
Initial Weight5 - 0.0365478269639425
Initial Weight6 - 0.0724486718699336
Initial Weight7 - 0.0320405319170095
Initial Weight8 - 0.0680674042692408
And after 132 epochs (error 0.001):
Final Weight 0 = 0.432341693850932
Final Weight 1 = 0.338041456780997
Final Weight 2 = 1.0096817584107
Final Weight 3 = 0.839104863469981
Final Weight 4 = -0.275763414588823
Final Weight 5 = -0.171414938983027
Final Weight 6 = 1.26394969109634
Final Weight 7 = 0.998915778388676
Final Weight 8 = 0.549501870374428
Edit: A new development has surfaced. A bug in the way I passed the TrainingData made the network learn 1 XOR 1 = 1; after fixing that bug, however, the network fails to converge on an answer. Running 100 networks for 10,000 epochs each, the lowest MSE (mean squared error) I got was:
Current Epoch - 9999 : error = 0.487600332892658 and Desired Error is = 0.001
I logged the inputs and targets sent to the network at every training step and confirmed that they are all correct now, so it now seems the network simply cannot solve the problem?
I am also updating the code above to my latest version. (08/26/2015)
New in this code:
Fixed the Copy index of 1 instead of 0.
Confirmed that the inputs and desired outputs are now copied correctly.
Edit: The network's MSE is now actually increasing. The initial error was
0.467486419821747,
and after 10,000 epochs it was
0.487600332892658,
so the overall error increased by
0.020113913070917
...which leads me to believe something is wrong in my training procedure or in the UpdateWeights procedure...
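For comparison, the usual shape of an online-training epoch (reshuffle the presentation order every epoch, then run forward and backward once per sample) can be sketched as follows. This is a generic outline in Python with names of my own choosing, not the posted Delphi code:

```python
import random

def train(samples, forward, backward, max_epochs, mse, desired_error):
    """Generic online-training loop: reshuffle each epoch, update after every sample."""
    order = list(range(len(samples)))
    error = mse()
    for epoch in range(max_epochs):
        if error < desired_error:
            break
        random.shuffle(order)  # reshuffled every epoch, not just once up front
        for idx in order:
            x, t = samples[idx]
            forward(x)
            backward(t)
        error = mse()
    return error

# toy usage: fit y = w*x with an LMS update, just to check the loop drives error down
state = {"w": 0.0, "x": 0.0}
data = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]
def fwd(x): state["x"] = x
def bwd(t): state["w"] += 0.1 * (t - state["w"] * state["x"]) * state["x"]
def mse(): return sum((t - state["w"] * x) ** 2 for x, t in data) / len(data)
final = train(data, fwd, bwd, 200, mse, 1e-9)
print(final)
```

Plugging a known-good toy model into the loop like this separates "the loop is wrong" from "the weight updates are wrong".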
Edit: Another observation I made is that the network's mean squared error caps out at 2.5 (when running a training session long enough for it to move that far). The damn MSE is going up instead of down?
Edit: Another observation from the network's output during training:
Current Epoch - 233 : error = 0.802251346201161 and Desired Error is = 0.0001
Current Epoch - 234 : error = 1.24798705066641 and Desired Error is = 0.0001
Current Epoch - 235 : error = 2.47206076545025 and Desired Error is = 0.0001
Current Epoch - 236 : error = 2.49999999811955 and Desired Error is = 0.0001
That sharp jump from 1.24 to 2.49 shows the network clearly has a bug in one of the functions involved in training or in changing the weights.
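One standard way to localize a bug in the weight-update math is numerical gradient checking: compare the analytic gradient the update uses against a finite-difference estimate of the same derivative. Here is a sketch in Python for a single neuron (entirely my own example, not the posted code). Note that the analytic derivative has to match the activation actually used: y*(1-y) belongs to the sigmoid, while (1-y)*(1+y) belongs to tanh.

```python
import math

def neuron(w, x, act):
    return act(sum(wi * xi for wi, xi in zip(w, x)))

def loss(w, x, t, act):
    y = neuron(w, x, act)
    return 0.5 * (t - y) ** 2

def numeric_grad(w, x, t, act, k, eps=1e-6):
    # central finite difference of the loss w.r.t. weight k
    wp, wm = list(w), list(w)
    wp[k] += eps
    wm[k] -= eps
    return (loss(wp, x, t, act) - loss(wm, x, t, act)) / (2 * eps)

w, x, t = [0.3, -0.2], [1.0, 0.5], 1.0
y = neuron(w, x, math.tanh)
# analytic gradient of the loss w.r.t. w[k], using the tanh derivative (1-y)(1+y)
analytic = [-(t - y) * (1 - y) * (1 + y) * x[k] for k in range(2)]
numeric = [numeric_grad(w, x, t, math.tanh, k) for k in range(2)]
print(analytic, numeric)  # these should agree closely
# substituting the *sigmoid* derivative y*(1-y) instead makes the check fail
wrong = [-(t - y) * y * (1 - y) * x[k] for k in range(2)]
```

Running the same check against each gradient computed in UpdateWeights would immediately show which layer's derivative disagrees with the activation used in ComputeOutputs.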
[Discussion]:
-
Are you sure your Delphi code faithfully reproduces the calculations in the C# code? Have you verified that a single neuron produces the same output for the same inputs in both C# and Delphi? If so, then unless someone can spot an obvious error in your Delphi code, you may need to add some logging to both versions and trace them side by side to see where they diverge.
-
@MartynA The problem is that the outputs are not the same even though they are programmed identically (I checked far more than three times to make sure). I can link to the C# code, or post it here as well, if that helps. As far as I can tell (based on experiments), the networks in both C# and Delphi train successfully and the MSE keeps dropping, but my outputs differ from the C# outputs.
-
Let me check that and I will get back to you; I never considered that possibility.
-
Just confirmed that arrays work exactly the same way in both languages: [row][column].
-
> I created a network with 2 inputs, 2 hidden neurons and 1 output; the XOR problem has been solved with those numbers before, so that rules it out (I guess). — Regarding this: no, it does not rule it out. Just because you can find the correct weights by hand does not mean backpropagation will find them too; that can depend on your meta-parameter settings. I suggest you try 3 hidden neurons to rule this out (it is easier for backpropagation). Then you will know whether or not there is a problem in your code.
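Following up on the suggestion to try 3 hidden neurons: a minimal 2-3-1 XOR network is easy to prototype in a scratch language first. The sketch below is my own Python code, not derived from the post; it pairs each activation with its own derivative (tanh in the hidden layer, sigmoid at the output with targets in {0, 1}):

```python
import math, random

random.seed(7)
N_IN, N_HID = 2, 3
# small random init for all weights and biases
ih = [[random.uniform(-0.5, 0.5) for _ in range(N_HID)] for _ in range(N_IN)]
hb = [random.uniform(-0.5, 0.5) for _ in range(N_HID)]
ho = [random.uniform(-0.5, 0.5) for _ in range(N_HID)]
ob = random.uniform(-0.5, 0.5)
DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

def forward(x):
    h = [math.tanh(sum(x[i] * ih[i][j] for i in range(N_IN)) + hb[j]) for j in range(N_HID)]
    y = 1.0 / (1.0 + math.exp(-(sum(h[j] * ho[j] for j in range(N_HID)) + ob)))
    return h, y

def epoch(lr=0.5):
    global ob
    for x, t in DATA:
        h, y = forward(x)
        og = (t - y) * y * (1 - y)                  # sigmoid derivative at the output
        hg = [(1 - h[j]) * (1 + h[j]) * og * ho[j]  # tanh derivative in the hidden layer
              for j in range(N_HID)]
        for j in range(N_HID):
            ho[j] += lr * og * h[j]
            for i in range(N_IN):
                ih[i][j] += lr * hg[j] * x[i]
            hb[j] += lr * hg[j]
        ob += lr * og

def mse():
    return sum((t - forward(x)[1]) ** 2 for x, t in DATA) / len(DATA)

start = mse()
for _ in range(5000):
    epoch()
print(start, "->", mse())
```

If a scratch version like this converges while the ported code does not, diffing the two gradient formulas (and the activation each one assumes) is usually where the discrepancy shows up.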
Tags: delphi neural-network