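【Posted】: 2020-06-12 14:05:40
【Question】:
There is a data sample in my training dataset that I can view if I print the data, but when I access it to train the model I keep getting RuntimeError: Expected object of scalar type Double but got scalar type Float for argument #2 'weight' in call to _thnn_conv2d_forward. I cannot figure out why this happens. I have also attached a screenshot at the end to make the error message clearer.
The labels.txt file looks like this (the image name, which links to the corresponding image in another folder, followed by the center point (x, y) and the radius):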
0000, 0.67 , 0.69 , 0.26
0001, 0.69 , 0.33 , 0.3
0002, 0.16 , 0.27 , 0.15
0003, 0.54 , 0.33 , 0.17
0004, 0.32 , 0.45 , 0.3
0005, 0.78 , 0.26 , 0.17
0006, 0.44 , 0.49 , 0.19
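A minimal sketch of how rows in this format could be parsed, assuming every row is `name, x, y, radius` with optional spaces around the commas. The helper name `parse_labels` is hypothetical and is not part of the provided `ShapesDataset`, whose code is not shown in the post.

```python
import io


def parse_labels(lines):
    """Parse rows of 'name, x, y, radius' into a dict keyed by image name."""
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, x, y, r = (field.strip() for field in line.split(","))
        labels[name] = (float(x), float(y), float(r))
    return labels


# Two rows copied from the sample above, read from an in-memory file.
sample = io.StringIO("0000, 0.67 , 0.69 , 0.26\n0001, 0.69 , 0.33 , 0.3\n")
parsed = parse_labels(sample)
print(parsed["0000"])
```

If the dataset class builds its tensors from NumPy arrays, note that NumPy's default float type is float64 and `torch.from_numpy` preserves it, which would be one plausible source of Double tensors.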
EDIT: This is the loss function and optimizer I am using:
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
My model validation function is as follows:
def validate_model(model, loader):
    model.eval()  # eval mode (batchnorm uses moving mean/variance instead of mini-batch mean/variance)
                  # (dropout is set to zero)
    val_running_loss = 0.0
    val_running_correct = 0
    for i, data in enumerate(loader):
        data, target = data['image'].to(device), data['labels'].to(device)
        output = model(data)
        loss = my_loss(output, target)
        val_running_loss = val_running_loss + loss.item()
        _, preds = torch.max(output.data, 1)
        val_running_correct = val_running_correct + (preds == target).sum().item()
    avg_loss = val_running_loss/len(loader.dataset)
    val_accuracy = 100. * val_running_correct/len(loader.dataset)
    #----------------------------------------------
    # implementation needed here
    #----------------------------------------------
    return avg_loss, val_accuracy
I have a fit function that computes the training loss:
def fit(model, train_dataloader):
    model.train()
    train_running_loss = 0.0
    train_running_correct = 0
    for i, data in enumerate(train_dataloader):
        print(data)
        # I believe this is causing the error, but not sure why.
        data, target = data['image'].to(device), data['labels'].to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = my_loss(output, target)
        train_running_loss = train_running_loss + loss.item()
        _, preds = torch.max(output.data, 1)
        train_running_correct = train_running_correct + (preds == target).sum().item()
        loss.backward()
        optimizer.step()
    train_loss = train_running_loss/len(train_dataloader.dataset)
    train_accuracy = 100. * train_running_correct/len(train_dataloader.dataset)
    print(f'Train Loss: {train_loss:.4f}, Train Acc: {train_accuracy:.2f}')
    return train_loss, train_accuracy
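A side note on the loss bookkeeping, separate from the dtype error: with a mean-reduced criterion (the default for `nn.CrossEntropyLoss`), `loss.item()` is already a per-batch average, so summing it and dividing by the dataset size shrinks the reported value by roughly the batch size. The sketch below uses plain Python numbers as stand-ins for batch losses, and the helper name `average_loss` is hypothetical.

```python
def average_loss(batch_losses, batch_sizes):
    """Per-sample average computed from per-batch mean losses."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


batch_losses = [0.5, 0.25, 1.0]   # stand-ins for loss.item() on each batch
batch_sizes = [32, 32, 16]        # the last batch is smaller than the others

weighted = average_loss(batch_losses, batch_sizes)
naive = sum(batch_losses) / sum(batch_sizes)  # what the loops above compute
print(weighted, naive)
```

In the training loop this would amount to accumulating `loss.item() * data.size(0)` instead of `loss.item()`.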
And the following train_model function, which stores the losses and accuracies in lists:
train_losses, train_accuracy = [], []
validation_losses, val_accuracy = [], []
def train_model(model,
                optimizer,
                train_loader,
                validation_loader,
                train_losses,
                validation_losses,
                epochs=1):
    """
    Trains a neural network.
    Args:
        model - model to be trained
        optimizer - optimizer used for training
        train_loader - loader from which data for training comes
        validation_loader - loader from which data for validation comes (maybe at the end, you use test_loader)
        train_losses - adding train loss value to this list for future analysis
        validation_losses - adding validation loss value to this list for future analysis
        epochs - number of runs over the entire data set
    """
    #----------------------------------------------
    # implementation needed here
    #----------------------------------------------
    for epoch in range(epochs):
        train_epoch_loss, train_epoch_accuracy = fit(model, train_loader)
        val_epoch_loss, val_epoch_accuracy = validate_model(model, validation_loader)
        train_losses.append(train_epoch_loss)
        train_accuracy.append(train_epoch_accuracy)
        validation_losses.append(val_epoch_loss)
        val_accuracy.append(val_epoch_accuracy)
    return
When I run the following code, I get the runtime error:
train_model(model,
            optimizer,
            train_loader,
            validation_loader,
            train_losses,
            validation_losses,
            epochs=2)
Error: RuntimeError: Expected object of scalar type Double but got scalar type Float for argument #2 'weight' in call to _thnn_conv2d_forward
Here is a screenshot of the error message as well: ERROR
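The message says the convolution received a Double (float64) input while its weights are Float (float32). A minimal sketch that reproduces this kind of mismatch and shows one way to align the types; the random tensor is a stand-in for a batch from the loader, and the exact wording of the exception differs between PyTorch versions.

```python
import torch
import torch.nn as nn

# Layer parameters are float32 by default.
conv = nn.Conv2d(in_channels=3, out_channels=12, kernel_size=3, padding=1)

# Stand-in for a batch that was loaded as double precision.
images = torch.rand(2, 3, 128, 128, dtype=torch.float64)

try:
    conv(images)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("dtype mismatch:", err)

# Casting the input to float32 makes it match the weights.
out = conv(images.float())
print(out.dtype, out.shape)
```

Calling `model.double()` would also align the types by converting the weights instead, at the cost of more memory and slower computation.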
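EDIT: This is what my model looks like. I am supposed to detect the circles in the images, whose centers and radii are given in the labels.txt file, and paint over them. The painting function was given; I created the model as well as the training and validation code.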
import torch
import torch.nn as nn
import torch.nn.functional as F

class CircleNet(nn.Module):  # nn.Module is parent class
    def __init__(self):
        super(CircleNet, self).__init__()  # calls init of parent class
        #----------------------------------------------
        # implementation needed here
        #----------------------------------------------
        # keep dimensions of input image: (I-F+2P)/S + 1 = (128-3+2)/1 + 1 = 128
        # RGB image = input channels = 3. Use 12 filters for first 2 convolution layers, then double
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=12, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(in_channels=12, out_channels=12, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(in_channels=12, out_channels=24, kernel_size=3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(in_channels=24, out_channels=32, kernel_size=3, stride=1, padding=1)
        # Pooling to reduce sizes, and dropout to prevent overfitting
        self.pool = nn.MaxPool2d(kernel_size=2)
        self.relu = nn.ReLU()
        self.drop = nn.Dropout2d(p=0.25)
        self.norm1 = nn.BatchNorm2d(12)
        self.norm2 = nn.BatchNorm2d(24)
        # There are 2 pooling layers, each with kernel size of 2. Output size: 128/(2*2) = 32
        # Have 3 output features, corresponding to x-pos, y-pos, radius.
        self.fc = nn.Linear(in_features=32 * 32 * 32, out_features=3)

    def forward(self, x):
        """
        Feed forward through network
        Args:
            x - input to the network
        Returns "x", which is the network's output
        """
        #----------------------------------------------
        # implementation needed here
        #----------------------------------------------
        # Conv1
        out = self.conv1(x)
        out = self.pool(out)
        out = self.relu(out)
        out = self.norm1(out)
        # Conv2
        out = self.conv2(out)
        out = self.pool(out)
        out = self.relu(out)
        out = self.norm1(out)
        # Conv3
        out = self.conv3(out)
        out = self.drop(out)
        # Conv4
        out = self.conv4(out)
        out = F.dropout(out, training=self.training)
        out = out.view(-1, 32 * 32 * 32)
        out = self.fc(out)
        return out
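The size bookkeeping in the model's comments can be checked with the (I - F + 2P)/S + 1 formula quoted there. This is a small sketch in plain Python; the helper name `conv_out_size` is hypothetical, and it relies on `nn.MaxPool2d(kernel_size=2)` using a stride equal to its kernel size by default.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Output size along one spatial dimension: (I - F + 2P) // S + 1."""
    return (size - kernel + 2 * padding) // stride + 1


size = 128
size = conv_out_size(size, kernel=3, stride=1, padding=1)  # conv1 keeps 128
size = conv_out_size(size, kernel=2, stride=2)             # first pool: 64
size = conv_out_size(size, kernel=3, stride=1, padding=1)  # conv2 keeps 64
size = conv_out_size(size, kernel=2, stride=2)             # second pool: 32
size = conv_out_size(size, kernel=3, stride=1, padding=1)  # conv3 keeps 32
size = conv_out_size(size, kernel=3, stride=1, padding=1)  # conv4 keeps 32

flat_features = 32 * size * size  # conv4 has 32 output channels
print(size, flat_features)
```

The result agrees with `in_features=32 * 32 * 32` in the fully connected layer, so the layer shapes themselves are consistent and the error is about tensor types rather than sizes.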
EDIT: Here is my custom loss function, in case it helps:
criterion = nn.CrossEntropyLoss()
def my_loss(outputs, labels):
    """
    Args:
        outputs - output of network ([batch size, 3])
        labels - desired labels ([batch size, 3])
    """
    loss = torch.zeros(1, dtype=torch.float, requires_grad=True)
    loss = loss.to(device)
    loss = criterion(outputs, labels)
    #----------------------------------------------
    # implementation needed here
    #----------------------------------------------
    # Observe: If you need to iterate and add certain values to loss defined above
    # you cannot write: loss +=... because this will raise the error:
    # "Leaf variable was used in an inplace operation"
    # Instead, to avoid this error write: loss = loss + ...
    return loss
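`nn.CrossEntropyLoss` is a classification loss, and when it is given class-index targets it requires them to be Long, whereas the labels here are real-valued (x, y, radius) triples of shape [batch size, 3]. One option for such targets is a regression loss. The sketch below uses `nn.MSELoss`, which is an assumption about what the task calls for and not something the post confirms; `regression_loss` is a hypothetical name, and the random tensors are stand-ins for the network output and the loaded labels.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
criterion = nn.MSELoss()  # assumption: a regression loss in place of CrossEntropyLoss


def regression_loss(outputs, labels):
    """Mean squared error between predicted and target (x, y, radius)."""
    # Cast the labels to float32 so they match the network output.
    return criterion(outputs, labels.float())


outputs = torch.rand(4, 3, requires_grad=True)    # stand-in for model(data)
labels = torch.rand(4, 3, dtype=torch.float64)    # stand-in for labels loaded as double

loss = regression_loss(outputs, labels)
loss.backward()
print(loss.item(), outputs.grad.shape)
```

With a regression target, the accuracy computed from `torch.max(output.data, 1)` in the training and validation loops would no longer be meaningful; a distance between predicted and true circles would be a more natural metric.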
The train loader (given to me):
train_dir = "./train/"
validation_dir = "./validation/"
test_dir = "./test/"
train_dataset = ShapesDataset(train_dir)
train_loader = DataLoader(train_dataset,
                          batch_size=32,
                          shuffle=True)
validation_dataset = ShapesDataset(validation_dir)
validation_loader = DataLoader(validation_dataset,
                               batch_size=1,
                               shuffle=False)
test_dataset = ShapesDataset(test_dir)
test_loader = DataLoader(test_dataset,
                         batch_size=1,
                         shuffle=False)
print("train loader examples :", len(train_dataset))
print("validation loader examples:", len(validation_dataset))
print("test loader examples :", len(test_dataset))
EDIT: I was also given this code for viewing the images, the target circle labels, and the network output:
"""
View first image of a given number of batches assuming that model has been created.
Currently, lines assuming model has been created, are commented out. Without a model,
you can view target labels and the corresponding images.
This is given to you so that you may see how loaders and model can be used.
"""
loader = train_loader  # choose from which loader to show images
bacthes_to_show = 2
with torch.no_grad():
    for i, data in enumerate(loader, 0):  # 0 means that counting starts at zero
        inputs = (data['image']).to(device)   # has shape (batch_size, 3, 128, 128)
        labels = (data['labels']).to(device)  # has shape (batch_size, 3)
        img_fnames = data['fname']            # list of length batch_size
        #outputs = model(inputs.float())
        img = Image.open(img_fnames[0])
        print("showing image: ", img_fnames[0])
        labels_str = [float(("{0:.2f}".format(x))) for x in labels[0]]  #labels_np_arr]
        #outputs_np_arr = outputs[0]  # using ".numpy()" to convert tensor to numpy array
        #outputs_str = [float(("{0:.2f}".format(x))) for x in outputs_np_arr]
        print("Target labels :", labels_str)
        #print("network coeffs:", outputs_str)
        print()
        #img.show()
        if (i+1) == bacthes_to_show:
            break
This is the output I am getting; it should cover the whole circle: Output I am getting
Any ideas would be helpful.
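【Comments】:
- From the stack trace, it looks like your data should be float but is double instead. Can you try setting data to data['image'].to(device).float(), and doing the same for target?
- @kimbo I did try that, and then I got the same error, except that it expected a Long type but found Float. RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target' in call to _thnn_nll_loss_forward
- Try taking the .float() off the target.
- @kimbo No, unfortunately I got the same error, about expecting Long but getting Float.
- Your original error was raised when running model(data). What does your model look like?
Tags: python runtime-error pytorch conv-neural-network training-data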