[Posted]: 2020-10-20 12:12:17
[Question]:
First, I created a custom dataset to load images from my dataframe (which contains the image file paths and the corresponding int labels):
class Dataset(torch.utils.data.Dataset):
    def __init__(self, dataframe, transform=None):
        self.frame = dataframe
        self.transform = transform

    def __len__(self):
        return len(self.frame)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()
        filename = self.frame.iloc[idx, 0]
        image = torch.from_numpy(io.imread(filename).transpose((2, 0, 1))).float()
        label = self.frame.iloc[idx, 1]
        sample = {'image': image, 'label': label}
        if self.transform:
            sample = self.transform(sample)
        return sample
Then I use a pre-existing model architecture, as follows:
model = models.densenet161()
num_ftrs = model.classifier.in_features
model.classifier = nn.Linear(num_ftrs, 10) # where 10 is my number of classes
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
Finally, to train, I did the following:
model.train() # switch to train mode
for epoch in range(5):
    for i, sample in enumerate(train_set): # where train_set is an instance of my Dataset class
        optimizer.zero_grad()
        image, label = sample['image'].unsqueeze(0), torch.Tensor(sample['label']).long()
        output = model(image)
        loss = criterion(output, label)
        loss.backward()
        optimizer.step()
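However, loss = criterion(output, label) raises an error: ValueError: Expected input batch_size (1) to match target batch_size (2). Could someone show me how to use a custom dataset correctly, especially for loading data in batches? Also, why am I getting this ValueError? Thanks!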
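A likely cause of the ValueError, sketched below under the assumption that the failing sample had label 2: torch.Tensor(n) interprets an integer as a size and allocates n uninitialized values, so the target ends up with 2 entries while the model output has a batch of 1. Calling .long() afterwards does not change the shape. Wrapping the label with torch.tensor([label]) keeps the value itself. The random `output` tensor here is only a stand-in for the model output, not the real densenet161 result.

```python
import torch
import torch.nn as nn

label = 2  # assumed example: an integer class label read from the dataframe

# torch.Tensor(n) treats an int as a SIZE: it allocates n uninitialized floats.
bad_target = torch.Tensor(label)
print(bad_target.shape)   # torch.Size([2])

# torch.tensor([label]) stores the VALUE, giving exactly one target per sample.
good_target = torch.tensor([label])
print(good_target.shape)  # torch.Size([1])

# Stand-in for the model output: a batch of 1 sample with 10 class scores.
output = torch.randn(1, 10)
loss = nn.CrossEntropyLoss()(output, good_target)
print(loss.item())
```

With this change the target batch size is 1 for every sample, regardless of the label value.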
[Comments]:
- How do you construct train_set from the Dataset class?
- @Shawn Zhang Don't return a dictionary at the end of your __getitem__; do it like this instead:
  image_tens = self.transforms(image)
  return image_tens, torch.tensor(labels)
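For the batching part of the question, one possible approach is to wrap the dataset in torch.utils.data.DataLoader, which stacks individual samples into batches. The sketch below is illustrative only: TupleDataset is a hypothetical stand-in that produces random 3x8x8 tensors instead of reading image files, and the small linear model replaces densenet161 so the example runs quickly.

```python
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader


class TupleDataset(Dataset):
    """Hypothetical stand-in dataset: random 3x8x8 'images' with int labels."""

    def __init__(self, n_samples=12, n_classes=10):
        g = torch.Generator().manual_seed(0)
        self.images = torch.rand(n_samples, 3, 8, 8, generator=g)
        self.labels = torch.randint(0, n_classes, (n_samples,), generator=g)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Return a plain (image, label) tuple; the label is a 0-dim int64 tensor.
        return self.images[idx], self.labels[idx]


# The DataLoader collates 4 samples at a time into one batch.
train_loader = DataLoader(TupleDataset(), batch_size=4, shuffle=True)

# Tiny placeholder classifier (NOT densenet161), assumed only for this sketch.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

model.train()
for images, labels in train_loader:
    # images: (4, 3, 8, 8), labels: (4,), so the batch dimensions match.
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

Because the loader already adds the batch dimension, the unsqueeze(0) call and the manual label conversion from the training loop in the question are no longer needed.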
Tags: pytorch