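[Posted]: 2021-01-31 12:04:01
[Question]:
I am practicing deep learning with PyTorch by doing binary classification on the Breast-Cancer-Wisconsin-Diagnostic-DataSet.
I have tried several approaches. The best one I could get is shown below, and its accuracy is still low, only 61%.
What can I do to improve the accuracy?
Thanks.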
import pandas as pd

# base_dir is assumed to be defined earlier and to point at the data folder
dataset = pd.read_excel(base_dir + "Breast-Cancer-Wisconsin-Diagnostic.xlsx")
number_of_columns = dataset.shape[1]

# encode the diagnosis labels as integer class codes
dataset['diagnosis'] = pd.Categorical(dataset['diagnosis']).codes

# shuffle the rows, then split training and testing 70:30 (first 398 rows train)
# every column except the last is an input; the last column is the target
dataset = dataset.sample(frac=1, random_state=1234)
train_input = dataset.values[:398, :number_of_columns-1]
train_target = dataset.values[:398, number_of_columns-1]
test_input = dataset.values[398:, :number_of_columns-1]
test_target = dataset.values[398:, number_of_columns-1]
import torch

torch.manual_seed(1234)

# one hidden layer with 5 units, two output logits (one per class)
hidden_units = 5
net = torch.nn.Sequential(
    torch.nn.Linear(number_of_columns-1, hidden_units),
    torch.nn.ReLU(),
    torch.nn.Linear(hidden_units, 2))

# choose optimizer and loss function
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9)
# train (full batch: the whole training set is used in every step)
epochs = 50
# torch.autograd.Variable is deprecated; plain tensors work directly
inputs = torch.tensor(train_input, dtype=torch.float32)
targets = torch.tensor(train_target, dtype=torch.long)
for epoch in range(epochs):
    optimizer.zero_grad()
    out = net(inputs)
    loss = criterion(out, targets)
    loss.backward()
    optimizer.step()
    if epoch == 0 or (epoch + 1) % 10 == 0:
        print('Epoch %d Loss: %.4f' % (epoch + 1, loss.item()))
# Epoch 1 Loss: 412063.1250
# Epoch 10 Loss: 0.6628
# Epoch 20 Loss: 0.6639
# Epoch 30 Loss: 0.6592
# Epoch 40 Loss: 0.6587
# Epoch 50 Loss: 0.6588
# evaluate on the held-out test rows; no gradients are needed here
inputs = torch.tensor(test_input, dtype=torch.float32)
targets = torch.tensor(test_target, dtype=torch.long)
with torch.no_grad():
    out = net(inputs)
_, predicted = torch.max(out, 1)  # index of the larger logit = predicted class
correct = (targets == predicted).sum().item()
error_count = test_target.size - correct
print('Errors: %d; Accuracy: %d%%' % (error_count, 100 * correct // test_target.size))
# Errors: 65; Accuracy: 61%
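One detail in the training log stands out: the loss at epoch 1 is 412063, and it then settles near 0.66, which is close to what a two-class model outputs when it predicts almost the same thing for every row. That pattern is often seen when raw features on very different scales go straight into a linear layer. The following is a minimal sketch, assuming unscaled inputs are part of the problem here, of standardizing each feature using statistics from the training rows only. The helper name `standardize`, the `eps` parameter, and the synthetic demo arrays are illustrative assumptions and not part of the question's code.

```python
import numpy as np

def standardize(train, test, eps=1e-8):
    """Scale each feature column to zero mean and unit variance.

    The mean and standard deviation come from the training rows only,
    so nothing about the test rows leaks into preprocessing.
    """
    train = np.asarray(train, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    # eps guards against division by zero for constant columns
    return (train - mean) / (std + eps), (test - mean) / (std + eps)

# Synthetic stand-in data (NOT the real dataset): three feature columns
# whose magnitudes differ by several orders.
rng = np.random.default_rng(1234)
scales = np.array([1.0, 100.0, 10000.0])
demo_train = rng.normal(size=(100, 3)) * scales
demo_test = rng.normal(size=(40, 3)) * scales

train_scaled, test_scaled = standardize(demo_train, demo_test)
print(train_scaled.mean(axis=0).round(6))  # per-column means after scaling
print(train_scaled.std(axis=0).round(6))   # per-column std after scaling
```

In the script above, the same call would presumably be applied to `train_input` and `test_input` before they are converted to tensors. Whether that alone lifts the accuracy is not something this sketch demonstrates.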
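[Discussion]:
-
This is a hard question to answer, because low accuracy can have many causes. Have you tried building different models?
-
Maybe actually use a deep network? Your model is very simple, and that is quite likely the problem.
-
@Smurphy0000, thanks for your comment. I tried conventional machine learning methods, and some of them gave better accuracy. I assumed PyTorch should be able to do better, which is why I asked the question here.
-
@Rika, thanks for your comment. Could you show an example so that other learners can benefit as well?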
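For readers following the suggestion to use a deeper network, here is a minimal sketch of what that could look like with `torch.nn.Sequential`. It is not the commenter's code: the helper name `make_deeper_net`, the hidden widths `(64, 32)`, and the feature count of 30 in the demo are all assumptions chosen for illustration.

```python
import torch

def make_deeper_net(n_features, hidden=(64, 32), n_classes=2):
    """Build an MLP with one Linear+ReLU pair per entry in `hidden`."""
    layers = []
    width = n_features
    for h in hidden:
        layers.append(torch.nn.Linear(width, h))
        layers.append(torch.nn.ReLU())
        width = h
    # final layer returns raw logits, which is what CrossEntropyLoss expects
    layers.append(torch.nn.Linear(width, n_classes))
    return torch.nn.Sequential(*layers)

torch.manual_seed(1234)
demo_net = make_deeper_net(n_features=30)  # 30 is a placeholder feature count
demo_batch = torch.randn(8, 30)            # random stand-in inputs, not real data
demo_logits = demo_net(demo_batch)
print(demo_logits.shape)  # torch.Size([8, 2])
```

The network would drop into the question's script in place of `net`, with `n_features=number_of_columns-1`. More layers do not guarantee better accuracy, so it is worth comparing against the original model on the same split.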
Tags: python deep-learning pytorch classification