[Title]: PyTorch Tensors of Inputs and Labels in LSTM
[Posted]: 2018-12-29 21:54:15
[Question]:

I am new to PyTorch and am working on a simple text-generation project to get a grip on it. I am adapting the idea from this Keras tutorial: https://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/

I have 10 timesteps and 990 samples. Each of the 990 samples holds 10 values, the (scaled) integer indices of the characters in the sequence. The corresponding output sample is the same sequence shifted by one character: for example, if my input sample is "Hello Worl", the output is "ello World". My input size (number of features) is 1 because I want to feed in one character at a time, so my final input shape is (990, 10, 1). I then convert the output tensor into one-hot vectors, giving a final shape of (9900, 42), where 42 is the number of elements in each one-hot vector. When I run the network I get an output of shape (9900, 42), i.e. one output per timestep, each the size of a one-hot vector. But when I compute the loss, I get the error:

multi-target not supported

Can you help me understand what I am doing wrong? Thanks. My code is below.

import numpy as np
import torch
import torch.nn as nn
from torch.autograd import Variable

#The file contains 163780 characters
#The file contains 1000 characters
#There are 42 unique characters

char2int = {char: value for (value, char) in enumerate(unique)}
int2char = {value: char for (value, char) in enumerate(unique)}

learning_rate = 0.01
num_epochs = 5         
input_size = 1              #The number of input neurons (features) to our RNN
units = 100                
num_layers = 2              
num_classes = len(unique)   #The number of output neurons

timesteps = 10
datax = []
datay = []
for index in range(0, len(file) - timesteps, 1):
    prev_letters = file[index:index + timesteps]
    output = file[index + 1: index + timesteps + 1]
    #Convert the 10 previous characters to their integers and put in a list. Append that list to the dataset
    datax.append([char2int[c] for c in prev_letters]) 
    datay.append([char2int[c] for c in output])
print('There are {} Sequences in the dataset'.format(len(datax)))
#There are 990 Sequences in the dataset

x = np.array(datax)
x = x / float(len(unique))
x = torch.FloatTensor(x)
x = x.view(x.size(0), timesteps, input_size)
print(x.shape)   #torch.Size([990, 10, 1])

y = torch.LongTensor(datay)
print(y.shape)   #torch.Size([990, 10])
y_one_hot = torch.zeros(y.shape[0] * y.shape[1], num_classes)
index = y.long()
index = index.view(-1,1)          #The expected shape for the scatter function
y_one_hot.scatter_(1,index,1)    #(dim (1 for along rows and 0 for along cols), index, number to insert)
y_one_hot = y_one_hot.view(-1, num_classes)    # Make the tensor of shape(rows, cols)
y_one_hot = y_one_hot.long()
print(y_one_hot.shape)
#torch.Size([9900, 42])

inputs = Variable(x)
labels = Variable(y_one_hot)

class TextGenerator(nn.Module):
    def __init__(self,input_size,units,num_layers,num_classes,timesteps):
        super(TextGenerator,self).__init__()
        self.units = units
        self.num_layers = num_layers
        self.timesteps = timesteps
        self.input_size = input_size
        # When batch_first=True, inputs are of shape (batch_size/samples, sequence_length, input_dimension)
        self.lstm = nn.LSTM(input_size = input_size, hidden_size = units, num_layers = num_layers, batch_first = True)
        #The output layer 
        self.fc = nn.Linear(units, num_classes)
    def forward(self,x):
        #Initialize the hidden state
        h0 = Variable(torch.zeros(self.num_layers, x.size(0), self.units))
        #Initialize the cell state 
        c0 = Variable(torch.zeros(self.num_layers, x.size(0), self.units))
        out,_ = self.lstm(x, (h0,c0))
        #Reshape the output from (samples, timesteps, output_features) to a shape appropriate for the FC layer
        out = out.contiguous().view(-1, self.units)
        out = self.fc(out)
        return out

net = TextGenerator(input_size,units,num_layers,num_classes,timesteps)
loss_fn = torch.nn.CrossEntropyLoss() 
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)
out = net(inputs)
out.shape   #(9900, 42)
loss_fn(out,labels)

[Discussion]:

    Tags: lstm pytorch


    [Solution 1]:

    In PyTorch, when using CrossEntropyLoss you need to pass the target labels as integers in [0..n_classes-1], not as one-hot vectors. Because each target row currently has 42 entries, PyTorch thinks you are trying to predict multiple targets per sample.
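    A minimal sketch of the fix, using random stand-in data with the same shapes as in the question (990 sequences of 10 timesteps flattened to 9900 rows, 42 classes): skip the one-hot encoding entirely and flatten the integer label tensor instead.

    ```python
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    num_classes = 42
    # Stand-in for the model output: shape (990 * 10, 42) = (9900, 42)
    logits = torch.randn(9900, num_classes)
    # Stand-in for y built from datay: integer class indices, shape (990, 10)
    y = torch.randint(0, num_classes, (990, 10))

    # CrossEntropyLoss wants class indices of shape (N,), not one-hot (N, C):
    labels = y.view(-1)                       # shape (9900,), dtype long
    loss = nn.CrossEntropyLoss()(logits, labels)
    print(loss.dim())                         # 0 -- a scalar, no "multi-target" error
    ```

    In the question's code this means dropping the `y_one_hot` block and passing `y.view(-1)` as `labels`; CrossEntropyLoss applies log-softmax internally and indexes into the 42 logits with each integer label.
    
    
    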

    [Comments]:

    • Thank you. Much appreciated.