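【Posted】: 2026-01-10 23:40:01
【Question】:
I am just getting started with Keras in R and want to build a text classification model. However, I am running into an error, most likely because of my limited understanding of deep learning and Keras. Any help would be great. The code is shared below. The data in the code snippet is kept small so that the problem can be reproduced quickly.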
library(keras)
library(tm)
data <- data.frame("Id" = 1:10, "Text" = c("the cat was mewing","the cat was black in color","the dog jumped over the wall","cat cat cat everywhere","dog dog cat play style","cat is white yet it is nice","dog is barking","cat sweet","angry dog","cat is nice nice nice"), "Label" = c(1,1,2,1,2,1,2,1,2,1))
corpus <- VCorpus(VectorSource(data$Text))
tdm <- DocumentTermMatrix(corpus, list(removePunctuation = TRUE, stopwords = TRUE,removeNumbers = TRUE))
data_t <- as.matrix(tdm)
data <- cbind(data_t,data$Label)
dimnames(data) = NULL
#Normalize data
data[,1:(ncol(data)-1)] = normalize(data[,1:(ncol(data)-1)])
data[,ncol(data)] = as.numeric(data[,ncol(data)]) - 1
set.seed(123)
ind = sample(2,nrow(data),replace = T,prob = c(0.8,0.2))
training = data[ind==1,1:(ncol(data)-1)]
test = data[ind==2,1:(ncol(data)-1)]
traintarget = data[ind==1,ncol(data)]
testtarget = data[ind==2,ncol(data)]
# One hot encoding
trainLabels = to_categorical(traintarget)
testLabels = to_categorical(testtarget)
print(testLabels)
#Create sequential model
model = keras_model_sequential()
model %>%
  layer_dense(units=8,activation='relu',input_shape=c(16))
summary(model)
model %>%
  compile(loss='categorical_crossentropy',optimizer='adam',metrics='accuracy')
history = model %>%
  fit(training,
      trainLabels,
      epoch=200,
      batch_size=2,
      validation_split=0.2)
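One-hot encoding is probably unnecessary in this example. Beyond that, I may have gone wrong in several other places. However, the last statement of the code (the call to fit) gives me a shape error. Since my data has 16 columns, I used an input shape of 16.
The error I get is
Error in py_call_impl(callable, dots$args, dots$keywords) : ValueError: Error when checking target: expected dense_32 to have shape (None, 8) but got array with shape (7, 2)
Any guidance on this would be very helpful.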
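The two shapes in the error message suggest one possible reading, sketched here as an assumption rather than a confirmed fix. The only dense layer has 8 units, so Keras treats it as the output layer and expects targets of shape (None, 8), while the one-hot labels have 2 columns. Adding a second dense layer with one unit per class should make the two agree. The sketch assumes the same `keras` R package as above. The names `x_train`, `y_train` and `train_labels` and the random feature matrix are hypothetical stand-ins for the real document-term matrix, and `validation_split` is left out only to keep the example small.

library(keras)

# Hypothetical stand-in data: 7 rows, 16 features, labels already shifted to 0/1
set.seed(123)
x_train <- matrix(runif(7 * 16), nrow = 7, ncol = 16)
y_train <- c(0, 0, 1, 0, 1, 0, 1)

# One-hot encode the labels: 2 classes give a matrix with 2 columns
train_labels <- to_categorical(y_train, num_classes = 2)

model <- keras_model_sequential()
model %>%
  # Hidden layer: 16 input features, 8 units (as in the code above)
  layer_dense(units = 8, activation = "relu", input_shape = c(16)) %>%
  # Assumed output layer: one unit per class, matching ncol(train_labels)
  layer_dense(units = 2, activation = "softmax")

model %>% compile(
  loss = "categorical_crossentropy",
  optimizer = "adam",
  metrics = "accuracy"
)

# A few epochs are enough to check that the shapes are accepted
history <- model %>% fit(
  x_train, train_labels,
  epochs = 5,
  batch_size = 2,
  verbose = 0
)

If this reading is right, the same two-layer structure should carry over to the real `training` and `trainLabels` objects, provided `ncol(training)` is 16 and `ncol(trainLabels)` is 2.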
【Comments】:
Tags: r keras text-classification