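【Posted】: 2011-08-12 07:29:59
【Question】:
How can I convert plural nouns to singular nouns in R? I use the tagPOS function to tag each text and then extract all the plural nouns, which are tagged "NNS". But what should I do if I want to convert those plural nouns to their singular forms?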
library("openNLP")
library("tm")
acq_o <- "Gulf Applied Technologies Inc said it sold its subsidiaries engaged in pipelines and terminal operations for 12.2 mln dlrs. The company said the sale is subject to certain post closing adjustments, which it did not explain. Reuter."
acq = tm_map(Corpus(DataframeSource(data.frame(acq_o))), removePunctuation)
acqTag <- tagPOS(acq)
# Split the tagged string into "word/TAG" tokens
acqTagSplit <- strsplit(acqTag, " ")
# Separate each token into its word and its tag (the part after "/")
qq <- strsplit(acqTagSplit[[1]], "/")
tag <- sapply(qq, function(x) x[2])
# Positions of the tokens tagged as plural nouns
index <- which(tag == "NNS")
index
【Discussion】:
-
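Thanks to Aleksandar Dimitrov and tchrist for their comments. It looks like I may have to write my own singularization rules. For anyone interested in this problem, here is a useful online resource: An Algorithmic Approach to English Pluralization. If anyone has further answers, please let me know. Thanks.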
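For what it's worth, hand-written rules of that kind might start out like the sketch below. It is only an illustration: `singularize` is a hypothetical helper, not part of openNLP or tm, and both the irregular-noun table and the suffix rules are assumptions covering a handful of common patterns. It lowercases its input, and it will get some words wrong (for example "buses" or "analyses"), so it should only be applied to tokens already tagged "NNS".

```r
# Hypothetical helper: ordered suffix rules plus a small irregular-noun table.
# Far from exhaustive; English has many exceptions.
singularize <- function(words) {
  irregular <- c(men = "man", women = "woman", children = "child",
                 feet = "foot", teeth = "tooth", mice = "mouse",
                 people = "person")
  vapply(words, function(w) {
    lw <- tolower(w)
    # Irregular plurals are looked up first
    if (lw %in% names(irregular)) return(irregular[[lw]])
    # subsidiaries -> subsidiary
    if (grepl("ies$", lw)) return(sub("ies$", "y", lw))
    # classes -> class, boxes -> box, churches -> church
    if (grepl("(ss|x|ch|sh)es$", lw)) return(sub("es$", "", lw))
    # pipelines -> pipeline (words ending in "ss" are left alone)
    if (grepl("[^s]s$", lw)) return(sub("s$", "", lw))
    lw
  }, character(1), USE.NAMES = FALSE)
}

singularize(c("subsidiaries", "pipelines", "operations", "adjustments"))
# "subsidiary" "pipeline" "operation" "adjustment"
```

The rules are tried in order, so the more specific suffixes ("ies", "ches") are handled before the generic trailing "s". Extending the irregular table is probably the easiest way to improve coverage.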