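Well, as you can see from here: Make all words uppercase in Wordcloud in R, TermDocumentMatrix(CORPUS) converts words to lowercase by default.
Indeed, if you inspect the function with trace(wordcloud), you can see that when the freq argument is missing it runs tdm <- tm::TermDocumentMatrix(corpus), so your words are lowercased.
You have two options to fix this:
Pass the words and their frequencies instead of the corpus: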
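To see the default lowercasing in isolation, here is a minimal sketch. It assumes the tm package is installed; the two sample documents are made up for illustration, and VCorpus is used here only to keep the demo small.

```r
library(tm)

# Hypothetical sample documents with mixed case (not from the question)
docs <- c("Freedom RING", "Let FREEDOM ring")
corpus <- VCorpus(VectorSource(docs))

# Default control: terms are converted to lowercase
tdm_default <- TermDocumentMatrix(corpus)

# tolower = FALSE: terms keep their original case
tdm_keep <- TermDocumentMatrix(corpus, control = list(tolower = FALSE))

print(Terms(tdm_default))
print(Terms(tdm_keep))
```

With the default control every term comes back lowercase, while tolower = FALSE keeps "FREEDOM" and "RING" as distinct uppercase terms.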
library(tm)
library(wordcloud)
library(RColorBrewer)
# Example text, used because the question did not include a reproducible example
filePath <- "http://www.sthda.com/sthda/RDoc/example-files/martin-luther-king-i-have-a-dream-speech.txt"
text <- readLines(filePath)
products <- Corpus(VectorSource(text))
products <- tm_map(products, content_transformer(toupper))
c_words <- brewer.pal(8, 'Set2')
# tolower = FALSE stops TermDocumentMatrix from lowercasing the terms again
tdm <- tm::TermDocumentMatrix(products, control = list(tolower = FALSE))
freq_corpus <- slam::row_sums(tdm)
wordcloud(names(freq_corpus), freq_corpus, min.freq = 10, max.words = 30, scale = c(7,1), colors = c_words)
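This gives you a word cloud with every word in uppercase.
The second option is to modify wordcloud itself:
First run trace(wordcloud, edit = TRUE), then replace line 21, the tdm <- tm::TermDocumentMatrix(corpus) call, with:
tdm <- tm::TermDocumentMatrix(corpus, control = list(tolower = FALSE))
Click save and run: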
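If you do not need tm's tokenization, the words and frequencies for this first option can also be built by hand. The following is only a rough sketch in base R, not what wordcloud does internally: the sample text and the names words and freq_vec are made up for illustration, and the split pattern simply treats anything other than A-Z as a separator.

```r
# Hypothetical sample text standing in for the speech
text <- c("I have a dream", "a dream of freedom", "let freedom ring")

# Uppercase first, then split on anything that is not an uppercase letter
words <- unlist(strsplit(toupper(text), "[^A-Z]+"))
words <- words[nzchar(words)]  # drop empty strings left by the split

# Count occurrences and turn the table into a named integer vector
freq <- table(words)
freq_vec <- setNames(as.integer(freq), names(freq))
freq_vec <- sort(freq_vec, decreasing = TRUE)

print(freq_vec)
```

The result can then be passed as wordcloud(names(freq_vec), freq_vec, ...) in the same way as freq_corpus, assuming the wordcloud package is loaded.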
filePath <- "http://www.sthda.com/sthda/RDoc/example-files/martin-luther-king-i-have-a-dream-speech.txt"
text <- readLines(filePath)
products <- Corpus(VectorSource(text))
products <- tm_map(products, content_transformer(toupper))
c_words <- brewer.pal(8, 'Set2')
# Pass the corpus directly; the edited wordcloud no longer lowercases it
wordcloud(products, min.freq = 10, max.words = 30, scale = c(7,1), colors = c_words)
You will get a similar word cloud.