【Question title】: Word cloud with unique ID [duplicate]
【Posted】: 2017-12-19 04:04:48
【Question】:

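I have a dataset with 2 columns: a unique ID and comments. I can build a word cloud from the comments alone, but I would like to keep the unique ID of each text so that I can join the result back when I visualize it in Tableau.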

For example:

ID  | Text
a1   This is a test comment.
a2   Another test comment.
a3   This is very good
a4   I like this.

The output I am hoping for is:

ID  |  Words
--    
a1   This
a1   is
a1   a
a1   test
a1   comment
a2   Another
a2   test
a2   comment
a3   This
a3   is
a3   very
a3   good.

I hope my example makes sense. Thanks.

J

【Question comments】:

    Tags: r tableau-api


    【Solution 1】:
    > df <- read.table(text='ID  Text
    + a1   "This is a test comment"
    + a2   "Another test comment"
    + a3   "This is very good"
    + a4   "I like this"', header=TRUE, as.is=TRUE)
    > 
    > 
    > library(data.table)
    > dt = data.table(df)
    > dt[,c(Words=strsplit(Text, " ", fixed = TRUE)), by = ID]
        ID   Words
     1: a1    This
     2: a1      is
     3: a1       a
     4: a1    test
     5: a1 comment
     6: a2 Another
     7: a2    test
     8: a2 comment
     9: a3    This
    10: a3      is
    11: a3    very
    12: a3    good
    13: a4       I
    14: a4    like
    15: a4    this
    

    【Comments】:

    • Thanks, this works. But is there a way to export the result? When I try write.csv, it exports the original data instead.
    • @jols dt
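
The export problem raised in the comment can be sketched as follows. This assumes the cause is that the split result was only printed and never assigned, so `dt` still held the original rows; the reply above is truncated, so that reading is a guess. The names `words` and `out_file` are hypothetical, and the two-row table is just a cut-down copy of the sample data.

```r
library(data.table)

# Cut-down copy of the sample data from the question
dt <- data.table(
  ID   = c("a1", "a2"),
  Text = c("This is a test comment", "Another test comment")
)

# The query returns a new table and leaves dt unchanged,
# so the result has to be assigned before it can be exported.
words <- dt[, .(Words = unlist(strsplit(Text, " ", fixed = TRUE))), by = ID]

# Hypothetical output path; write the long-format table, not dt
out_file <- file.path(tempdir(), "words_by_id.csv")
write.csv(words, out_file, row.names = FALSE)
```

Writing `words` rather than `dt` is the only change that matters here; `data.table::fwrite(words, out_file)` would be an equivalent way to write the file.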
    【Solution 2】:

    You can do something like this:

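    library(tidyverse)  # only needed here for tribble()

    df <- tribble(
      ~ID, ~Text,
      "a1",   "This is a test comment.",
      "a2",   "Another test comment.",
      "a3",   "This is very good",
      "a4",   "I like this."
    )
    
    # Split each comment on spaces: one character vector per row of df
    split_data <- strsplit(df$Text, " ", fixed = TRUE)
    
    # Pair every word with the ID of its row, then stack the pieces.
    # Iterating over rows (not unique IDs) keeps the IDs and words aligned.
    do.call(rbind,
       lapply(seq_len(nrow(df)), function(x) {
            cbind(ID = rep(df$ID[x], length(split_data[[x]])), Words = split_data[[x]])
       })
    )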
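
The same idea can be written without the explicit loop. The sketch below is a base R variant, not the answerer's code: it repeats each ID once per word with `rep()` and `lengths()`, and it also strips punctuation first, on the assumption that the desired output in the question ("comment" rather than "comment.") wants it removed. The names `clean` and `words` are hypothetical.

```r
# Sample data from the question, as a plain data frame
df <- data.frame(
  ID   = c("a1", "a2", "a3", "a4"),
  Text = c("This is a test comment.", "Another test comment.",
           "This is very good", "I like this."),
  stringsAsFactors = FALSE
)

# Assumption: punctuation should not end up in the word column
clean <- gsub("[[:punct:]]", "", df$Text)

# One character vector of words per row
split_data <- strsplit(clean, " ", fixed = TRUE)

# Repeat each ID once per word in its row, and flatten the words
words <- data.frame(
  ID    = rep(df$ID, lengths(split_data)),
  Words = unlist(split_data),
  stringsAsFactors = FALSE
)
```

Because the result is a data frame with named columns rather than a character matrix, it can be passed straight to `write.csv()` and joined back on `ID` in Tableau.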
    

    【Comments】:
