Why does unlist() convert a list of strings into numbers? (Answers)

【Question title】: Why does unlist() convert a list of lists of strings into numbers?
【Posted】: 2021-03-22 18:53:33
【Question】:

I am doing text analysis in R. I have a list that contains n-grams.

It looks like this:

> list_tetragrams[459]
[[1]]
 [1] a small stage show          album of jazz standards     an album of jazz            and play small rooms       
 [5] and release an album        can translate into a        her late s and              i think she’ll wait        
 [9] in her late s               into a small stage          late s and release          maybe something she can    
[13] one can dream right         play small rooms jazz       release an album of         s and release an           
[17] she can translate into      she’ll wait until she’s     she’s in her late           show and play small

I want to convert this list of lists into a single list. Here is what I did, and the output:

Fngram<- list(unlist(unlist(list_tetragrams)))

Output:
 [1]  1  2  3  4  5  6  7  8  9 10 11 12  1  1  2  3  1  1  1  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
  [39] 20 21 22 23 24 25 26 27 28 29 30 31 32 33  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

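I have used this code many times, and this is the first time this has happened. I also tried the flatten() function and the do.all() function; both return the same output. What is going on? Can anyone figure it out? Thanks!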

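【Comments】:

It is possible that the elements are of class factor. Also, why call unlist twice? By default, unlist has recursive = TRUE. You can try list(rapply(list_tetragrams, as.character))
If the list or vector you are looking at should contain strings but is printed without " quotes, it is almost certainly a factor. Compare how list(factor("a b")) prints with how the same list prints when it holds a plain character string. It looks like you left out of the snippet above the part of the output that lists Levels: a small stage show album of jazz standards ... (which, imo, is a poor representation in itself, because it uses a space-separated format for strings with embedded spaces).
akrun, it worked. This had never happened before, so I did not think about the data type. Appreciate it!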
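The two checks suggested in the comments can be sketched as follows. This is only an illustration: `nested` and `chars` are made-up stand-ins, not the asker's data, and the helper passed to rapply() is an assumption about how one might inspect the leaves.

```r
# Hypothetical nested list: one factor leaf and one character leaf.
nested <- list(list(factor("a b")), list("c d"))

# Report the first class of every leaf to spot factors hiding in the structure.
leaf_classes <- rapply(nested, function(x) class(x)[1], how = "unlist")
print(leaf_classes)

# unlist() is recursive by default, so calling it a second time changes nothing.
chars <- list(list("a b", "c d"), list("e f"))
same <- identical(unlist(unlist(chars)), unlist(chars))
print(same)
```

If any leaf is reported as "factor", converting the leaves to character before flattening is the safer route.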

Tags: r list text nlp

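【Solution 1】:

One option is to use a recursive function to convert the values from factor to character (the integer-coerced values suggest that the nested list elements are of class factor). By default, how = 'unlist' in rapply, so the converted values come back as a single vector; we then wrap that vector with list to create a single list element: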

list(rapply(list_tetragrams, as.character))
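A minimal, self-contained sketch of the same idea is shown below. `sample_tetragrams` is a hypothetical stand-in for list_tetragrams, assuming its elements are factors as the answer infers; the real data may be nested differently.

```r
# Hypothetical stand-in for list_tetragrams: a list whose elements are factors.
sample_tetragrams <- list(
  factor(c("a small stage show", "album of jazz standards")),
  factor("an album of jazz")
)

# A factor stores integer codes plus a table of labels (its levels);
# these codes are the kind of numbers that surfaced in the question's output.
codes <- as.integer(sample_tetragrams[[1]])

# rapply() applies as.character() to every non-list element; with the default
# how = "unlist" the converted pieces are combined into one character vector.
flat <- rapply(sample_tetragrams, as.character)

# Wrap the vector in list() to get a single-element list, as in the answer.
Fngram <- list(flat)
print(Fngram)
```

Converting to character before flattening means the labels, rather than the internal integer codes, are what end up in the result.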
