【Question title】: R - not all elements returned (for loop)
【Posted】: 2016-01-02 08:07:46
【Question】:

The following simple loop seems to skip elements of my data frame. I would appreciate any hints on what is wrong with the data or the code.

foo <- apply(data, 1, function(x) {

    vec <- x
    mylist <- list()

    for (i in vec){
        #print(i)
        mylist[[i]]<-i
    }
    print(length(vec))
    print(length(mylist))
})

My data frame has 25 columns. For some rows, length(vec) returns 25 while length(mylist) returns 24:

[1] 25
[1] 24

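If I uncomment the print(i) line, I can see all 25 elements in every row.

The code above is a simplification of what I actually want to run, but the problem already shows up in this simple form.

Thanks in advance!

PS. I have tried the data both as character and as factor. Neither seems to affect the problem.

PPS. Two rows of the data frame that give different results (even though they contain the same number of elements):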

 structure(list(data1.LOC = c("LL_A1_00000003068_686", "LL_A1_00000003538_274"), REF = c("G", "T"), ALT = c("C", "C"), L47.variant = c("0/1:28,34:62:99:1154,0,926", "0/0:9,0:9:21:0,21,276"), L51.variant = c("0/0:61,0:61:99:0,184,2417", "0/0:6,0:6:15:0,15,192"), LCro11.variant = c("0/0:24,0:24:72:0,72,951", "0/0:2,0:2:6:0,6,80"), LCro5.variant = c("0/0:48,0:48:99:0,141,1869", "0/0:5,0:5:15:0,15,173"), N01.variant = c("0/1:22,16:38:99:526,0,758", "1/1:0,2:2:6:63,6,0"), N09.variant = c("1/1:1,50:51:99:1885,110,0", "0/0:12,0:12:36:0,36,460"), Nor28.variant = c("1/1:0,23:23:66:874,66,0", "0/0:5,0:5:12:0,12,159"), P161.variant = c("1/1:0,54:55:99:2118,163,0", "0/0:2,0:2:6:0,6,80"), Rom155.variant = c("0/0:69,0:69:99:0,208,2749", "0/1:5,3:8:99:102,0,102"), Rom161.variant = c("0/0:75,0:75:99:0,226,2957", "0/0:5,0:5:15:0,15,196"), Rom303.variant = c("0/0:44,0:44:99:0,132,1739", "0/0:5,0:5:15:0,15,195"), Rus291.variant = c("0/1:43,30:73:99:972,0,1443", "0/1:1,3:4:28:108,0,28"), Rus292.variant = c("0/0:56,0:56:99:0,163,2139", "0/0:11,0:11:33:0,33,429"), Sl5t.variant = c("0/1:55,34:89:99:1003,0,1911", "0/0:10,0:10:30:0,30,379"), Sl6t.variant = c("0/0:89,0:89:99:0,268,3513", "0/0:10,0:10:30:0,30,383"), s037y.variant = c("0/0:63,0:63:99:0,190,2484", "0/0:8,0:8:18:0,18,236"), s087y.variant = c("0/0:72,0:72:99:0,211,2770", "0/0:6,0:6:15:0,15,179"), s2E03.variant = c("0/1:34,27:61:99:810,0,1175", "0/0:4,0:4:12:0,12,143"), s2L05.variant = c("0/0:56,0:56:99:0,169,2220", "0/1:4,4:8:95:139,0,95"), s2P01.variant = c("0/1:44,27:71:99:859,0,1519", "0/0:6,0:6:18:0,18,240"), s2R01.variant = c("1/1:0,68:68:99:2642,202,0", "0/1:5,6:11:99:202,0,130"), s2R05.variant = c("0/1:41,33:74:99:1012,0,1393", "0/0:8,0:8:24:0,24,312")), .Names = c("data1.LOC", "REF", "ALT", "L47.variant", "L51.variant", "LCro11.variant", "LCro5.variant", "N01.variant", "N09.variant", "Nor28.variant", "P161.variant", "Rom155.variant", "Rom161.variant", "Rom303.variant", "Rus291.variant", "Rus292.variant", "Sl5t.variant", 
"Sl6t.variant", "s037y.variant", "s087y.variant", "s2E03.variant", "s2L05.variant", "s2P01.variant", "s2R01.variant", "s2R05.variant"), row.names = 19:20, class = "data.frame")

【Question comments】:

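  • Generally, you want to assign to a list with something like mylist[[i]] <- i, though I am not sure that fixes the problem here. Also, while simplifying the problem is good, in this case it is also important to provide a reproducible example (with data): stackoverflow.com/a/28481250/1191259
  • Please share your data with dput(); getting the underlying structure right may matter.
  • @stasg You can use character indices too.
  • There are probably duplicates in vec. For example, with vec <- c(1,1,2), length(vec)==3 but the loop gives length(mylist)==2.
  • That confirms my hypothesis. If x is the object you provided, try y <- as.matrix(x); length(y[1,]); length(y[2,])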
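
The duplicate hypothesis from the comments can be checked by comparing, for each row, the number of values with the number of distinct values. The following is a minimal sketch, assuming a small made-up data frame `df` (hypothetical, not the asker's data) in which row 2 repeats a value and row 1 does not; the names `m`, `n_values`, `n_distinct` and `rows_with_dups` are illustrative only.

```r
# Hypothetical two-row data frame standing in for the real data:
# row 2 repeats the value "x" in two columns, row 1 has no repeats.
df <- data.frame(a = c("p", "x"), b = c("q", "y"), c = c("r", "x"),
                 stringsAsFactors = FALSE)

# apply() coerces a data frame to a matrix first, so inspect that matrix.
m <- as.matrix(df)

# Number of values vs. number of distinct values in each row.
n_values   <- apply(m, 1, length)
n_distinct <- apply(m, 1, function(row) length(unique(row)))

# Rows where the two counts differ contain duplicated values.
rows_with_dups <- which(n_distinct < n_values)
print(rows_with_dups)
```

In the two rows posted in the question, the second row appears to hold the same string "0/0:2,0:2:6:0,6,80" in both LCro11.variant and P161.variant, which would account for 24 list entries instead of 25.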

Tags: r for-loop dataframe


【Solution 1】:

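Instead of using the elements of vec to index into mylist, which overwrites the same element whenever vec contains duplicates, you should walk through vec one element at a time with a standard integer index that runs over its length, like this: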

foo <- apply(data, 1, function(x) {

    vec <- x
    mylist <- list()

    # Index by position, so duplicated values in vec each get their own slot
    for (i in seq_along(vec)) {
        #print(i)
        mylist[[i]] <- vec[i]
    }
    print(length(vec))
    print(length(mylist))
})

To sum up, your code did not work because vec probably contains duplicates. For example, with vec <- c(1,1,2), length(vec)==3, but the loop results in length(mylist)==2. (From nicola's comment, 1 hour ago.)
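
The difference between the two loops can be seen side by side on a toy input. This is only a sketch, assuming an illustrative character vector `vec` with one repeated value; the names `by_value` and `by_position` are made up for the example.

```r
# Toy vector with one repeated value (illustrative, not the asker's data).
vec <- c("a", "b", "a")

# Indexing the list by value: the element is stored under the name "a",
# so the second "a" overwrites the first entry instead of adding one.
by_value <- list()
for (v in vec) {
  by_value[[v]] <- v
}

# Indexing the list by position: every element gets its own slot.
by_position <- list()
for (i in seq_along(vec)) {
  by_position[[i]] <- vec[i]
}

print(length(vec))          # 3
print(length(by_value))     # 2
print(length(by_position))  # 3
```

If the goal is only to turn the vector into a list, as.list(vec) should give one entry per element without a loop.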

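【Comments】:

  • Thanks a lot!! This had been driving me crazy. I applied your suggestion to my large input file and it works like a charm.
  • Some text explaining the change, and why the OP's approach did not work as expected, would be nice...
  • The comment by @nicola above is the explanation.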