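[Posted]: 2016-01-02 08:07:46
[Question]:
The following simple loop seems to skip elements of my data frame. I would appreciate any hints for tracking down the problem in the data or the code.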
foo <- apply(data, 1, function(x) {
  vec <- x
  mylist <- list()
  for (i in vec) {
    #print(i)
    mylist[[i]] <- i
  }
  print(length(vec))
  print(length(mylist))
})
My data frame has 25 columns. For some rows, length(vec) returns 25 while length(mylist) returns 24.
[1] 25
[1] 24
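If I uncomment the print(i) line, I can see all 25 elements for every row.
The above is a simplification of the actual code I want to use, but the problem already shows up in this simple form.
Thanks in advance!
PS. I have tried the data both as character and as factor. Neither seems to affect the problem.
PPS. Two rows of the data frame that give different results (even though they contain the same number of elements):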
structure(list(data1.LOC = c("LL_A1_00000003068_686", "LL_A1_00000003538_274"), REF = c("G", "T"), ALT = c("C", "C"), L47.variant = c("0/1:28,34:62:99:1154,0,926", "0/0:9,0:9:21:0,21,276"), L51.variant = c("0/0:61,0:61:99:0,184,2417", "0/0:6,0:6:15:0,15,192"), LCro11.variant = c("0/0:24,0:24:72:0,72,951", "0/0:2,0:2:6:0,6,80"), LCro5.variant = c("0/0:48,0:48:99:0,141,1869", "0/0:5,0:5:15:0,15,173"), N01.variant = c("0/1:22,16:38:99:526,0,758", "1/1:0,2:2:6:63,6,0"), N09.variant = c("1/1:1,50:51:99:1885,110,0", "0/0:12,0:12:36:0,36,460"), Nor28.variant = c("1/1:0,23:23:66:874,66,0", "0/0:5,0:5:12:0,12,159"), P161.variant = c("1/1:0,54:55:99:2118,163,0", "0/0:2,0:2:6:0,6,80"), Rom155.variant = c("0/0:69,0:69:99:0,208,2749", "0/1:5,3:8:99:102,0,102"), Rom161.variant = c("0/0:75,0:75:99:0,226,2957", "0/0:5,0:5:15:0,15,196"), Rom303.variant = c("0/0:44,0:44:99:0,132,1739", "0/0:5,0:5:15:0,15,195"), Rus291.variant = c("0/1:43,30:73:99:972,0,1443", "0/1:1,3:4:28:108,0,28"), Rus292.variant = c("0/0:56,0:56:99:0,163,2139", "0/0:11,0:11:33:0,33,429"), Sl5t.variant = c("0/1:55,34:89:99:1003,0,1911", "0/0:10,0:10:30:0,30,379"), Sl6t.variant = c("0/0:89,0:89:99:0,268,3513", "0/0:10,0:10:30:0,30,383"), s037y.variant = c("0/0:63,0:63:99:0,190,2484", "0/0:8,0:8:18:0,18,236"), s087y.variant = c("0/0:72,0:72:99:0,211,2770", "0/0:6,0:6:15:0,15,179"), s2E03.variant = c("0/1:34,27:61:99:810,0,1175", "0/0:4,0:4:12:0,12,143"), s2L05.variant = c("0/0:56,0:56:99:0,169,2220", "0/1:4,4:8:95:139,0,95"), s2P01.variant = c("0/1:44,27:71:99:859,0,1519", "0/0:6,0:6:18:0,18,240"), s2R01.variant = c("1/1:0,68:68:99:2642,202,0", "0/1:5,6:11:99:202,0,130"), s2R05.variant = c("0/1:41,33:74:99:1012,0,1393", "0/0:8,0:8:24:0,24,312")), .Names = c("data1.LOC", "REF", "ALT", "L47.variant", "L51.variant", "LCro11.variant", "LCro5.variant", "N01.variant", "N09.variant", "Nor28.variant", "P161.variant", "Rom155.variant", "Rom161.variant", "Rom303.variant", "Rus291.variant", "Rus292.variant", "Sl5t.variant", 
"Sl6t.variant", "s037y.variant", "s087y.variant", "s2E03.variant", "s2L05.variant", "s2P01.variant", "s2R01.variant", "s2R05.variant"), row.names = 19:20, class = "data.frame")
[Comments]:
-
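In general, you want to assign to a list with something like mylist[[i]] <- i, though I'm not sure that will solve the problem. Also, while simplifying the problem is good, in this case it is also important to provide a reproducible example (including data): stackoverflow.com/a/28481250/1191259
-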
Please share your data with dput(); getting the underlying structure right may be important.
-
@stasg You can also use character indices.
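To illustrate the point about character indices, here is a minimal sketch (the values are made up for illustration and are not taken from the question). Indexing a list with a string via `[[` creates or overwrites the element with that name, so assigning under the same string twice leaves only one element.

```r
# Hypothetical values, chosen only to illustrate character indexing.
mylist <- list()
mylist[["G"]] <- "G"   # creates an element named "G"
mylist[["C"]] <- "C"   # creates an element named "C"
mylist[["G"]] <- "G"   # same name again: overwrites, does not append

names(mylist)    # the names are the strings used as indices
length(mylist)   # 2, although three assignments were made
```

This is presumably relevant here because apply() passes each row to the function as a character vector, so every `i` in the loop is a string rather than a position.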
-
There may be duplicates in vec. For example, with vec<-c(1,1,2), length(vec)==3 but the loop results in length(mylist)==2.
-
That confirms my hypothesis. If x is the object you provided, try y<-as.matrix(x);length(y[1,]);length(y[2,]).
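As a sketch of that check: length() alone is 25 for both posted rows, so the informative quantity is the number of distinct values. The example below uses a shortened subset of the second posted row (five of its 25 columns, picked for illustration) and then contrasts indexing by value with indexing by position. The names `row2`, `by_value` and `by_position` are hypothetical.

```r
# A shortened subset of the second posted row (illustrative; the real row has 25 columns).
row2 <- c(REF = "T", ALT = "C",
          LCro11.variant = "0/0:2,0:2:6:0,6,80",
          LCro5.variant  = "0/0:5,0:5:15:0,15,173",
          P161.variant   = "0/0:2,0:2:6:0,6,80")

length(row2)            # number of elements: 5
length(unique(row2))    # number of distinct values: 4
row2[duplicated(row2)]  # the value that repeats (here in P161.variant)

# Looping over the values and indexing by value collapses duplicates...
by_value <- list()
for (v in row2) by_value[[v]] <- v

# ...whereas looping over positions keeps one element per column.
by_position <- vector("list", length(row2))
for (i in seq_along(row2)) by_position[[i]] <- row2[[i]]

length(by_value)     # 4
length(by_position)  # 5
```

If one list element per column is what is wanted, iterating with seq_along() as in `by_position` is one way to avoid the collapse; whether that fits the real code depends on what the list is used for afterwards.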