【Question title】: Creating a data frame from subsets and excluding data
【Posted】: 2017-12-24 02:40:42
【Question】:

These are the head() of two of my data frames (I have several, each with a different EXPANSION, i.e. bone):

                        CEMETERY CONTEXT    SEX EXPANSION VALUE
613     Medieval-St. Mary Graces    7172 FEMALE    HuL1 L   285
681     Medieval-St. Mary Graces    7223   MALE    HuL1 L   310
860     Medieval-St. Mary Graces    7314   MALE    HuL1 L   357
1301    Medieval-St. Mary Graces    8102   MALE    HuL1 L   323
1441    Medieval-St. Mary Graces    8117 FEMALE    HuL1 L   316
1575    Medieval-St. Mary Graces    8207   MALE    HuL1 L   326
1655    Medieval-St. Mary Graces    8268 FEMALE    HuL1 L   292
1902    Medieval-St. Mary Graces    9362 FEMALE    HuL1 L   283
1932    Medieval-St. Mary Graces    9373   MALE    HuL1 L   316
2368    Medieval-St. Mary Graces    9813   MALE    HuL1 L   320
2947    Medieval-St. Mary Graces   10145   MALE    HuL1 L   320
3033    Medieval-St. Mary Graces   10218   MALE    HuL1 L   320
3062    Medieval-St. Mary Graces   10241   MALE    HuL1 L   341
3159    Medieval-St. Mary Graces   10420   MALE    HuL1 L   327
3294    Medieval-St. Mary Graces   11005   MALE    HuL1 L   304
3471    Medieval-St. Mary Graces   11090 FEMALE    HuL1 L   309
3723    Medieval-St. Mary Graces   11494   MALE    HuL1 L   324
4128    Medieval-St. Mary Graces   12356   MALE    HuL1 L   319
4206    Medieval-St. Mary Graces   12414   MALE    HuL1 L   323
4344    Medieval-St. Mary Graces   12493   MALE    HuL1 L   325
4421    Medieval-St. Mary Graces   12520   MALE    HuL1 L   325
4470    Medieval-St. Mary Graces   12525   MALE    HuL1 L   347
4837    Medieval-St. Mary Graces   12761   MALE    HuL1 L   322
4948    Medieval-St. Mary Graces   12785   MALE    HuL1 L   335
5072    Medieval-St. Mary Graces   13530   MALE    HuL1 L   341
5317    Medieval-St. Mary Graces   13747   MALE    HuL1 L   337
5840      Medieval-Spital Square      19 FEMALE    HuL1 L   326
5927      Medieval-Spital Square      22   MALE    HuL1 L   330
6044      Medieval-Spital Square      31   MALE    HuL1 L   328
6177      Medieval-Spital Square      95   MALE    HuL1 L   316
6336      Medieval-Spital Square     298   MALE    HuL1 L   347
6725      Medieval-Spital Square     349 FEMALE    HuL1 L   310
6827      Medieval-Spital Square     358   MALE    HuL1 L   336
6959      Medieval-Spital Square     383 FEMALE    HuL1 L   319
7105      Medieval-Spital Square     391   MALE    HuL1 L   352
7167      Medieval-Spital Square     394   MALE    HuL1 L   317
7322      Medieval-Spital Square     430   MALE    HuL1 L   318
7765 Medieval-St. Benet sherehog    1511 FEMALE    HuL1 L   296
7808 Medieval-St. Benet sherehog    1566   MALE    HuL1 L   314

                        CEMETERY CONTEXT    SEX EXPANSION VALUE
166     Medieval-St. Mary Graces    6225   MALE    HuL1 R   346
345     Medieval-St. Mary Graces    6351   MALE    HuL1 R   330
612     Medieval-St. Mary Graces    7172 FEMALE    HuL1 R   286
660     Medieval-St. Mary Graces    7202   MALE    HuL1 R   340
1214    Medieval-St. Mary Graces    8016   MALE    HuL1 R   334
1348    Medieval-St. Mary Graces    8111 FEMALE    HuL1 R   308
1440    Medieval-St. Mary Graces    8117 FEMALE    HuL1 R   320
1574    Medieval-St. Mary Graces    8207   MALE    HuL1 R   326
2205    Medieval-St. Mary Graces    9543   MALE    HuL1 R   326
2508    Medieval-St. Mary Graces    9901   MALE    HuL1 R   354
2731    Medieval-St. Mary Graces    9987   MALE    HuL1 R   324
2778    Medieval-St. Mary Graces   10058   MALE    HuL1 R   345
2832    Medieval-St. Mary Graces   10070   MALE    HuL1 R   360
3032    Medieval-St. Mary Graces   10218   MALE    HuL1 R   325
3061    Medieval-St. Mary Graces   10241   MALE    HuL1 R   341
3236    Medieval-St. Mary Graces   10801   MALE    HuL1 R   344
3470    Medieval-St. Mary Graces   11090 FEMALE    HuL1 R   312
3655    Medieval-St. Mary Graces   11475   MALE    HuL1 R   339
3722    Medieval-St. Mary Graces   11494   MALE    HuL1 R   334
4205    Medieval-St. Mary Graces   12414   MALE    HuL1 R   327
4298    Medieval-St. Mary Graces   12480   MALE    HuL1 R   318
4343    Medieval-St. Mary Graces   12493   MALE    HuL1 R   325
4420    Medieval-St. Mary Graces   12520   MALE    HuL1 R   331
4469    Medieval-St. Mary Graces   12525   MALE    HuL1 R   342
4947    Medieval-St. Mary Graces   12785   MALE    HuL1 R   338
5244    Medieval-St. Mary Graces   13678   MALE    HuL1 R   342
5288    Medieval-St. Mary Graces   13724 FEMALE    HuL1 R   319
5316    Medieval-St. Mary Graces   13747   MALE    HuL1 R   340
5374    Medieval-St. Mary Graces   13825   MALE    HuL1 R   349
5839      Medieval-Spital Square      19 FEMALE    HuL1 R   332
5926      Medieval-Spital Square      22   MALE    HuL1 R   338
6043      Medieval-Spital Square      31   MALE    HuL1 R   328
6176      Medieval-Spital Square      95   MALE    HuL1 R   316
6245      Medieval-Spital Square     269   MALE    HuL1 R   339
6288      Medieval-Spital Square     287 FEMALE    HuL1 R   282
6335      Medieval-Spital Square     298   MALE    HuL1 R   352
6410      Medieval-Spital Square     309   MALE    HuL1 R   332
6724      Medieval-Spital Square     349 FEMALE    HuL1 R   313
6826      Medieval-Spital Square     358   MALE    HuL1 R   340
6958      Medieval-Spital Square     383 FEMALE    HuL1 R   322
7104      Medieval-Spital Square     391   MALE    HuL1 R   355
7166      Medieval-Spital Square     394   MALE    HuL1 R   322
7321      Medieval-Spital Square     430   MALE    HuL1 R   325
7404      Medieval-Spital Square     472   MALE    HuL1 R   346
7502 Medieval-St. Benet sherehog      67   MALE    HuL1 R   339

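I need to exclude any CONTEXT (specimen) that does not have measurements for both the left (L) and right (R) side of the bone. I have made subsets of CONTEXT for these data frames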

HuL1L.id=HuL1L$CONTEXT
HuL1R.id=HuL1R$CONTEXT

and intended to use the %in% operator to find out which individuals in one vector are also in the other

HuL1L.id%in%HuL1R.id

[1]  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE
[11] FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE
[21]  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
[31]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE

but I am not sure what to do beyond that, such as how to actually create a data frame from this data that looks like this:

                         CEMETERY CONTEXT    SEX EXPANSION VALUE
613     Medieval-St. Mary Graces    7172 FEMALE    HuL1 L   285
612     Medieval-St. Mary Graces    7172 FEMALE    HuL1 R   286
1441    Medieval-St. Mary Graces    8117 FEMALE    HuL1 L   316
1440    Medieval-St. Mary Graces    8117 FEMALE    HuL1 R   320
1575    Medieval-St. Mary Graces    8207   MALE    HuL1 L   326
1574    Medieval-St. Mary Graces    8207   MALE    HuL1 R   326

and then repeat this for my other bones, and finally combine all of these data frames.

Edit:

Using:

HuL1R <- HuL1R %>% filter(CONTEXT %in% Hul1L$CONTEXT)
HuL1L <- HuL1L %>% filter(CONTEXT %in% Hul1R$CONTEXT)
Full_HuL <- bind_rows(HuL1R, HuL1L) %>% arrange(CONTEXT, EXPANSION)

still gives me CONTEXTs that have only HuL1 L or only HuL1 R

                      CEMETERY CONTEXT    SEX EXPANSION VALUE
1       Medieval-Spital Square      19 FEMALE    HuL1 L   326
2       Medieval-Spital Square      19 FEMALE    HuL1 R   332
3       Medieval-Spital Square      22   MALE    HuL1 L   330
4       Medieval-Spital Square      22   MALE    HuL1 R   338
5       Medieval-Spital Square      31   MALE    HuL1 L   328
6       Medieval-Spital Square      31   MALE    HuL1 R   328
7  Medieval-St. Benet sherehog      67   MALE    HuL1 R   339
8       Medieval-Spital Square      95   MALE    HuL1 L   316
9       Medieval-Spital Square      95   MALE    HuL1 R   316
10      Medieval-Spital Square     269   MALE    HuL1 R   339
11      Medieval-Spital Square     287 FEMALE    HuL1 R   282

【Comments】:

    Tags: r dataframe boolean-operations


    【Answer 1】:

    You can rbind your data frames and use the duplicated function in base R

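     # Stack the left and right data frames
     dat3=rbind(dat1,dat2)
     # Keep rows whose CONTEXT occurs more than once, i.e. in both data frames
     dat4=dat3[duplicated(dat3$CONTEXT,fromLast = TRUE)|duplicated(dat3$CONTEXT),]
     dat4[order(dat4$CONTEXT),]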
                             CEMETERY CONTEXT    SEX EXPANSION VALUE
     1  613  Medieval-St. Mary Graces    7172 FEMALE    HuL1 L   285
     9  612  Medieval-St. Mary Graces    7172 FEMALE    HuL1 R   286
     5  1441 Medieval-St. Mary Graces    8117 FEMALE    HuL1 L   316
     13 1440 Medieval-St. Mary Graces    8117 FEMALE    HuL1 R   320
     6  1575 Medieval-St. Mary Graces    8207   MALE    HuL1 L   326
     14 1574 Medieval-St. Mary Graces    8207   MALE    HuL1 R   326
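
    To see why the two duplicated() calls are combined, here is a minimal sketch on a made-up vector of CONTEXT ids (the vector `ids` and the other names are illustrative, not taken from the question). duplicated(x) flags the second and later occurrences of a value, while fromLast = TRUE flags every occurrence except the last, so their union marks every id that occurs more than once.

```r
# Hypothetical ids: 7172 and 8117 each occur twice, 7223 only once
ids <- c(7172L, 7223L, 8117L, 7172L, 8117L)

later   <- duplicated(ids)                   # flags repeats after the first occurrence
earlier <- duplicated(ids, fromLast = TRUE)  # flags repeats before the last occurrence
paired  <- later | earlier                   # TRUE for every id seen more than once

ids[paired]
```

    This relies on each CONTEXT appearing at most once within each data frame. If a specimen were recorded twice on the same side, it would be flagged as well, so that assumption is worth checking first.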
    

    Using pipes (%>% comes from magrittr, which dplyr also loads):

     rbind(dat1,dat2)%>%{.[duplicated(.$CONTEXT)|duplicated(.$CONTEXT,fromLast = TRUE),]}%>%{.[order(.$CONTEXT),]}
    

    Data used:

     dat1=structure(list(CEMETERY = c("613  Medieval-St. Mary Graces", 
     "681  Medieval-St. Mary Graces", "860  Medieval-St. Mary Graces", 
     "1301 Medieval-St. Mary Graces", "1441 Medieval-St. Mary Graces", 
     "1575 Medieval-St. Mary Graces"), CONTEXT = c(7172L, 7223L, 7314L, 
      8102L, 8117L, 8207L), SEX = c("FEMALE", "MALE", "MALE", "MALE", 
     "FEMALE", "MALE"), EXPANSION = c("HuL1 L", "HuL1 L", "HuL1 L", 
     "HuL1 L", "HuL1 L", "HuL1 L"), VALUE = c(285L, 310L, 357L, 323L, 
     316L, 326L)), .Names = c("CEMETERY", "CONTEXT", "SEX", "EXPANSION", 
     "VALUE"), class = "data.frame", row.names = c(NA, -6L))
    
     dat2=structure(list(CEMETERY = c("166  Medieval-St. Mary Graces", 
     "345  Medieval-St. Mary Graces", "612  Medieval-St. Mary Graces", 
     "660  Medieval-St. Mary Graces", "1214 Medieval-St. Mary Graces", 
     "1348 Medieval-St. Mary Graces", "1440 Medieval-St. Mary Graces", 
     "1574 Medieval-St. Mary Graces"), CONTEXT = c(6225L, 6351L, 7172L, 
     7202L, 8016L, 8111L, 8117L, 8207L), SEX = c("MALE", "MALE", "FEMALE", 
     "MALE", "MALE", "FEMALE", "FEMALE", "MALE"), EXPANSION = c("HuL1 R", 
     "HuL1 R", "HuL1 R", "HuL1 R", "HuL1 R", "HuL1 R", "HuL1 R", "HuL1 R"
     ), VALUE = c(346L, 330L, 286L, 340L, 334L, 308L, 320L, 326L)), .Names = c("CEMETERY", 
     "CONTEXT", "SEX", "EXPANSION", "VALUE"), class = "data.frame", row.names = c(NA, 
     -8L))
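
    The %in% idea from the question can also be carried through directly. The sketch below rebuilds a cut-down version of the two data frames (only CONTEXT, EXPANSION and VALUE, with values copied from the data above; the names `left`, `right`, `common` and `paired` are placeholders, not from the original post) and keeps only the rows whose CONTEXT occurs on both sides.

```r
# Cut-down copies of the left and right data frames (values from the data above)
left <- data.frame(
  CONTEXT   = c(7172L, 7223L, 7314L, 8102L, 8117L, 8207L),
  EXPANSION = "HuL1 L",
  VALUE     = c(285L, 310L, 357L, 323L, 316L, 326L)
)
right <- data.frame(
  CONTEXT   = c(6225L, 6351L, 7172L, 7202L, 8016L, 8111L, 8117L, 8207L),
  EXPANSION = "HuL1 R",
  VALUE     = c(346L, 330L, 286L, 340L, 334L, 308L, 320L, 326L)
)

# Specimens measured on both sides
common <- intersect(left$CONTEXT, right$CONTEXT)

# Keep only those specimens from each side, then stack and sort
paired <- rbind(left[left$CONTEXT %in% common, ],
                right[right$CONTEXT %in% common, ])
paired <- paired[order(paired$CONTEXT, paired$EXPANSION), ]
paired
```

    Computing `common` once and filtering both data frames against it avoids overwriting one input before the other has been filtered.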
    

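    【Comments】:

    • I know you're not supposed to comment just to say thanks, but I have to say thank you so much. I have been struggling with this for days and it finally works! Do you also know how I could create one new column containing only the right-side measurements and another containing only the left-side measurements?
    • Sure, just reshape your data!
    • If this did answer your question, please accept it to close the question. You can also upvote it.
    • You can use tidyr::spread(data, key=EXPANSION, value=VALUE) to reshape your data, or reshape2::dcast(data,CONTEXT~EXPANSION), among other functions; you can even use reshape() in base R
    【Answer 2】: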
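
    As a rough illustration of the base R route mentioned above, the sketch below reshapes a small, already paired data frame to wide format with reshape(). The data frame `paired` is a made-up stand-in built from values in the question, and the resulting column names are whatever reshape() generates by default from v.names and the EXPANSION values.

```r
# Hypothetical paired data: two specimens, each with a left and a right value
paired <- data.frame(
  CONTEXT   = c(7172L, 7172L, 8117L, 8117L),
  SEX       = c("FEMALE", "FEMALE", "FEMALE", "FEMALE"),
  EXPANSION = c("HuL1 L", "HuL1 R", "HuL1 L", "HuL1 R"),
  VALUE     = c(285L, 286L, 316L, 320L)
)

# One row per specimen, one VALUE column per side
wide <- reshape(paired,
                idvar     = c("CONTEXT", "SEX"),
                timevar   = "EXPANSION",
                v.names   = "VALUE",
                direction = "wide")
wide
```

    Columns that differ between the left and right rows of the same specimen would need to be dropped or added to the reshape call first; otherwise the two sides may not collapse into a single row.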

    One thing you can try is group_by and filter from dplyr

    library(dplyr)
    library(tidyr)
    
    Full <- bind_rows(HuL1L, HuL1R) %>%
        group_by(CONTEXT) %>%
        filter(any(EXPANSION == "HuL1 L"),
               any(EXPANSION == "HuL1 R")) %>%
        arrange(CONTEXT, EXPANSION) %>%
        spread(EXPANSION, VALUE)
    

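    If you want to reshape, you can use library(tidyr)

    【Comments】:

    • I tried all 3 of these, but my data frame still contains specimens with only 1 bone measurement, and I don't know why
    • Could you run dput(head(df, 50)) on both data frames (with each data frame in place of df) and post the output so that I can work with it? I need some reproducible code to see what your error is.
    • I have now put in my whole data frame
    • I just tried my first answer and it worked, so I'm not sure why it doesn't work for you. Please try my last answer using group_by, or post more details on why it isn't working or whether there is something else you are doing. My first answer does produce your expected output as well.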
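
    If the result still seems to contain unpaired specimens, one way to check without any extra packages is to count the distinct sides per CONTEXT. The following is a rough base R sketch, not part of the original answer; the data frame `both` is a small made-up stand-in for the bound L and R data (values borrowed from the question), and the other names are placeholders.

```r
# Hypothetical stacked data: 19 and 22 have both sides, 67 and 269 only the right
both <- data.frame(
  CONTEXT   = c(19L, 19L, 22L, 22L, 67L, 269L),
  EXPANSION = c("HuL1 L", "HuL1 R", "HuL1 L", "HuL1 R", "HuL1 R", "HuL1 R"),
  VALUE     = c(326L, 332L, 330L, 338L, 339L, 339L)
)

# Cross-tabulate specimens against sides; a cell > 0 means that side was measured
side_counts <- table(both$CONTEXT, both$EXPANSION)
has_both    <- rowSums(side_counts > 0) == 2
keep_ids    <- as.integer(names(has_both)[has_both])

# Keep only specimens with both a left and a right measurement
paired <- both[both$CONTEXT %in% keep_ids, ]
paired
```

    Printing `side_counts` for the real data also shows at a glance which specimens are missing a side, which may help track down why a filter did not behave as expected.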