【Question title】: Conserve Unique Rows in R
【Posted】: 2020-05-27 08:06:21
【Question description】:

I want to keep the rows of a data frame that have the same value in two given columns. For example:

df <- data.frame(
  BGC1 = c("BGC1", "BGC1", "BGC1", "BGC2", "BGC2", "BGC2", "BGC3", "BGC3", "BGC3", "BGC4", "BGC4", "BGC4"),
  BGC2 = c("BGC2", "BGC3", "BGC4", "BGC1", "BGC3", "BGC4", "BGC1", "BGC2", "BGC4", "BGC1", "BGC2", "BGC3"),
  Family1 = c("Strepto_10", "Strepto_20", "Strepto_30", "Strepto_20", "Strepto_20", "Strepto_50", "Strepto_20", "Strepto_30", "Strepto_30", "Strepto_30", "Strepto_50", "Strepto_40"),
  Family2 = c("Strepto_10", "Strepto_10", "Strepto_10", "Strepto_20", "Strepto_20", "Strepto_20", "Strepto_30", "Strepto_30", "Strepto_30", "Strepto_40", "Strepto_40", "Strepto_40")
)

Example DF

BGC1  | BGC2  | Family1      |   Family2
BGC1    BGC2    Strepto_10       Strepto_10
BGC1    BGC3    Strepto_20       Strepto_10
BGC1    BGC4    Strepto_30       Strepto_10
BGC2    BGC1    Strepto_20       Strepto_20
BGC2    BGC3    Strepto_20       Strepto_20
BGC2    BGC4    Strepto_50       Strepto_20
BGC3    BGC1    Strepto_20       Strepto_30
BGC3    BGC2    Strepto_30       Strepto_30
BGC3    BGC4    Strepto_30       Strepto_30
BGC4    BGC1    Strepto_30       Strepto_40
BGC4    BGC2    Strepto_50       Strepto_40
BGC4    BGC3    Strepto_40       Strepto_40

For example, I want to keep the rows where Family1 and Family2 are the same.

Expected output

BGC1  | BGC2  | Family1      |   Family2
BGC1    BGC2    Strepto_10       Strepto_10
BGC2    BGC1    Strepto_20       Strepto_20
BGC2    BGC3    Strepto_20       Strepto_20
BGC3    BGC2    Strepto_30       Strepto_30
BGC3    BGC4    Strepto_30       Strepto_30
BGC4    BGC3    Strepto_40       Strepto_40

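【Question comments】:

  • Please do not post images of code/data/errors: they cannot be copied or searched (SEO), they break screen readers, and they may not fit well on some mobile devices. Ref: meta.stackoverflow.com/a/285557 (and xkcd.com/2116). Please include the code, console output, or data directly (e.g., dput(head(x)) or data.frame(...)).
  • mydata[ mydata$col0 %in% c(dat1$col1, dat2$col2),]
  • I really don't know: your image is blurry, it's not R, and ... it's an image. I'm not going to spend time trying to transcribe an image into the actual data on your computer.
  • Here are some good hints on what to include in a question to make it reproducible and easy for others to "play" with: stackoverflow.com/q/5963269, minimal reproducible example, and stackoverflow.com/tags/r/info
  • You don't have to delete the question; just remove the image and add usable data.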

Tags: r dataframe merge dplyr


【Solution 1】:

You can subset with [ using the logical condition df$Family1 == df$Family2, which keeps only the rows where the two columns match:

df[df$Family1 == df$Family2,]
#   BGC1 BGC2    Family1    Family2
#1  BGC1 BGC2 Strepto_10 Strepto_10
#4  BGC2 BGC1 Strepto_20 Strepto_20
#5  BGC2 BGC3 Strepto_20 Strepto_20
#8  BGC3 BGC2 Strepto_30 Strepto_30
#9  BGC3 BGC4 Strepto_30 Strepto_30
#12 BGC4 BGC3 Strepto_40 Strepto_40
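
Two caveats may be worth keeping in mind with the [ approach; the following is a minimal sketch, not part of the original answer. On R versions before 4.0.0, data.frame() turns strings into factors by default, and comparing two factors with different level sets using == stops with an error, so converting with as.character() first is a safer habit. Also, if either column contains NA, the comparison yields NA and [ returns a row filled with NA; wrapping the condition in which() drops those rows. The small data frame below is made up purely for illustration.

```r
# Illustrative data only: factors with different level sets, plus one NA.
df_demo <- data.frame(
  Family1 = c("Strepto_10", "Strepto_20", NA),
  Family2 = c("Strepto_10", "Strepto_10", "Strepto_30"),
  stringsAsFactors = TRUE  # mimics the pre-4.0.0 default
)

# Compare as character so differing factor levels do not matter.
same <- as.character(df_demo$Family1) == as.character(df_demo$Family2)

# which() keeps only the TRUE positions, so NA comparisons are dropped
# instead of producing an all-NA row.
res_demo <- df_demo[which(same), ]
```
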

【Comments】:

    【Solution 2】:

    You can use subset to keep the rows where Family1 and Family2 are equal.

    subset(df, Family1 == Family2)
    
    #   BGC1 BGC2    Family1    Family2
    #1  BGC1 BGC2 Strepto_10 Strepto_10
    #4  BGC2 BGC1 Strepto_20 Strepto_20
    #5  BGC2 BGC3 Strepto_20 Strepto_20
    #8  BGC3 BGC2 Strepto_30 Strepto_30
    #9  BGC3 BGC4 Strepto_30 Strepto_30
    #12 BGC4 BGC3 Strepto_40 Strepto_40
    

    Or use filter from dplyr:

    dplyr::filter(df, Family1 == Family2)
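
    Since the question asks about "two given columns", the same idea can be wrapped in a small base R helper that takes the column names as strings. This is only a sketch: keep_matching_rows is a hypothetical name, not a function from base R or dplyr, and it assumes the df defined in the question.

```r
# Data frame from the question.
df <- data.frame(
  BGC1 = c("BGC1", "BGC1", "BGC1", "BGC2", "BGC2", "BGC2", "BGC3", "BGC3", "BGC3", "BGC4", "BGC4", "BGC4"),
  BGC2 = c("BGC2", "BGC3", "BGC4", "BGC1", "BGC3", "BGC4", "BGC1", "BGC2", "BGC4", "BGC1", "BGC2", "BGC3"),
  Family1 = c("Strepto_10", "Strepto_20", "Strepto_30", "Strepto_20", "Strepto_20", "Strepto_50",
              "Strepto_20", "Strepto_30", "Strepto_30", "Strepto_30", "Strepto_50", "Strepto_40"),
  Family2 = c("Strepto_10", "Strepto_10", "Strepto_10", "Strepto_20", "Strepto_20", "Strepto_20",
              "Strepto_30", "Strepto_30", "Strepto_30", "Strepto_40", "Strepto_40", "Strepto_40")
)

# Hypothetical helper: keep rows where two named columns hold the same value.
keep_matching_rows <- function(data, col_a, col_b) {
  # Fail early if either column name is missing.
  stopifnot(col_a %in% names(data), col_b %in% names(data))
  # Compare as character so the helper also works on factor columns.
  same <- as.character(data[[col_a]]) == as.character(data[[col_b]])
  # which() drops rows where the comparison is NA.
  data[which(same), , drop = FALSE]
}

result <- keep_matching_rows(df, "Family1", "Family2")
```

    Calling keep_matching_rows(df, "Family1", "Family2") should return the same six rows as the subset and filter calls above; unlike filter, it preserves the original row names.
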
    

    【Comments】:
