【Question title】:Sorting a dataframe by a regular expression in its names
【Posted】:2020-07-16 12:12:59
【Question】:

In the following data.frame:

df <- data.frame(matrix(1,6,6))

names(df) <- rownames (df) <- c("ABC.1cm", "ABC.2cm", "ABC.3cm", "DEF.1cm", "DEF.2cm", "DEF.3cm" )

How can I rearrange the columns and rows so that the "1cm", "2cm", and "3cm" entries are grouped together?

Desired output:

names(df) <- rownames (df) <- c("ABC.1cm", "DEF.1cm","ABC.2cm","DEF.2cm", "ABC.3cm", "DEF.3cm" )
df
        ABC.1cm DEF.1cm ABC.2cm DEF.2cm ABC.3cm DEF.3cm
ABC.1cm       1       1       1       1       1       1
DEF.1cm       1       1       1       1       1       1
ABC.2cm       1       1       1       1       1       1
DEF.2cm       1       1       1       1       1       1
ABC.3cm       1       1       1       1       1       1
DEF.3cm       1       1       1       1       1       1

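Note: the "._cm" suffix is always present, but the prefixes vary. There are also more than three "cm" values (from 1cm to 29cm, so the number of digits can vary), and each value occurs in triplicate rather than in duplicate.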

【Comments】:

    Tags: r regex dataframe


    【Solution 1】:

    Sort by whatever comes after the last "." in each name.

    correct_ord <- names(df)[order(sub(".+\\.", "", names(df)))]
    
    df[correct_ord,correct_ord]
    
            ABC.1cm DEF.1cm ABC.2cm DEF.2cm ABC.3cm DEF.3cm
    ABC.1cm       1       1       1       1       1       1
    DEF.1cm       1       1       1       1       1       1
    ABC.2cm       1       1       1       1       1       1
    DEF.2cm       1       1       1       1       1       1
    ABC.3cm       1       1       1       1       1       1
    DEF.3cm       1       1       1       1       1       1
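
The ordering above compares the suffixes as text, which is fine for 1cm to 3cm. For the full range mentioned in the question (1cm to 29cm, in triplicate), "10cm" would sort before "2cm", so a numeric key may be safer. A minimal sketch, assuming hypothetical prefixes ABC, DEF and GHI and that every name ends in ".<number>cm":

```r
# Hypothetical larger example: three prefixes, depths 1cm to 29cm
prefixes <- c("ABC", "DEF", "GHI")
nms <- paste0(rep(prefixes, each = 29), ".", 1:29, "cm")
big <- data.frame(matrix(1, length(nms), length(nms)))
names(big) <- rownames(big) <- nms

# Take the digits between the last "." and the trailing "cm" as numbers
depth <- as.numeric(sub(".*\\.(\\d+)cm$", "\\1", nms))

# Order by depth first, then by full name so prefixes stay in a fixed order
ord <- order(depth, nms)
sorted <- big[ord, ord]

head(names(sorted), 6)
```

Using the full name as a second key to order is only one way to break ties; it keeps the prefixes alphabetical within each depth.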
    

    【Comments】:

      【Solution 2】:

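      You can match the number in names, strip the rest of each name, recall the number through the backreference \\1, and then order names accordingly. The pattern is anchored on the last "." and the "cm" suffix so that the group captures the whole number, and as.numeric makes the ordering numeric rather than alphabetical:

      names(df)[order(as.numeric(sub(".*\\.(\\d+)cm$", "\\1", names(df))))]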
      [1] "ABC.1cm" "DEF.1cm" "ABC.2cm" "DEF.2cm" "ABC.3cm" "DEF.3cm"
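
One thing to watch for when capturing the number with sub: a greedy ".*" placed directly before the capture group consumes as many characters as it can, so with multi-digit values only the last digit reaches the group. The sketch below uses hypothetical names to compare a greedy pattern with an anchored one, and text ordering with numeric ordering:

```r
# Hypothetical names with one- and two-digit values
x <- c("ABC.2cm", "ABC.12cm", "DEF.29cm")

# Greedy ".*" swallows every digit except the last one
greedy <- sub(".*(\\d+).*", "\\1", x)

# Anchoring on the "." and the "cm" suffix captures the whole number
anchored <- sub(".*\\.(\\d+)cm$", "\\1", x)

# The captured values are character strings, so order() compares them as text
text_order <- order(anchored)

# Converting to numeric compares them as numbers instead
num_order <- order(as.numeric(anchored))

print(greedy)
print(anchored)
print(text_order)
print(num_order)
```

For the six names in the question both patterns and both orderings agree, because every value has a single digit; the difference only appears once values such as 10cm or 29cm are present.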
      

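      Alternatively, you can use str_extract, converting the extracted digits to numbers so that multi-digit values order correctly:

      library(stringr)
      names(df)[order(as.numeric(str_extract(names(df), "\\d+")))]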
      

      【Comments】:
