【Question title】: Sort a vector by a substring [duplicate]
【Posted】: 2021-03-15 21:52:56
【Question】:

Hello, I have a list of files:

    "1_EX-P1-H2.3000"    "10_EX-P1-H2.3002"   "100_EX-P1-H2.3074" 
    "1004_EX-P1-H2.4059" "1006_EX-P1-H2.4070" "2_EX-P1-H2.3000" "3_EX-P1-H2.3000" "4_EX-P1-H2.3000" 
    "5_EX-P1-H2.3001"

I want to sort them not lexicographically but by the first number before the "_"; these numbers range from 1 to 1000. So the result should be:

    "1_EX-P1-H2.3000"    "2_EX-P1-H2.3000" "3_EX-P1-H2.3000" "4_EX-P1-H2.3000" 
    "5_EX-P1-H2.3001"    "10_EX-P1-H2.3002"   "100_EX-P1-H2.3074" 
    "1004_EX-P1-H2.4059" "1006_EX-P1-H2.4070" 

【Comments】:

Tags: r arrays string sorting substring


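【Solution 1】:

Since the OP wants to order by the first number before the _, we can use parse_number from readr to extract the leading numeric substring, pass it to order, and use the resulting index to rearrange the vector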

v1[order(readr::parse_number(v1))]
#[1] "1_EX-P1-H2.3000"    "2_EX-P1-H2.3000"    "3_EX-P1-H2.3000"    "4_EX-P1-H2.3000"    "5_EX-P1-H2.3001"    "10_EX-P1-H2.3002"  
#[7] "100_EX-P1-H2.3074"  "1004_EX-P1-H2.4059" "1006_EX-P1-H2.4070"

Or use sub to remove everything from the first _ onward, convert what remains to numeric, and order by that

v1[order(as.numeric(sub("_.*", "", v1)))]
#[1] "1_EX-P1-H2.3000"    "2_EX-P1-H2.3000"    "3_EX-P1-H2.3000"    "4_EX-P1-H2.3000"    "5_EX-P1-H2.3001"    "10_EX-P1-H2.3002"  
#[7] "100_EX-P1-H2.3074"  "1004_EX-P1-H2.4059" "1006_EX-P1-H2.4070"
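
To make the mechanics of the sub approach easier to follow, here is a minimal sketch that splits the one-liner into its intermediate steps. The names key, idx and sorted are only illustrative and are not part of the original answer.

```r
v1 <- c("1_EX-P1-H2.3000", "10_EX-P1-H2.3002", "100_EX-P1-H2.3074",
        "1004_EX-P1-H2.4059", "1006_EX-P1-H2.4070", "2_EX-P1-H2.3000",
        "3_EX-P1-H2.3000", "4_EX-P1-H2.3000", "5_EX-P1-H2.3001")

# Sort key: drop everything from the first "_" onward, then convert to numeric
key <- as.numeric(sub("_.*", "", v1))
key
#[1]    1   10  100 1004 1006    2    3    4    5

# order() returns the index permutation that puts the key in ascending order
idx <- order(key)
idx
#[1] 1 6 7 8 9 2 3 4 5

# Indexing the original vector with that permutation gives the sorted files
sorted <- v1[idx]
```

Because the key is numeric, 10 sorts after 5 rather than right after 1, which is what a plain sort(v1) on the character strings would not do.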

Another option is mixedsort from gtools

gtools::mixedsort(v1)

Output

#[1] "1_EX-P1-H2.3000"    "2_EX-P1-H2.3000"    "3_EX-P1-H2.3000"    "4_EX-P1-H2.3000"    "5_EX-P1-H2.3001"    "10_EX-P1-H2.3002"  
#[7] "100_EX-P1-H2.3074"  "1004_EX-P1-H2.4059" "1006_EX-P1-H2.4070"

Data

v1 <- c("1_EX-P1-H2.3000", "10_EX-P1-H2.3002", "100_EX-P1-H2.3074", 
"1004_EX-P1-H2.4059", "1006_EX-P1-H2.4070", "2_EX-P1-H2.3000", 
"3_EX-P1-H2.3000", "4_EX-P1-H2.3000", "5_EX-P1-H2.3001")

【Comments】:
