【Question title】: Summarising with data.table and conserving factor order
【Posted】: 2018-08-03 20:06:11
【Question】:

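I am using data.table to prepare tables for printing.

I often use factors to get the ordering I want, but I am not sure whether I am doing something wrong with data.table.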

library(data.table)
DT <- as.data.table(iris)
DT[, Species := relevel(Species, ref = "virginica")]

# Factor levels ordered as I want them
DT[, levels(Species)]
#> [1] "virginica"  "setosa"     "versicolor"

# table() and dplyr both base their order on the factor levels
table(DT[, Species])
#> 
#>  virginica     setosa versicolor 
#>         50         50         50
suppressMessages(library(dplyr))
count(DT, Species)
#> # A tibble: 3 x 2
#>   Species    `n()`
#>   <fct>      <int>
#> 1 virginica     50
#> 2 setosa        50
#> 3 versicolor    50

# data.table aggregation just cares about order of appearance?
DT[, .N, Species]
#>       Species  N
#> 1:     setosa 50
#> 2: versicolor 50
#> 3:  virginica 50

One solution is to use match(), but it is a bit verbose.

DT[, .N, Species][match(levels(Species), Species)]
#>       Species  N
#> 1:  virginica 50
#> 2:     setosa 50
#> 3: versicolor 50

【Comments】:

    Tags: r data.table


    【Answer 1】:

    If you want the result ordered by the by variable, just use keyby:

    DT[, .N, keyby = Species]
    #      Species  N
    #1:  virginica 50
    #2:     setosa 50
    #3: versicolor 50
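
As a rough sketch of the difference, the snippet below puts by and keyby side by side on the same releveled iris data from the question. The object names (by_res, keyby_res, sorted_res) are made up for illustration, and the order() variant is an assumed alternative for cases where a keyed result is not wanted; it is not part of the original answer.

```r
library(data.table)

DT <- as.data.table(iris)
DT[, Species := relevel(Species, ref = "virginica")]

# by = keeps groups in the order they first appear in the data
by_res <- DT[, .N, by = Species]

# keyby = sorts the result by the grouping column (level order for a
# factor) and also sets that column as the key of the result
keyby_res <- DT[, .N, keyby = Species]

# Assumed alternative: group with by, then sort, leaving the result unkeyed
sorted_res <- DT[, .N, by = Species][order(Species)]

print(by_res)
print(keyby_res)
print(key(keyby_res))   # key set by keyby
print(key(sorted_res))  # no key on the sorted-only result
```

The practical trade-off is that keyby leaves a key on the result, which may or may not matter for later joins or subsetting, while the order() form only changes the row order.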
    

    【Comments】:

    • Loving data.table