【Posted】: 2021-12-05 10:43:09
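【Question】:
I am working with a much larger dataset than the sample attached below, and I need to re-encode the columns of type double. I tried using prettyNum inside a function called encoder, but it runs very slowly on my data. Here is what I tried: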
library(data.table)
set.seed(1453)
sample_data <- data.frame(a=sample(1:1000,100,replace=TRUE),
                          b=sample(1:1000,100,replace=TRUE),
                          c=sample(seq(1,1000,0.01),100,replace=TRUE),
                          d=sample(seq(1,1000,0.01),100,replace=TRUE),
                          e=sample(seq(1,1000,0.01),100,replace=TRUE),
                          f=sample(seq(1,1000,0.01),100,replace=TRUE),
                          g=sample(seq(1,1000,0.01),100,replace=TRUE),
                          h=sample(seq(1,1000,0.01),100,replace=TRUE),
                          i=sample(LETTERS,1000,replace=TRUE),
                          j=sample(letters,1000,replace=TRUE))
setDT(sample_data)
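# silence warnings globally (prettyNum warns when big.mark equals the decimal mark)
options(warn=-1)
# positions of the columns stored as double
double_cols <- which(sapply(sample_data,is.double))
# scale by 1e4 and format as character with '.' as the thousands separator
encoder <- function(x) prettyNum(x*1e4,big.mark = '.')
# overwrite each double column in place with its encoded version
sample_data[,(double_cols):=lapply(.SD,encoder),.SDcols=double_cols]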
This works, but I believe there is a faster solution.
Thanks in advance.
【Comments】:
- This can be written more concisely:
sample_data[,(double_cols):=lapply(.SD,encoder),.SDcols=is.double]
but that won't make it any faster.
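For anyone wanting to check that the two forms are interchangeable, here is a minimal sketch on a small made-up table. The names dt1 and dt2 are hypothetical, the encoder is wrapped in suppressWarnings only to keep the output quiet, and it assumes a data.table version recent enough to accept a function in .SDcols.

```r
library(data.table)

set.seed(1453)
# Hypothetical small table: one integer, two double and one character column
dt1 <- data.table(a = sample(1:1000, 10, replace = TRUE),
                  c = sample(seq(1, 1000, 0.01), 10, replace = TRUE),
                  d = sample(seq(1, 1000, 0.01), 10, replace = TRUE),
                  i = sample(LETTERS, 10, replace = TRUE))
dt2 <- copy(dt1)

# Same encoder as in the question; suppressWarnings is an addition for this sketch
encoder <- function(x) suppressWarnings(prettyNum(x * 1e4, big.mark = "."))

# Positions of the double columns (identical for dt1 and dt2)
double_cols <- which(sapply(dt1, is.double))

# Form from the question: columns selected by position
dt1[, (double_cols) := lapply(.SD, encoder), .SDcols = double_cols]

# Shorter form from the comment: columns selected by a predicate function
dt2[, (double_cols) := lapply(.SD, encoder), .SDcols = is.double]

# Both tables should now hold the same encoded character columns
same_result <- isTRUE(all.equal(dt1, dt2))
print(same_result)
```

Both calls still run prettyNum once per column, so any difference between them is in how the columns are selected, not in how long the encoding takes.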
Tags: r types data.table