A similar answer using data.table:
> library(data.table)
> df <- data.frame(Gender = c("F", "M", "F", "M", "M", "M", "M", "F", "M", "M"),
+ Young = c("Y", "N", "Y", "N", "Y", "N", "Y", "N", "Y", "N"),
+ Age = c("14", "25", "13", "24", "14", "25", "13", "24",
+ "10", "26"),
+ Location = c("Suburb", "Rural", "Suburb",
+ "Rural","Suburb", "Rural","Suburb",
+ "Rural","Suburb", "Rural"))
> setDT(df) # make it a data.table
> df[,Age:=as.integer(Age)] # convert Age from character to integer
> df[,.(mean=mean(Age), median=median(Age), max=max(Age), min=min(Age)),
+ by=.(Gender,Location)]
   Gender Location    mean median max min
1:      F   Suburb 13.5000   13.5  14  13
2:      M    Rural 25.0000   25.0  26  24
3:      M   Suburb 12.3333   13.0  14  10
4:      F    Rural 24.0000   24.0  24  24
>
Or, if we want to stratify by one variable at a time:
> df[,.(mean=mean(Age), median=median(Age), max=max(Age),min=min(Age)),
+ by=.(Gender)]
   Gender    mean median max min
1:      F 17.0000     14  24  13
2:      M 19.5714     24  26  10
> df[,.(mean=mean(Age), median=median(Age), max=max(Age), min=min(Age)),
+ by=.(Location)]
   Location mean median max min
1:   Suburb 12.8     13  14  10
2:    Rural 24.8     25  26  24
>
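The calls above repeat the same summary expression for each grouping. One way to avoid that repetition is sketched below; it assumes a helper named `age_summary` and a vector `group_cols`, both hypothetical names introduced here for illustration, and rebuilds the same data so the snippet runs on its own.

```r
library(data.table)

# Same data as above, rebuilt so this snippet is self-contained
df <- data.table(
  Gender   = c("F", "M", "F", "M", "M", "M", "M", "F", "M", "M"),
  Young    = c("Y", "N", "Y", "N", "Y", "N", "Y", "N", "Y", "N"),
  Age      = as.integer(c("14", "25", "13", "24", "14", "25", "13", "24", "10", "26")),
  Location = rep(c("Suburb", "Rural"), times = 5)
)

# Hypothetical helper: returns a named list, which data.table
# expands into one output column per list element
age_summary <- function(x) {
  list(mean = mean(x), median = median(x), max = max(x), min = min(x))
}

# One summary table per grouping column; `by` accepts a character
# vector of column names held in a variable
group_cols <- c("Gender", "Location")
summaries <- lapply(group_cols, function(grp) df[, age_summary(Age), by = grp])
names(summaries) <- group_cols

summaries$Gender
summaries$Location
```

Adding `"Young"` to `group_cols` would produce a third table without touching the summary code.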
And, inspired by Ronak's nice answer, the same thing as a data.table one-liner:
> melt(df, id.vars="Age")[, .(mean=mean(Age),
+ median=median(Age),
+ min=min(Age),
+ max=max(Age)), by=.(variable,value)]
   variable  value    mean median min max
1:   Gender      F 17.0000     14  13  24
2:   Gender      M 19.5714     24  10  26
3:    Young      Y 12.8000     13  10  14
4:    Young      N 24.8000     25  24  26
5: Location Suburb 12.8000     13  10  14
6: Location  Rural 24.8000     25  24  26
>
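The one-liner works because `melt()` stacks every non-`Age` column into a `variable`/`value` pair, so a single `by=.(variable,value)` covers all grouping columns at once. A small sketch for inspecting that intermediate long table follows; the name `long` is only illustrative, and the data is rebuilt so the snippet runs on its own.

```r
library(data.table)

# Same data as above, rebuilt so this snippet is self-contained
df <- data.table(
  Gender   = c("F", "M", "F", "M", "M", "M", "M", "F", "M", "M"),
  Young    = c("Y", "N", "Y", "N", "Y", "N", "Y", "N", "Y", "N"),
  Age      = as.integer(c("14", "25", "13", "24", "14", "25", "13", "24", "10", "26")),
  Location = rep(c("Suburb", "Rural"), times = 5)
)

# Age is kept as the id column; Gender, Young and Location are stacked
long <- melt(df, id.vars = "Age")

names(long)                 # Age, variable, value
nrow(long)                  # 10 original rows x 3 stacked columns
long[, .N, by = variable]   # each original column contributes 10 rows
```

Because each person appears once per stacked column, the summaries are per column and per level, not a cross-tabulation like the `by=.(Gender,Location)` result.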