【Question title】: How to perform the as.factor function?
【Posted】: 2019-06-14 18:01:24
【Question description】:

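I have several data frames (accidents, vehicles, and casualties) that are merged into a single data frame called accidents. How do I find the factors of the combined data frame, i.e. how do I find which columns of accidents are factors?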

$ accident_severity           : char  "Serious" "Slight" "Slight" "Slight" ...
$ number_of_vehicles          : int  1 1 2 2 1 1 2 2 2 2 ...
$ number_of_casualties        : int  1 1 1 1 1 1 1 1 1 1 ...
$ date                        : char  "04/01/2005" "05/01/2005" "06/01/2005" "06/01/2005" ...
$ day_of_week                 : char  "Tuesday" "Wednesday" "Thursday" "Thursday" ...
$ time                        : char  "17:42" "17:36" "00:15" "00:15" ...

【Question comments】:

Tags: r dataframe


【Solution 1】:

You can use the lapply function to convert selected columns from character to factor. The code below converts the columns accident_severity and day_of_week:

df <- data.frame(accident_severity= c("Serious", "Slight", "Slight", "Slight"),
                 number_of_vehicles =  c(1, 1, 2, 2),
                 number_of_casualties =  c(1,  1,  1,  1),
                 date =  c("04/01/2005", "05/01/2005", "06/01/2005", "06/01/2005"),
                 day_of_week =  c("Tuesday", "Wednesday", "Thursday", "Thursday"),
                 time = c("17:42", "17:36", "00:15", "00:15"),
                 stringsAsFactors = FALSE)
str(df)
# 'data.frame': 4 obs. of  6 variables:
# $ accident_severity   : chr  "Serious" "Slight" "Slight" "Slight"
# $ number_of_vehicles  : num  1 1 2 2
# $ number_of_casualties: num  1 1 1 1
# $ date                : chr  "04/01/2005" "05/01/2005" "06/01/2005" "06/01/2005"
# $ day_of_week         : chr  "Tuesday" "Wednesday" "Thursday" "Thursday"
# $ time                : chr  "17:42" "17:36" "00:15" "00:15"

df[c("accident_severity", "day_of_week")] <- lapply(df[c("accident_severity", "day_of_week")], factor)
str(df)
# 'data.frame': 4 obs. of  6 variables:
# $ accident_severity   : Factor w/ 2 levels "Serious","Slight": 1 2 2 2
# $ number_of_vehicles  : num  1 1 2 2
# $ number_of_casualties: num  1 1 1 1
# $ date                : chr  "04/01/2005" "05/01/2005" "06/01/2005" "06/01/2005"
# $ day_of_week         : Factor w/ 3 levels "Thursday","Tuesday",..: 2 3 1 1
# $ time                : chr  "17:42" "17:36" "00:15" "00:15"
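
Since the question title mentions as.factor, here is a minimal sketch of the same conversion done one column at a time. For a character vector, as.factor() and factor() both produce levels in sorted order; when that order is not wanted (for example for weekdays), factor() accepts an explicit levels argument. The names df2 and weekday_order are hypothetical and used only for this illustration.

```r
# Small stand-in data frame (df2 is a hypothetical name for illustration)
df2 <- data.frame(accident_severity = c("Serious", "Slight", "Slight", "Slight"),
                  day_of_week = c("Tuesday", "Wednesday", "Thursday", "Thursday"),
                  stringsAsFactors = FALSE)

# as.factor() converts one character column; levels come out in sorted order
df2$accident_severity <- as.factor(df2$accident_severity)
levels(df2$accident_severity)

# factor() with explicit levels keeps the calendar order instead of sorted order
weekday_order <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                   "Friday", "Saturday", "Sunday")
df2$day_of_week <- factor(df2$day_of_week, levels = weekday_order)

# Integer codes now follow weekday_order (Tuesday is 2, Thursday is 4)
as.integer(df2$day_of_week)
```

Levels that do not occur in the data (such as "Monday" here) are still kept as levels of the factor, which is usually what you want for tabulation.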

To find which columns are factors, you can use the is.factor function:

names(df)[unlist(lapply(df, is.factor))]
# [1] "accident_severity" "day_of_week"   
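
As a possible variation on the is.factor check, base R's Filter() can select the factor columns directly, and levels() can then list the levels of each one. This is only a sketch; df3, factor_cols, and factor_levels are hypothetical names introduced for the example.

```r
# Stand-in data frame with two factor columns (df3 is a hypothetical name)
df3 <- data.frame(accident_severity = factor(c("Serious", "Slight", "Slight", "Slight")),
                  number_of_vehicles = c(1, 1, 2, 2),
                  day_of_week = factor(c("Tuesday", "Wednesday", "Thursday", "Thursday")),
                  stringsAsFactors = FALSE)

# Keep only the columns for which is.factor() is TRUE, then take their names
factor_cols <- names(Filter(is.factor, df3))
factor_cols

# Named list with the levels of every factor column
factor_levels <- lapply(df3[factor_cols], levels)
factor_levels
```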

【Comments】:
