【发布时间】:2020-01-27 23:24:23
【问题描述】:
以下代码运行一个非常简单的lm(),并尝试在一个小数据框中总结结果(因子水平、系数):
df <- data.frame(star_sign = c("Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo", "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces"),
y = c(1.1, 1.2, 1.4, 1.3, 1.8, 1.6, 1.4, 1.3, 1.2, 1.1, 1.5, 1.3))
levels(df$star_sign) #alphabetical order
# fit a simple linear model
my_lm <- lm(y ~ 1 + star_sign, data = df)
summary(my_lm) # intercept is based on first level of factor, aquarius
# I want the levels to work properly 1..12 = Aries, Taurus...Pisces so I'm going to redefine the factor levels
df$my_levels <- c("Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo", "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces")
df$star_sign <- factor(df$star_sign, levels = df$my_levels)
my_lm <- lm(y ~ 1 + star_sign_, data = df)
summary(my_lm) # intercept is based on first level of factor which is now Aries
# but for my model fit I want the reference level to be Virgo (because reasons)
df$star_sign_2 <- relevel(df$star_sign, ref = "Virgo")
my_lm <- lm(y ~ 1 + star_sign_2, data = df)
summary(my_lm)
df_results <- data.frame(factor_level = names(my_lm$coefficients), coeff = my_lm$coefficients )
# tidy up
rownames(df_results) <- 1:12
df_results$factor_level <- as.factor(gsub("star_sign_2", "", df_results$factor_level))
# change label of "(Intercept)" to "Virgo"
df_results$factor_level <- plyr::revalue(df_results$factor_level, c("(Intercept)" = "Virgo"))
levels(df_results$factor_level) # the levels are alphabetical + Virgo at the front (not same as display order from lm)
因子水平的顺序不正确:我想对df_results 进行排序,以便星号以与它们最初(白羊座、金牛座...双鱼座)相同的顺序出现,如@ 987654324@专栏。我认为我对操纵因素及其标签/级别等没有很好的了解,所以我很难知道如何做到这一点。
这也是一段冗长而笨拙的代码。有没有更简洁的方法来做这种事情?
谢谢。
(ps 从数学上讲,模型显然是微不足道的,但对于这些目的来说没关系 - 我只是对如何操作输出感兴趣)
【问题讨论】: