【Question title】: Why are my rows disappearing in my new variable?
【Posted】: 2021-05-10 01:37:19
【Question】:

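I am trying to create a new column from two existing columns. However, when I run this code, R fits my values into the existing number of rows. In other words, R discards some of the values I assigned first to make room for the values of the second variable. Is there an easier way to combine the columns so that all rows are included?

Before creating the new variable: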

> summary(tdata)
 rumpColor   survivalWithFalcon
 blue :102   killed  :101      
 white:101   survived:102  

Creating the new variable:

tdata$newvar[tdata$survivalWithFalcon==c("killed")] <- "k"
tdata$newvar[tdata$survivalWithFalcon==c("survived")] <- "s"
tdata$newvar[tdata$rumpColor==c("blue")] <- "b"
tdata$newvar[tdata$rumpColor==c("white")] <- "w"
tdata$newvar<-as.factor(tdata$newvar)

After creating the new variable:

> summary(tdata)
 rumpColor   survivalWithFalcon newvar 
 blue :102   killed  :101       b:102  
 white:101   survived:102       w:101  

But I would like "newvar" to have:

newvar
k:101
s:102
b:102
w:101

【Question comments】:

Tags: r variables categorical-data


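【Solution 1】:

The reason is that the columns are of class factor (summary only returns counts for factor columns, not for character columns). When we assign a value to a factor column and that value is not among its existing levels, NAs are created. To prevent this, we can add the new levels first: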

tdata$newvar <- tdata$survivalWithFalcon
levels(tdata$newvar) <- c(levels(tdata$newvar), "k", "s", "b", "w")
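
As a minimal sketch of the point about levels, the snippet below uses a small hypothetical factor `f` (not the asker's data) to show that assigning a value outside the existing levels yields NA, while extending the levels first lets the assignment go through.

```r
# Hypothetical stand-in for one factor column; the real tdata is not shown in full.
f <- factor(c("killed", "survived", "killed"))

# Assigning a value that is not an existing level produces NA (R also warns).
bad <- f
suppressWarnings(bad[1] <- "k")
is.na(bad[1])        # TRUE

# Extending the levels first makes the same assignment valid.
ok <- f
levels(ok) <- c(levels(ok), "k")
ok[1] <- "k"
as.character(ok)     # "k" "survived" "killed"
```

Note that extending the levels only makes the new codes legal values; each row can still hold just one of them.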

【Comments】:

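    【Solution 2】:

    A data.frame is a list of columns, and every column must have the same length. You cannot add new rows without also adding data points to the other columns.

    Data Frames

    Description: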

     The function ‘data.frame()’ creates data frames, tightly coupled
     collections of variables which share many of the properties of
     matrices and of lists, used as the fundamental data structure by
     most of R's modeling software.
    

    Usage:

     data.frame(..., row.names = NULL, check.rows = FALSE,
                check.names = TRUE, fix.empty.names = TRUE,
                stringsAsFactors = default.stringsAsFactors())
     
     default.stringsAsFactors()

    Arguments:
    
     ...: these arguments are of either the form ‘value’ or ‘tag =
          value’.  Component names are created based on the tag (if
          present) or the deparsed argument itself.
    

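    What you are doing with the 2 lines below is not adding new rows, but replacing the values in the rows that match rumpColor == "blue" and rumpColor == "white", respectively. Together these conditions match every row that previously held the value for survivalWithFalcon == "killed" or survivalWithFalcon == "survived", so all of the "k" and "s" values are overwritten.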

    tdata$newvar[tdata$rumpColor==c("blue")] <- "b"
    tdata$newvar[tdata$rumpColor==c("white")] <- "w"
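
The overwriting can be seen on a small example. This is a sketch on a hypothetical 4-row data frame `toy` standing in for tdata; the intermediate names `after_survival` and `after_color` are introduced only for illustration.

```r
# Hypothetical 4-row stand-in for tdata (assumption: both columns are factors).
toy <- data.frame(
  rumpColor          = factor(c("blue", "blue", "white", "white")),
  survivalWithFalcon = factor(c("killed", "survived", "killed", "survived"))
)

# Start from an all-NA character column so the steps are explicit.
toy$newvar <- NA_character_

toy$newvar[toy$survivalWithFalcon == "killed"]   <- "k"
toy$newvar[toy$survivalWithFalcon == "survived"] <- "s"
after_survival <- toy$newvar   # "k" "s" "k" "s"

# Every row is either blue or white, so these two lines replace every value.
toy$newvar[toy$rumpColor == "blue"]  <- "b"
toy$newvar[toy$rumpColor == "white"] <- "w"
after_color <- toy$newvar      # "b" "b" "w" "w"

nrow(toy)                      # still 4: no rows were added or removed
```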
    

    As an alternative, this is what I think you are hoping to achieve:

    table(tdata$rumpColor, tdata$survivalWithFalcon)
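
If the four counts listed in the question are really what is wanted, one possible approach is to build the codes outside the data frame, since a vector holding k, s, b and w for every bird needs twice as many elements as there are rows. The sketch below uses a hypothetical 4-row `toy` data frame and assumes each column has exactly the two levels shown; `stacked` and `combined` are illustrative names.

```r
# Hypothetical 4-row stand-in for tdata.
toy <- data.frame(
  rumpColor          = factor(c("blue", "blue", "white", "white")),
  survivalWithFalcon = factor(c("killed", "survived", "killed", "survived"))
)

# One-letter codes per column (assumes only the two levels shown in each).
surv_code  <- ifelse(toy$survivalWithFalcon == "killed", "k", "s")
color_code <- ifelse(toy$rumpColor == "blue", "b", "w")

# Option A: stack both codes into one long factor, kept separate from the
# data frame because it has 2 * nrow(toy) elements.
stacked <- factor(c(surv_code, color_code), levels = c("k", "s", "b", "w"))
summary(stacked)

# Option B: one combined code per row, which does fit as a new column.
combined <- paste0(color_code, surv_code)   # "bk" "bs" "wk" "ws"
table(combined)
```

Option B keeps one value per row, so it can be stored in the data frame; Option A reproduces the k/s/b/w style of summary but cannot be a column of the same data frame.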
    

    【Comments】:
