【Question title】: Cannot convert character to numeric in R
【Posted】: 2020-12-16 16:49:07
【Question】:

I copied and pasted weather data from the Weather Underground website for some data analysis. The data look like this:

https://www.wunderground.com/dashboard/pws/KCACHINO13/table/2018-04-10/2018-04-10/daily

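As you can see, the temperature and the other fields include text, so I cannot do any calculations. In Excel I used substitute(xx,"F","") to strip the F from the Temperature column, but when I then tried convert(xx,"F","C") to convert Fahrenheit to Celsius, I got no result. I think something is wrong with the data itself. I tried formatting the cells as numbers and copying and pasting the values into a new column, but neither worked.

I then imported the data.frame into R and tried to do the data formatting there. I checked the class of the Temperature column, and it shows "character":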

class(a$Temperature)
#"character"

a$Temperature <- gsub("F","",a$Temperature)
# this command removed the "F"

as.numeric(a$Temperature)
#Warning message: NAs introduced by coercion 

as.numeric(unlist(a$Temperature))
#still the same warning message

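In Excel I also created a new column with the "F" removed from the temperature and used it in R to convert "character" to "numeric", but I still get the warning message. I don't know how to deal with this. Can anyone help me? Thanks!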

As suggested in the comments below, here is the output of dput(head(a)):

#structure(list(Time = structure(c(-2209075140, -2209074840, -2209074540, 
-2209074240, -2209073940, -2209073640), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), Temperature = c("60.0 ", "59.9 ", "59.8 ", "59.7 ", 
"59.6 ", "59.5 "), `T(F)` = c("60.0 ", "59.9 ", "59.8 ", "59.7 ", 
"59.6 ", "59.5 "), `Dew Point` = c("48.2 F", "48.1 F", "48.4 F", 
"48.3 F", "48.2 F", "48.1 F"), Humidity = c("65 %", "65 %", "66 %", 
"66 %", "66 %", "66 %"), Wind = c("WSW", "WSW", "WSW", "WSW", 
"WSW", "WSW"), Speed = c("0.0 mph", "0.0 mph", "0.0 mph", "0.0 mph", 
"0.0 mph", "0.0 mph"), Gust = c("0.0 mph", "0.0 mph", "0.0 mph", 
"0.0 mph", "0.0 mph", "0.0 mph"), Pressure = c("29.88 in", "29.88 in", 
"29.88 in", "29.88 in", "29.88 in", "29.88 in"), `Precip. Rate.` = c("0.00 in", 
"0.00 in", "0.00 in", "0.00 in", "0.00 in", "0.00 in"), `Precip. Accum.` = c("0.00 in", 
"0.00 in", "0.00 in", "0.00 in", "0.00 in", "0.00 in"), UV = c(0, 
0, 0, 0, 0, 0), Solar = c("0 w/m²", "0 w/m²", "0 w/m²", "0 w/m²", 
"0 w/m²", "0 w/m²")), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))
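
Each Temperature value in the output above ends with a character that prints like a space. One way to check what such a character really is would be to look at its code points. The sketch below assumes, purely for illustration, that the trailing character is a non-breaking space (U+00A0); the value `x` is hypothetical and not taken from the real data.

```r
# Hypothetical value: "60.0" followed by a non-breaking space (U+00A0),
# which prints like an ordinary space
x <- "60.0\u00a0"

# Unicode code point of every character; an ordinary space would be 32
code_points <- utf8ToInt(x)
code_points

# Rewrite non-ASCII bytes as <xx> so they become visible
visible <- iconv(x, from = "UTF-8", to = "ASCII", sub = "byte")
visible
```

Any code point above 127 in the first result, or any `<xx>` sequence in the second, points to a character that is not plain ASCII and may be what blocks the conversion.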

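【Comments】:

  • What does a$Temperature print? I suspect it is a degree symbol or a space among the digits.
  • Can you run dput(head(your_data_object_here)), copy the result, and paste it into your post?
  • a[-c(1,5)] <- lapply(a[-c(1,5)], function(x) as.numeric(gsub("[^\\.[:digit:]]", "", x))).
  • @NotThatKindODr You are right, there is a degree symbol, but it does not show up in Excel or in my R data.frame "a". I guess that is the problem.
  • @ThoVu I pasted the output into the post :)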
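
The lapply suggestion in the comments can be sketched on a small example. The data frame below is a hypothetical miniature of the weather table, not the real data, and the columns are picked by name instead of by position, because the column order differs between the two data frames shown in this thread.

```r
# Hypothetical miniature of the weather table; values carry unit suffixes
a <- data.frame(
  Temperature = c("60.0 \u00b0F", "59.9 \u00b0F"),
  Humidity    = c("65 %", "66 %"),
  Wind        = c("WSW", "WSW"),
  Pressure    = c("29.88 in", "29.88 in"),
  stringsAsFactors = FALSE
)

# Columns that should become numeric (Wind is text and is left alone)
num_cols <- c("Temperature", "Humidity", "Pressure")

# Keep only digits, dots and minus signs, then convert
strip_units <- function(x) as.numeric(gsub("[^0-9.-]", "", x))
a[num_cols] <- lapply(a[num_cols], strip_units)

str(a)
```

Removing everything that is not part of a number also removes invisible characters such as a degree sign, so no NA values are introduced for these inputs.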

Tags: r excel character numeric dataformat


【Solution 1】:

If you only want to convert the Temperature column, you could consider the following option.

Data

df <- structure(list(Time = c("12:04 AM", "12:09 AM", "12:14 AM", "12:19 AM", 
"12:24 AM", "12:29 AM"), Temperature = c("69.4 F", "69.2 F", 
"68.8 F", "68.5 F", "68.3 F", "68.0 F"), Dew.Point = c("45.9 F", 
"46.0 F", "45.8 F", "45.7 F", "45.7 F", "45.7 F"), Humidity = c("43 %", 
"43 %", "44 %", "44 %", "44 %", "45 %"), Wind = c("NE", "NE", 
"NE", "NE", "NE", "NE"), Speed = c("0.0 mph", "0.0 mph", "0.0 mph", 
"0.0 mph", "0.0 mph", "0.0 mph"), Gust = c("0.0 mph", "0.0 mph", 
"0.0 mph", "0.0 mph", "0.0 mph", "0.0 mph"), Pressure = c("29.93 in", 
"29.94 in", "29.94 in", "29.95 in", "29.95 in", "29.95 in"), 
    Precip..Rate. = c("0.00 in", "0.00 in", "0.00 in", "0.00 in", 
    "0.00 in", "0.00 in"), Precip..Accum. = c("0.00 in", "0.00 in", 
    "0.00 in", "0.00 in", "0.00 in", "0.00 in"), UV = c(0L, 0L, 
    0L, 0L, 0L, 0L), Solar = c("0 w/m²", "0 w/m²", "0 w/m²", 
    "0 w/m²", "0 w/m²", "0 w/m²")), class = "data.frame", row.names = c(NA, 
-6L))

Code

library(dplyr)
library(stringr)
df2 <- df %>% 
  mutate(Temperature2 = as.numeric(str_extract(Temperature, "[\\d\\.]+"))) %>% 
  relocate(Temperature2, .after = Temperature)

df2[, 2:3]
#   Temperature Temperature2
# 1      69.4 F         69.4
# 2      69.2 F         69.2
# 3      68.8 F         68.8
# 4      68.5 F         68.5
# 5      68.3 F         68.3
# 6      68.0 F         68.0
str(df2$Temperature2)
# num [1:6] 69.4 69.2 68.8 68.5 68.3 68
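
Since the question originally aimed at converting Fahrenheit to Celsius, here is a minimal base R sketch of that last step once the column is numeric. The helper name `f_to_c` is made up for this example.

```r
# Hypothetical helper: convert degrees Fahrenheit to degrees Celsius
f_to_c <- function(f) (f - 32) * 5 / 9

# Two of the numeric temperatures from the output above
temp_f <- c(69.4, 68.0)
temp_c <- round(f_to_c(temp_f), 1)
temp_c
# [1] 20.8 20.0
```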

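【Comments】:

  • relocate is a nice function, I did not know it existed. Thanks for sharing.
  • Thanks! May I ask about the relocate function? I got this R message: Error in relocate(., Temperature2, .after = Temperature): could not find function "relocate". From what I find online, relocate seems to be a basic function in R, but I cannot find it in mine...
  • It is available in dplyr 1.0.0 and later. If you update to the latest version, it will work. If you would rather not, feel free to remove it. I only used it here to make the result easier to see.
  • It worked! I applied this code to my data.frame and it works for all columns. Thanks! One more question: I tried to look up [\\d\\.]+ but I don't quite understand it. \d means any digit, \. means a period, and + means one or more repetitions. What does the first \ mean, and what does the whole pattern add up to?
  • Because d and . are special characters, we need to escape them with \\ in R. The pattern matches only digits and dots and nothing else, so the result can then be converted to numeric.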
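
To illustrate the escaping question: in an R string literal the backslash is itself an escape character, so "\\d" in source code is the two-character regex \d. The sketch below prints the pattern as the regex engine receives it and then applies it with base R. The input values are hypothetical, and `perl = TRUE` is used here so that \d is understood inside the bracket expression.

```r
pattern <- "[\\d\\.]+"

# cat() shows the string without R's source-level escaping
cat(pattern, "\n")

# Seven characters: [ \ d \ . ] +
n_chars <- nchar(pattern)

# Extract the first run of digits and dots from each value
x <- c("69.4 F", "29.93 in")
extracted <- regmatches(x, regexpr(pattern, x, perl = TRUE))
values <- as.numeric(extracted)
values
```

Inside square brackets the dot is already literal, so "[\\d.]+" should behave the same way.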
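
【Solution 2】:

Maybe this will help. Several functions are nested in this call, for example the conversion from character to numeric. There is also gsub, which replaces commas with blanks. You should change the comma to the letter you want to remove. I have never tried whether it works with letters, but it might be a solution. The code is as follows: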

data666

This applies the function to the whole data set. The 2 means it works column by column. If you want to work row by row, change the 2 to 1.
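
The code for this answer did not survive in the post, so the following is only a guess at the kind of call being described: apply() with a margin of 2, wrapping gsub() and as.numeric(). The data frame and its column names are hypothetical.

```r
# Hypothetical data: numbers stored as text with comma thousands separators
data_txt <- data.frame(
  sales = c("1,200", "3,450"),
  units = c("10", "2,000"),
  stringsAsFactors = FALSE
)

# MARGIN = 2 applies the function to each column; the result is a numeric matrix
data_num <- apply(data_txt, 2, function(x) as.numeric(gsub(",", "", x)))

# Turn the matrix back into a data frame with numeric columns
data_num <- as.data.frame(data_num)
str(data_num)
```

With a margin of 1 the function would see one row at a time and the result would come back transposed, so column-wise application is usually the more convenient choice for this kind of cleaning.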

【Comments】:
