Error in read_token ... not compatible with STRSXP with read_delim

[Question title]: Error in read_token ... not compatible with STRSXP with read_delim
[Posted]: 2017-10-31 16:34:09
[Question description]:

I am trying to import the following text file (copied and pasted directly here) using readr's read_delim:

sht_name    lon lat country
AD  42,546245   1,601554    Andorra
AE  23,424076   53,847818   United Arab Emirates
AF  33,93911    67,709953   Afghanistan
AG  17,060816   -61,796428  Antigua and Barbuda
AI  18,220554   -63,068615  Anguilla
AL  41,153332   20,168331   Albania
AM  40,069099   45,038189   Armenia
AN  12,226079   -69,060087  Netherlands Antilles

Here is my code:

library(readr)
loc <- locale(decimal_mark = ",")
country_coordinates <- read_delim(file = 'list.txt', delim = '\t', col_names = TRUE,
                                  col_types = cols(sht_name = col_character(),
                                                   lon = col_number(),
                                                   lat = col_number(),
                                                   country = col_character()),
                                  locale = loc)

Here is the error I get:

Error in read_tokens_(data, tokenizer, col_specs, col_names, locale_,  : 
  not compatible with STRSXP
In addition: Warning messages:
1: Duplicated column names deduplicated: '' => '_1' [3], '' => '_2' [4] 
2: The following named parsers don't match the column names: sht_name, lon, lat, country

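I have been struggling with this for far too long. Can someone tell me what I am doing wrong?

Edit:

By the way, if I import the same data in csv format, the following (very similar) code works without any problem: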

country_coordinates <- read_csv2(file = 'list.csv', col_names = TRUE,
                                  col_types = cols(sht_name = col_character(),
                                                   lon = col_number(),
                                                   lat = col_number(),
                                                   country = col_character()),
                                  locale = loc)

[Comments]:

read_csv2() uses ";" as the field separator instead of ","
@EnriquePérezHerrero ";" is the field separator and "," is the decimal mark.

Tags: r parsing readr
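
To illustrate the distinction made in the comments, here is a minimal sketch. It assumes a clean one-row sample written to temporary files; the names fields, tsv_path, csv2_path and types are made up for illustration and this is not the asker's actual list.txt. With a decimal_mark = "," locale, the same column specification should parse both a tab-delimited and a semicolon-delimited file, provided the header line really contains the four column names separated by the expected delimiter.

```r
library(readr)

# Hypothetical sample: one header line and one data row, written twice,
# once with tabs and once with semicolons as the field separator.
fields <- list(c("sht_name", "lon", "lat", "country"),
               c("AD", "42,546245", "1,601554", "Andorra"))
tsv_path  <- tempfile(fileext = ".txt")
csv2_path <- tempfile(fileext = ".csv")
writeLines(vapply(fields, paste, character(1), collapse = "\t"), tsv_path)
writeLines(vapply(fields, paste, character(1), collapse = ";"), csv2_path)

# "," is the decimal mark in both files; only the field separator differs.
loc   <- locale(decimal_mark = ",")
types <- cols(sht_name = col_character(), lon = col_number(),
              lat = col_number(), country = col_character())

from_tsv  <- read_delim(tsv_path, delim = "\t", col_types = types, locale = loc)
from_csv2 <- read_csv2(csv2_path, col_types = types, locale = loc)

print(from_tsv)
print(from_csv2)
```

If the tab-delimited version parses here but the real file does not, it may be worth checking whether the real file actually contains tab characters and a non-empty header line.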

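[Solution 1]:

I may have some insight for you.

It seems that type_convert only accepts character columns as input. My guess is that your input columns are not of class character.

So, with the following code: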

library(tidyverse)  

bla   <- c('bill','bob','bill')
bling <- c(1,2,3)
bloop <- c("2015-05-05 13:23:00","2015-02-07 21:22:14","2015-01-01 17:30:15")
df <- tibble(bla, 
             bling, 
             bloop
             )
df_coltypes <- cols('bla'   = readr::col_factor(NULL) ,
                    'bling' = readr::col_integer()    ,
                    'bloop' = readr::col_datetime(format="%Y-%m-%d %H:%M:%S")    
                    )
df2 <- type_convert(df,trim_ws=TRUE,col_types = df_coltypes )

I get:

Warning message: The following named parsers don't match the column names: bling

But if I do this:

df$bling <- as.character(df$bling)
df3 <- type_convert(df,trim_ws=TRUE,col_types = df_coltypes )

then all of my columns are converted.

Hope this helps.
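
If several columns are affected, the same idea can be applied to the whole data frame at once. This is only a sketch under the assumption described above (that type_convert re-parses character columns only); the data mirrors the example in this answer, and df_chr, df_types and converted are names chosen for illustration.

```r
library(readr)

# Example data mirroring the answer above: one numeric column (bling)
# sits between two character columns.
df <- data.frame(
  bla   = c("bill", "bob", "bill"),
  bling = c(1, 2, 3),
  bloop = c("2015-05-05 13:23:00", "2015-02-07 21:22:14", "2015-01-01 17:30:15"),
  stringsAsFactors = FALSE
)

# Coerce every column to character so that each one is eligible
# for re-parsing by type_convert().
df_chr <- df
df_chr[] <- lapply(df_chr, as.character)

df_types <- cols(
  bla   = col_factor(NULL),
  bling = col_integer(),
  bloop = col_datetime(format = "%Y-%m-%d %H:%M:%S")
)

converted <- type_convert(df_chr, col_types = df_types, trim_ws = TRUE)
str(converted)
```

Coercing everything to character first is a blunt approach; for a large data frame it may be preferable to convert only the columns named in the column specification.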

[Discussion]: