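【Posted】: 2016-06-06 22:21:53
【Question】:
This is probably simple, but I can't seem to figure it out.
I have a csv file in which every entry is enclosed in quotes, including the numeric values. For example, yz.csv: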
"y","z"
"1.1","bla"
"2.1","blubb"
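So far I have been reading these files as character and then re-declaring the numeric columns:
dat <- read.table("yz.csv",colClasses=rep("character",2), header=TRUE)
dat$y <- as.numeric(dat$y)
Now the number of numeric columns has grown, as in qz.csv: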
"q","r","s","t","u","v","w","x","y","z"
"1.1","1.2","1.3","1.4","1.5","1.6","1.7","1.8","1.9","bla"
"2.1","2.2","2.3","2.4","2.5","2.6","2.7","2.8","2.9","blubb"
I feel it is time to do this more professionally, to avoid ending up with the following:
dat <- read.table("qz.csv",colClasses=rep("character",10), header=TRUE)
dat$q <- as.numeric(dat$q)
dat$r <- as.numeric(dat$r)
...
dat$y <- as.numeric(dat$y)
Is there a way to make the read.table function ignore the quotes around the numbers, so that I can use
dat <- read.table("qz.csv",colClasses=c(rep("numeric",9),"character"), header=TRUE)
which currently gives me the error scan() expected 'a real', got '"1.2"'?
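One possible way to avoid the repeated as.numeric calls is sketched below. It is only a sketch under stated assumptions: the sample data is written to a temporary file (the name `path` is made up for illustration), read.csv is used so that the comma is the field separator, and the text column is assumed to be called "z" as in the example. The helper name `num_cols` is also hypothetical.

```r
# Write the sample data from the question to a temporary file
# (the temporary path is an assumption made for this illustration).
path <- tempfile(fileext = ".csv")
writeLines(c(
  '"q","r","s","t","u","v","w","x","y","z"',
  '"1.1","1.2","1.3","1.4","1.5","1.6","1.7","1.8","1.9","bla"',
  '"2.1","2.2","2.3","2.4","2.5","2.6","2.7","2.8","2.9","blubb"'
), path)

# Read every column as character; the surrounding quotes are stripped here.
dat <- read.csv(path, colClasses = "character")

# Convert all columns except the text column "z" in one step.
num_cols <- setdiff(names(dat), "z")
dat[num_cols] <- lapply(dat[num_cols], as.numeric)

# Alternative: leave colClasses unset and let read.csv guess the types.
# The guessing happens after the quotes have been removed.
dat2 <- read.csv(path, stringsAsFactors = FALSE)

str(dat)
str(dat2)
```

The first variant keeps explicit control over which columns become numeric. The second relies on the default type conversion, which may be enough when no column needs to be forced to a particular class.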
Edit: this is the original file, and this is the code I use on the original, which gives me the error:
doc <- read.csv("testfile.csv", colClasses=c("character","character",rep("NULL",50),rep("numeric",7),"NULL","NULL"), col.names=c("country","code",rep("bla",50),"doc08","doc09","doc10","doc11","doc12","doc13","doc14","bla","bla"), skip=4, check.names=F, header=T)
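【Comments】:
-
I have added an example; I hope it helps. If not, I can provide part of the original file.
-
Could you add header=T to the read function? That appears to be the problem, since your first line contains characters.
-
Sure; it was obviously already in my original code. Thanks for pointing that out.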
Tags: r csv quotes read.table