List of .csv files in one specific directory: answers

【Question title】: list of .csv files in one specific directory
【Posted】: 2016-07-17 14:35:16
【Question description】:

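I have .csv files in a directory (say C:/Downloads). I can list all the files in that directory with list.files("path"), but I cannot read a specified number of files using a for loop. That is, suppose I have 332 files and I only want to read files 1 to 10, or 5 to 10.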

Here is an example:

files <- list.files("path")
files ## displays all the files.

Now, to test:

k <- files[1:10]
k
## here it displays the files from 1 to 10.

So I kept the same thing inside a for loop, because I want to read the files one by one.

for(i in 1:length(k)){
  length(i) ## just tested the length 
}

But it gives NA, NULL, or 1.

Can anyone explain how I can read the specified .csv files using a for loop or any other way?

【Question discussion】:

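i is just a number, so the length of i should always be 1. If you are getting something else (e.g. NA), you need to post a reproducible example so people can work out why. If you want to read the files in a for loop, why not try read.table(file=files[i])?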
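To illustrate the comment above, here is a minimal sketch of why length(i) is 1 on every pass. The file names are hypothetical stand-ins for whatever list.files returns on your machine. Note also that a bare expression such as length(i) inside a for loop is not printed automatically, so it has to be wrapped in print() or stored to be seen.

```r
# Hypothetical file names standing in for the result of list.files("path")
k <- c("001.csv", "002.csv", "003.csv")

lengths_seen <- integer(0)
for (i in seq_along(k)) {
  # i is a single integer index, so length(i) is 1 on every pass
  lengths_seen <- c(lengths_seen, length(i))
}

print(lengths_seen)  # one entry per iteration, each equal to 1
print(length(k))     # the number of file names in the vector
print(k[2])          # indexing with the loop variable retrieves a file name
```

The value worth using inside the loop is k[i], the file name, rather than anything computed from i alone.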

Tags: r csv data-science

【Solution 1】:

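list.files returns a character vector (class character), that is, a vector of strings. Applied to the whole vector files, length returns the number of strings it holds; applied to a range such as files[1:10], it returns the number of strings in that range; applied to a single element files[i], it returns 1. To get the number of characters in each element (each string) of a character vector, use nchar instead. So: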

path.to.csv <- "/path/to/your/csv/files"
files<-list.files(path.to.csv)
print(files)  ## list all files in path

k<-files[1:10]
print(k)      ## list first 10 files in path

for(i in 1:length(k)) {  ## loop through the first 10 files
  print(k[i]) ## each file name
  print(nchar(k[i])) ## the number of characters in each file name
  df <- read.csv(paste0(path.to.csv,"/",k[i]))  ## read each as a csv file
  ## process each df in turn here
}

Note that in the call to read.csv we have to paste the path onto the file name, because list.files returns bare file names by default.
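As a rough sketch of two other ways to build the full path, the example below creates a throwaway directory with two small CSV files (the directory, file names, and data are all made up for illustration) and reads the first file using file.path and using list.files with full.names = TRUE.

```r
# Build a throwaway directory with two small CSV files (hypothetical data)
csv_dir <- tempfile("csv_path_")
dir.create(csv_dir)
write.csv(data.frame(x = 1:3), file.path(csv_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 4:6), file.path(csv_dir, "b.csv"), row.names = FALSE)

# Option 1: join directory and bare file name with file.path()
bare_names <- list.files(csv_dir, pattern = "\\.csv$")
df1 <- read.csv(file.path(csv_dir, bare_names[1]))

# Option 2: ask list.files() for full paths up front
full_paths <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
df2 <- read.csv(full_paths[1])

# Both routes read the same file
print(identical(df1, df2))
```

Either option avoids hand-writing the "/" separator; which one reads better is largely a matter of taste.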

Edit: I thought I would add this as an alternative:

path.to.csv <- "/path/to/your/csv/files"
files<-list.files(path.to.csv)

for(iFile in files) {  ## loop through the files
  print(iFile) ## each file name
  print(nchar(iFile)) ## the number of characters in each file name
  df <- read.csv(paste0(path.to.csv,"/",iFile))  ## read each as a csv file
  ## process each df in turn here
}

Here, the for loop iterates directly over the collection (vector) files, so on the i-th pass iFile is the i-th file name.

Hope this helps.

【Discussion】:

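df will be overwritten on each iteration.
@user20650: Yes, indeed. When he said "read the files one by one", I assumed he wanted to process each file in turn. I edited the post to reflect that. The answer is not meant to be "final", just enough to hopefully answer his question. I don't know how he wants to process the data in each file.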
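If every data frame needs to be kept rather than processed and discarded, one option is to store each one in a list. The sketch below is not part of the original answer; the temporary directory, the file names, and the id and value columns are invented for illustration.

```r
# Create three small CSV files in a throwaway directory (hypothetical data)
csv_dir <- tempfile("csv_loop_")
dir.create(csv_dir)
for (n in 1:3) {
  write.csv(data.frame(id = n, value = n * 10),
            file.path(csv_dir, sprintf("%03d.csv", n)),
            row.names = FALSE)
}

files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Pre-allocate a list with one slot per file, named after the files
results <- vector("list", length(files))
names(results) <- basename(files)

for (i in seq_along(files)) {
  results[[i]] <- read.csv(files[i])  # each data frame gets its own slot
}

print(names(results))
print(results[["002.csv"]])
```

Each file's contents stay available after the loop, addressed either by position or by file name.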

【Solution 2】:

To read a specific number of files at once, you can subset the vector of files. First create the vector of files, including their paths:

f = list.files("/dir/dir", full.names = TRUE, pattern = "\\.csv$")
# nb full.names returns the full path to each file;
# pattern is a regular expression, so "\\.csv$" matches names ending in .csv

Then read each file into a separate list item (the first 10 in this example):

dl = lapply(f[1:10], read.csv)

Finally, look at list item 1:

head(dl[[1]])
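The same subsetting works for any range, such as files 2 to 4, and the resulting list can be stacked into one data frame when the files share the same columns. The following is a sketch under that assumption; the temporary directory, file names, and columns are invented for illustration.

```r
# Create five small CSV files with identical columns (hypothetical data)
csv_dir <- tempfile("csv_lapply_")
dir.create(csv_dir)
for (n in 1:5) {
  write.csv(data.frame(id = n, value = n^2),
            file.path(csv_dir, sprintf("%03d.csv", n)),
            row.names = FALSE)
}

f <- list.files(csv_dir, full.names = TRUE, pattern = "\\.csv$")

# Read only files 2 to 4 and label each list item with its file name
wanted <- f[2:4]
dl <- lapply(wanted, read.csv)
names(dl) <- basename(wanted)

# Stack the data frames row-wise; this assumes every file has the same columns
combined <- do.call(rbind, dl)
print(combined)
```

If the directory might hold fewer files than the range asks for, head(f, 10) is a safer way to take "up to the first 10", since f[1:10] pads the result with NA in that case.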

【Discussion】:

【Solution 3】:

Unfortunately there is no reproducible example to work with. Usually, when I have to do a similar task, I do it like this:

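files <- list.files(pattern = "\\.csv$") # finds all .csv files in the current working directory
data <- vector("list", length(files))    # one slot per file
for (i in seq_along(files)) {
    data[[i]] <- read.csv(files[i], stringsAsFactors = FALSE) # keep each data frame
}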

Your code does not work because you are testing the length of the index, not the length of the vector. Hope this helps.
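A related point about looping over a vector of file names: when the directory contains no matching files, 1:length(files) still produces two index values, whereas seq_along(files) produces none. A small sketch, using an empty vector to stand in for an empty directory listing:

```r
# An empty character vector, as list.files() returns when nothing matches
files <- character(0)

idx_colon <- 1:length(files)   # 1:0 counts down, giving the indices 1 and 0
idx_seq   <- seq_along(files)  # a zero-length integer vector

print(idx_colon)
print(idx_seq)

passes <- 0
for (i in idx_seq) {
  passes <- passes + 1         # never reached: there is nothing to loop over
}
print(passes)
```

With the colon form the loop body would run twice on indices that do not correspond to any file, which is why seq_along is generally the safer choice here.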

【Discussion】: