Paste a row from a dataframe to match the length of rows of another dataframe: answers

【Question title】: Paste a row from a dataframe to match the length of rows of another dataframe
【Posted】: 2018-03-28 19:45:01
【Question】:

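I have been working with RStudio for a few months, but I am struggling with one thing. I have a directory containing multiple csv files that I need to import into RStudio. There are several files per name and date.

The format of the csv files is quite odd. All the data (numbers) in each csv start on line 7. The problem is that the information about the recording (name, date, device, etc.) has to be extracted separately.

Basically, the temp data frames in the for loop all have a different number of rows (200+). The info data frames, on the other hand, each have only one row (one per csv).

I want to bind the two together, with the info row repeated to match the length of the associated data in df_groinbar (from the same csv). Keep in mind that the length of each csv (the df_groinbar data frame) is different, so the binding of info and df_groinbar has to be adjusted for each csv.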

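library(tidyverse)  # provides read_csv(), select() and str_extract()

df_groinbar <- data.frame()
info <- data.frame()
for (i in list.files("/Users/Nicolas/Dropbox/Groin Bar/"))
{
  # first run of capital letters in the file name
  type <- str_extract(i, "([A-Z]+)")
  # the measurements start on line 7, so skip the first 6 lines
  temp <- read_csv(i, skip = 6, col_names = c("elapsed_time", "left_squeeze", "right_squeeze", "left_pull", "right_pull"))
  # line 3 holds the info header and line 4 its values: one row per file
  info_temp <- select(read_csv(i, skip = 2, n_max = 1), 1:6)
  # stack the measurements and the info rows separately
  df_groinbar <- rbind(df_groinbar, temp)
  info <- rbind(info, info_temp)
}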

I have tried the smartbind function and several others, but nothing worked.

Thank you very much!

Nicolas

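【Comments】:

It is hard to help without seeing the actual data. You could upload the files to a file-sharing site.
I can share some CSV files with you. By file-sharing site, do you mean Dropbox/Google Drive?
Exactly. Once that is done, add the link to the question.
Here is the link: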
drive.google.com/open?id=1BBYByXTi-ls5DclnbQXyNFBqziBYHeTO

Tags: r csv data-binding bind

【Solution 1】:

Is this what you are after?

library(tidyverse)

filePattern <- "\\.csv$"
fileList <- list.files(path = "./Csv test/", recursive = FALSE,
                       pattern = filePattern, full.names = TRUE)

read_file_custom <- function(fileName) {

  # skip 6 lines and select only the first 5 columns
  dat <- readr::read_csv(file = fileName, skip = 6, col_names = FALSE) %>% 
    select(., 1:5) 

  colName <- c("TimeFrame", "Left(squeeze)", "Left(pull)", "Right(squeeze)", "Right(pull)")
  names(dat) <- colName

  # now read the 3rd and 4th lines & keep only the first 6 columns
  indi_info <- readr::read_csv(file = fileName, skip = 2, col_names = TRUE, n_max = 1) %>% 
    select(., 1:6)

  # transfer individual data to dat
  dat <- dat %>% 
    mutate(NAME   = indi_info$NAME,
           DATE   = indi_info$DATE,
           TIME   = indi_info$TIME,
           DEVICE = indi_info$DEVICE,
           MODE   = indi_info$MODE,
           TEST   = indi_info$TEST)

  return(dat)
}

# Loop through all the files using map_df, read data 
# and create a FileName column to store filenames
# Clean up filename: remove file path and extension
# Bind all files together

result <- fileList %>%
  purrr::set_names(nm = (basename(.) %>% tools::file_path_sans_ext())) %>%
  purrr::map_df(read_file_custom, .id = "FileName") 
result

#> # A tibble: 10,460 x 12
#>    FileName        TimeFrame `Left(squeeze)` `Left(pull)` `Right(squeeze)`
#>    <chr>               <dbl>           <dbl>        <dbl>            <dbl>
#>  1 Arianne-Robill~    0.0200          -1.00          2.25           -1.25 
#>  2 Arianne-Robill~    0.0400          -1.00          2.25           -1.25 
#>  3 Arianne-Robill~    0.0600          -1.00          2.25           -1.25 
#>  4 Arianne-Robill~    0.0800          -1.00          2.25           -1.25 
#>  5 Arianne-Robill~    0.100           -1.00          2.25           -1.25 
#>  6 Arianne-Robill~    0.120           -1.00          2.00           -1.25 
#>  7 Arianne-Robill~    0.140           -1.00          2.00           -1.00 
#>  8 Arianne-Robill~    0.160           -0.750         2.00           -1.00 
#>  9 Arianne-Robill~    0.180           -0.750         1.75           -0.750
#> 10 Arianne-Robill~    0.200           -0.750         1.75           -0.750
#> # ... with 10,450 more rows, and 7 more variables: `Right(pull)` <dbl>,
#> #   NAME <chr>, DATE <chr>, TIME <time>, DEVICE <chr>, MODE <chr>,
#> #   TEST <chr>

Created on 2018-03-28 by the reprex package (v0.2.0).
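
The core step, repeating a one-row info data frame so that it matches the row count of the measurement data frame, can also be sketched in base R without any packages. This is only an illustration and not the answerer's code: the helper name bind_info, the toy values and the two-column layout are assumptions made up for the example, and real files would still need to be read first (for instance with read_csv as above).

```r
# bind_info() is a hypothetical helper: it repeats the single row of
# `info_row` once per row of `measurements` and binds the two side by side.
bind_info <- function(measurements, info_row) {
  stopifnot(nrow(info_row) == 1)
  # Index row 1 as many times as there are measurement rows
  info_rep <- info_row[rep(1, nrow(measurements)), , drop = FALSE]
  rownames(info_rep) <- NULL  # drop the "1", "1.1", ... row names
  cbind(measurements, info_rep)
}

# Toy stand-ins for two csv files of different length (made-up values)
file_a <- bind_info(
  data.frame(elapsed_time = c(0.02, 0.04, 0.06),
             left_squeeze = c(-1.00, -1.00, -0.75)),
  data.frame(NAME = "Athlete A", DEVICE = "Device 1",
             stringsAsFactors = FALSE)
)
file_b <- bind_info(
  data.frame(elapsed_time = c(0.02, 0.04),
             left_squeeze = c(0.50, 0.25)),
  data.frame(NAME = "Athlete B", DEVICE = "Device 1",
             stringsAsFactors = FALSE)
)

# Every piece now has the same columns, so the files stack with rbind()
all_files <- rbind(file_a, file_b)
print(all_files)
```

Because the info row is repeated per file before anything is stacked, the differing lengths of the individual csv files no longer matter.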

【Comments】: