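[Posted]: 2018-11-30 21:34:54
[Question]:
I need to generate and save multiple files, each a randomization of a data frame. The original data frame holds several years of daily weather data. I need files in which the years are randomly reshuffled while each year's records stay together in their original order.
I wrote some simple code that shuffles the years, but I can't work out how to repeat the randomization and save each randomized data frame as a separate file.
This is what I have so far: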
# Create example data frame
df <- data.frame(x=c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,8,8))
df$y <- c(4,8,9,1,1,5,8,8,3,2,0,9,4,4,7,3,5,5,2,4,6,6)
df$z <- c("A","A","A","B","B","B","C","C","C","D","D","D","F","F","F","G","G","G","H","H","I","I")
set.seed(30)
# Split data frame based on info in one column (i.e. df$x) and store in a list
dt_list <- split(df, f = df$x)
# RANDOMIZE data list -- Create a new index and change the order of dt_list
# SAVE the result to "random list" (i.e. 'rd_list')
rd_list <- dt_list[sample(1:length(dt_list), length(dt_list))]
# Put back together data in the order established in 'rd_list'
rd_data <- do.call(rbind, rd_list)
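This randomizes the data frame the way I need, but I don't know how to "save and repeat" so that I end up with multiple files, say about 20, named after the original with a sequential number (e.g. df_1, df_2, ...).
Also, since these are random samples, the same ordering could come up more than once. Is there a way to discard duplicate files automatically?
Thanks!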
[Discussion]:
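One possible approach, offered as a minimal sketch rather than a definitive answer: wrap the shuffle in a loop, record each group order as a string key, and skip any order that has already been written. The names `n_files`, `out_dir`, `seen` and `paths`, the CSV format, and the temporary output folder are assumptions made for illustration; only `df`, `dt_list`, `rd_data` and the `df_1`, `df_2`, ... naming come from the question.

```r
# Example data from the question (x stands in for the year column)
df <- data.frame(
  x = c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,8,8),
  y = c(4,8,9,1,1,5,8,8,3,2,0,9,4,4,7,3,5,5,2,4,6,6),
  z = c("A","A","A","B","B","B","C","C","C","D","D","D",
        "F","F","F","G","G","G","H","H","I","I")
)

set.seed(30)
dt_list <- split(df, f = df$x)

n_files <- 20                                 # assumed number of files wanted
out_dir <- file.path(tempdir(), "shuffled")   # hypothetical output folder
dir.create(out_dir, showWarnings = FALSE)

# Guard against an endless loop: there are only n! distinct group orders
stopifnot(n_files <= factorial(length(dt_list)))

seen  <- character(0)   # keys of the group orders already written
paths <- character(0)   # files written so far

while (length(paths) < n_files) {
  ord <- sample(names(dt_list))         # random permutation of the group labels
  key <- paste(ord, collapse = "-")     # e.g. "3-1-8-..." identifies this order
  if (key %in% seen) next               # duplicate order: draw again
  seen <- c(seen, key)

  # Reassemble the groups in the new order; rows inside a group keep their order
  rd_data <- do.call(rbind, dt_list[ord])
  rownames(rd_data) <- NULL

  path <- file.path(out_dir, sprintf("df_%d.csv", length(paths) + 1))
  write.csv(rd_data, path, row.names = FALSE)
  paths <- c(paths, path)
}
```

Checking the order before writing means a duplicate file is never created, so nothing has to be deleted afterwards. With 8 groups there are 40320 possible orders, so repeats among 20 draws are rare, but the check costs almost nothing.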