【Question title】:Automate combining data frames in R
【Posted】:2021-10-16 16:41:30
【Question description】:

I start by creating two subsets of my data frame, as follows:

a = read_sf("a.shp")
a = st_make_valid(a)

#creating first subset with polygons from base year 2000
a1 = subset(a, year==2000)

#creating second subset with polygons from all the following years
a2 = subset(a, year>2000)

#creating buffer around base year
buffer = st_buffer(a1, 500)

#intersection of buffer from base year with polygons from the following years 
#(lets assume the column "year" in the data.frame has a range of years 2000-2010)
results2000 = st_intersection(buffer, a2)

I now have to repeat these steps for every year up to 2010. The next result would therefore look like this:

a = read_sf("a.shp")
a = st_make_valid(a)

#creating first subset with polygons from base year 2001
a1 = subset(a, year==2001)

#creating second subset with polygons from all the following years
a2 = subset(a, year>2001)

#creating buffer around base year
buffer = st_buffer(a1, 500)

#intersection of buffer from base year with polygons from the following years 
#(lets assume the column "year" in the data.frame has a range of years 2000-2010)
results2001 = st_intersection(buffer, a2)

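The only thing that changes is the year in the subset calls.

In the end, I need a single data.frame containing all 10 results (combining results2000 through results2010, each using one of the years between 2000 and 2010 as the base year).

I could create all 10 results one by one and combine them with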

rbind(results2000,results2001,...) 

and so on.

But is there a simpler way to do this? I thought about using the foreach function from the foreach package, perhaps something like this:

(Spatial) Efficient way of finding all points within X meters of a point?

But building a loop with foreach would lead to a very long run time, because the data.frame "a" contains about 1 million rows.

Can anyone help?

【Comments】:

    Tags: r foreach spatial


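    【Solution 1】:

    I don't know which package you use to read the data, so I will assume that the output of st_intersection(buffer, a2) is a data frame (since you said you want to use rbind).

    If so, you can create an empty list (called outlist below), fill it inside a loop, and finally rbind the elements of the list together.

    See the code below: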

    a = read_sf("a.shp")
    a = st_make_valid(a)
    
    
    # Define your cutting year 
    subset_yr <- 2000:2010
    # Create an empty list you will populate
    outlist <- vector(mode = "list", length = length(subset_yr))
    
    # Your code in a loop
    for(i in seq_along(subset_yr)) {
      #creating first subset with polygons from base year i (2000, 2001, etc.)
      a1 = subset(a, year==subset_yr[i])
      
      #creating second subset with polygons from all the following years
      a2 = subset(a, year>subset_yr[i])
      
      #creating buffer around base year
      buffer = st_buffer(a1, 500)
      
      #intersection of buffer from base year with polygons from the following years 
      outlist[[i]] = st_intersection(buffer, a2)
      
      #Add a column indicating the year
      outlist[[i]]$Year <- subset_yr[i]
    }
    
    # Bind everything together; do.call() passes each list element
    # to rbind() as a separate argument
    do.call(rbind, outlist)
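
A minimal base R sketch of the final binding step, since it is easy to get wrong. The two small data frames are hypothetical stand-ins for the per-year results (their columns id, year and base_year are assumptions, not the real output of st_intersection). It contrasts passing the whole list to rbind() as a single argument with using do.call() to pass each element separately.

```r
# Hypothetical stand-ins for the per-year results: plain data frames
# that share the same columns
outlist <- list(
  data.frame(id = 1:2, year = c(2001, 2002), base_year = 2000),
  data.frame(id = 3L,  year = 2002,          base_year = 2001)
)

# Passing the list as ONE argument gives a one-row list-matrix,
# not a data frame
wrong <- rbind(outlist)

# do.call() spreads the list elements out as separate arguments to rbind(),
# which stacks the rows and keeps the original column names
combined <- do.call(rbind, outlist)

print(dim(wrong))
print(combined)
```

The same do.call(rbind, ...) pattern should also apply to a list of sf objects with identical columns, because sf provides its own rbind method.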
    

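    【Comments】:

    • rbind(outlist) returns a matrix with columns [,1] to [,11] and a single row named outlist in which every cell reads sf,36. That is the output of your code, but I need a data.frame with the original column names (year, intersected area, ...). I tried outlist <- data.frame() instead of the vector, and I removed the added Year column (because it added 11 columns for each base year), but I get the following error: Error in [[(*tmp*, i, value = list(id = c(1, 2, : replacement has 12994 rows, data has 0
    • Which package are you using for read_sf("a.shp"), and is this a reproducible example? It would be easier to help if I could see the structure of st_intersection(buffer, a2). You could also write the following: results2000 = outlist[[1]]; results2001 = outlist[[2]], and so on.
    • I am using the sf package. With your hint, outlist[[1]] produces the desired output for that subset, and setting `results2000 = outlist[[1]]; ...` followed by rbind(results2000, results2001, ...) gives the output I was hoping for! Thank you very much!!!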
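
Since the thread settles on the sf package, the whole workflow can be sketched without copying each list element into its own variable. This is only a sketch under stated assumptions: the toy squares, the helper names make_square and intersect_from, the base_year column and the buffer distance of 1 are all made up for illustration and stand in for the real shapefile and the 500 unit buffer. Base years with nothing to intersect return NULL and are dropped before binding.

```r
library(sf)

# Hypothetical toy data: one 10 x 10 square per year, shifted 5 units each year
make_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 10, 0), c(x0 + 10, 10), c(x0, 10), c(x0, 0)
  )))
}
years <- 2000:2003
a <- st_sf(
  year = years,
  geometry = st_sfc(lapply(seq_along(years) * 5, make_square))
)

# For one base year: buffer its polygons and intersect the buffer
# with the polygons of all later years
intersect_from <- function(base_year, data, dist) {
  base  <- data[data$year == base_year, ]
  later <- data[data$year > base_year, ]
  if (nrow(base) == 0 || nrow(later) == 0) return(NULL)
  res <- suppressWarnings(st_intersection(st_buffer(base, dist), later))
  if (nrow(res) == 0) return(NULL)
  res$base_year <- base_year  # record which base year produced these rows
  res
}

# One result per base year, then drop the empty ones and bind the rest
pieces  <- lapply(years, intersect_from, data = a, dist = 1)
results <- do.call(rbind, Filter(Negate(is.null), pieces))

print(nrow(results))
```

Wrapping the per-year step in a function keeps the loop body in one place and makes it straightforward to swap lapply for a parallel variant later if run time on the full data set is a concern.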