【问题标题】:Duplicate values in r raster package zonal functionr 栅格包区域函数中的重复值
【发布时间】:2023-03-27 14:02:01
【问题描述】:

你好光栅战士!

经过数月的数据处理后,我对自己的结果感到头疼。我是 R 和空间分析的新手,但对我的学习过程非常满意。

这是我的问题:一旦我将 zonal 函数应用于一组 5 个栅格对象 (.tif),我会以某种方式在我认为为零的区域中获得重复值。此外,一些值出现在其他单元格中。

这是我的代码:

folder <- "my_directory"

#for one country
ET<-raster("my_directory/file_ET.tif")
ED<-raster("my_directory/file_ED.tif")
SH<-raster("my_directory/file_SH.tif")
SC<-raster("my_directory/file_SC.tif") 
WH<-raster("my_directory/file_WH.tif")

#zones
lowbound<- seq(0,cellStats(ED,"max", na.rm=TRUE),1000)
upbound<-lowbound+1000
band<-upbound/1000
rclmat <- as.matrix(cbind(lowbound,upbound,band)) #to be used on reclassify
rclmat<-rclmat[1:length(rclmat[,1])-1,]
edb<- reclassify(ED, rclmat)
# zonal statistics # I already tried this: na.rm=FALSE, !anyDuplicated(sum_ET)) to avoid duplication – did not work 
sum_ET<-zonal(ET,edb,"sum") #,  na.rm=FALSE, !anyDuplicated(sum_ET))

stats<-cbind(band,upbound,sum_ET[,2], sum_SH[,2], sum_SC[,2], sum_WH[,2],perc_SH,perc_SC,perc_WH,perc_Tot)

这是我的结果:

    upbound ET
1   1000    10523272.53
2   2000    5156046.27
3   3000    5053895.54
4   4000    4796505.3
5   5000    4392162.97
6   6000    4156065.87
31  31000   10523272.53
32  32000   5156046.27
33  33000   5053895.54
34  34000   4796505.3
35  35000   4392162.97
36  36000   4156065.87

从 31 到 36 与从 1 到 6 相同。

这是我的一位同事的结果——我正在与之比较

            upbound ET
1   1000    10523272.53
2   2000    5156046.27
3   3000    5053895.54
4   4000    4796505.3
5   5000    4392162.97
6   6000    4156065.87
31  31000   0
32  32000   21247.26
33  33000   0
34  34000   0
35  35000   0
36  36000   23877.74

如您所见,我得到了重复的值。

输入可以在这里找到ET, ED files

非常感谢任何帮助。非常感谢

迭戈

【问题讨论】:

    标签: r gis raster


    【解决方案1】:

    首先让我通过一个简单的例子来解释你的代码中发生了什么:

    # Create "zone" variable
    zone=1:10
    
    # Create ET dataframe, that does not have some zones:
    dET <- data.frame(zone=c(1:5,7,8), ET=runif(7, min=1, max=10)) 
    dET
    #   zone       ET
    # 1    1 5.776128
    # 2    2 9.067579
    # 3    3 9.874737
    # 4    4 7.846662
    # 5    5 3.588964
    # 6    7 3.843509
    # 7    8 8.916714
    
    #merge zone vector and the second column from dET dataframe:
    # As you can see some values in the second columnd of the result are repeated to match
    # the length of the zone vector and the value that originally belonged to 7th zone was moved into 6th zone
    # since there was no 6th zone in the dataframe
    cbind(zone, dET[,2])
    #      zone         
    # [1,]    1 5.776128
    # [2,]    2 9.067579
    # [3,]    3 9.874737
    # [4,]    4 7.846662
    # [5,]    5 3.588964
    # [6,]    6 3.843509
    # [7,]    7 8.916714
    # [8,]    8 5.776128
    # [9,]    9 9.067579
    # [10,]   10 9.874737
    
    #Instead you should use merge and then fill the missing values with zeros:
    result <- merge(data.frame(zone=zone), dET, by="zone", all=TRUE)
    result
    #    zone       ET
    # 1     1 5.776128
    # 2     2 9.067579
    # 3     3 9.874737
    # 4     4 7.846662
    # 5     5 3.588964
    # 6     6       NA
    # 7     7 3.843509
    # 8     8 8.916714
    # 9     9       NA
    # 10   10       NA
    
    result$ET[is.na(result$ET)] <- 0
    #    zone       ET
    # 1     1 5.776128
    # 2     2 9.067579
    # 3     3 9.874737
    # 4     4 7.846662
    # 5     5 3.588964
    # 6     6 0.000000
    # 7     7 3.843509
    # 8     8 8.916714
    # 9     9 0.000000
    # 10   10 0.000000
    

    您可以通过以下方式更正您的代码。在这里,我只使用了您的 2 个文件 ED 和 ET。所有其他的都应该以同样的方式修改:

    library(raster)
    library(rgdal)
    
    ED<-raster("file_ED.tif")
    ET<-raster("file_ET.tif")
    
    lowbound<- seq(0,cellStats(ED,"max", na.rm=TRUE),1000)
    upbound<-lowbound+1000
    band<-upbound/1000
    rclmat <- as.matrix(cbind(lowbound,upbound,band)) #to be used on reclassify
    
    # Commented the following line since cellStats(ED,"max", na.rm=TRUE)=35125.57
    # so you should not remove the values in the range 35K-36K
    #rclmat<-rclmat[1:length(rclmat[,1])-1,]
    
    ener_dens_band<- reclassify(ED, rclmat)
    sum_ET<-zonal(ET,ener_dens_band,"sum")
    
    # Let's look at the tail of the result:
    tail(sum_ET)
    #       zone      sum
    # [25,]   26 27011.60
    # [26,]   27 53905.17
    # [27,]   28 18490.15
    # [28,]   29 19322.07
    # [29,]   32 21247.26
    # [30,]   36 23877.74
    
    # As you can see some zones are missing: 30, 31, 33-35
    # So they need to be added
    sum_ET <- merge(data.frame(zone=1:36), sum_ET, by="zone", all=TRUE)
    sum_ET$sum[is.na(sum_ET$sum) ] <- 0
    
    tail(sum_ET, n=10)
    #    zone      sum
    # 27   27 53905.17
    # 28   28 18490.15
    # 29   29 19322.07
    # 30   30     0.00
    # 31   31     0.00
    # 32   32 21247.26
    # 33   33     0.00
    # 34   34     0.00
    # 35   35     0.00
    # 36   36 23877.74
    

    您需要对其他列重复相同的操作,然后您可以使用 cbind 将它们全部放在一个数据帧中。

    【讨论】:

    • 非常感谢卡蒂亚。这对我帮助很大!
    • @DiegoMoya 太棒了!如果这回答了您的问题,请“接受”答案。谢谢。
    猜你喜欢
    • 2011-08-05
    • 1970-01-01
    • 2020-11-12
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 2013-04-19
    • 1970-01-01
    • 2015-10-23
    相关资源
    最近更新 更多