How to merge two or three 3D arrays in Python?

【Question title】: How to merge two or three 3D arrays in Python?
【Posted】: 2023-03-04 14:43:01
【Question description】:

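I have time-series data in HDF format. I use the code below to read the data from the HDF files. Now I am trying to join the data that share the same jdn (Julian day number), based on latitude and longitude. Data with the same Julian day number represent spatially contiguous data.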

import glob
import numpy as np
import os
from pyhdf.SD import SD,SDC


files = glob.glob('MOD04*')
files.sort()
for f in files:
    product = f[0:5]+ '-Atmospheric Product'
    year = f[10:14]
    jdn = f[14:17] # julian day number

    # Read dataset.
    hdf = SD(f, SDC.READ)
    data3D = hdf.select('Deep_Blue_Aerosol_Optical_Depth_550_Land')
    data = data3D[:,:].astype(np.double)

    # Read geolocation dataset 
    lat = hdf.select('Latitude')
    latitude = lat[:,:]
    lon = hdf.select('Longitude')
    longitude = lon[:,:]
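
Since the goal is a daily time series, it may help to turn the year and Julian day sliced out of the filename into a calendar date. A minimal sketch, assuming the filename follows the same offsets as the code above (year at 10..13, day of year at 14..16); the helper name `granule_date` and the example filename are hypothetical:

```python
from datetime import datetime


def granule_date(filename):
    """Return the date encoded as YYYYDDD at character offsets 10..16."""
    # %j is the zero-padded day of the year (001..366).
    return datetime.strptime(filename[10:17], "%Y%j").date()


# Hypothetical filename that follows the MOD04_L2.AYYYYDDD... pattern.
example = "MOD04_L2.A2015032.0500.hdf"
print(granule_date(example))
```

The resulting date can serve as a dictionary key or index when the per-day arrays are collected.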

My data is available at this link: https://drive.google.com/folderview?id=0B2rkXkOkG7ExX2lTTWEySU1fOWc&usp=sharing

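【Comments】:

What is your end goal? (I.e. one big data array that contains the sequential information from every file? Something else?)
@Heather QC My end goal is to obtain a daily time-series dataset. So I tried to merge the data from files that share the same Julian day, but could not get it to work. As you said, one big data array that contains the sequential information from every file :).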

Tags: python numpy pandas hdf pyhdf

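【Solution 1】:

Just to follow up on Heather QC's answer, here is an illustration of the np.vstack, np.hstack and np.dstack functions and of which dimension each one extends: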

arr1 = np.array([[[1,2],[2,3]],
                 [[1,2],[2,3]],
                 [[1,2],[2,3]]])

arr2 = np.array([[[5,6],[8,7]],
                 [[7,6],[7,8]],
                 [[6,7],[7,8]]])

print("arr1 shape  ", arr1.shape)    
print("arr2 shape  ", arr2.shape)    
print("vstack shape", np.vstack((arr1, arr2)).shape)
print("hstack shape", np.hstack((arr1, arr2)).shape)
print("dstack shape", np.dstack((arr1, arr2)).shape)

>>> arr1 shape   (3, 2, 2)
>>> arr2 shape   (3, 2, 2)
>>> vstack shape (6, 2, 2)
>>> hstack shape (3, 4, 2)
>>> dstack shape (3, 2, 4)
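
For 3D inputs, the three helpers above behave like np.concatenate along axes 0, 1 and 2, while np.stack adds a new axis instead of extending an existing one. A minimal sketch with placeholder arrays of the same shape as above (the values are dummies, not the poster's data):

```python
import numpy as np

a = np.zeros((3, 2, 2))
b = np.ones((3, 2, 2))

# For 3D inputs, vstack/hstack/dstack join along axes 0/1/2 respectively.
v = np.concatenate((a, b), axis=0)
h = np.concatenate((a, b), axis=1)
d = np.concatenate((a, b), axis=2)

# np.stack creates a new leading axis, so the result has four dimensions.
s = np.stack((a, b), axis=0)

print(v.shape, h.shape, d.shape, s.shape)
```

Writing the axis explicitly can be easier to read than remembering which helper maps to which dimension.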

【Discussion】:

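【Solution 2】:

NumPy's hstack, vstack or dstack (depending on the axis along which you want to join the arrays) will join multidimensional arrays.

Note that, for MODIS aerosol data in particular, joining arrays with hstack sometimes raises an error: the arrays are sometimes 203 x 135 and sometimes 204 x 135, so the number of rows, which hstack requires to be equal, does not always match.

Building on your code (not pretty, but functional):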
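
The shape mismatch described above can be reproduced with placeholder arrays. This is only a sketch with dummy values in the two granule shapes mentioned, not real MODIS data:

```python
import numpy as np

# Placeholder granules with the two shapes mentioned above (values are dummies).
g1 = np.zeros((203, 135))
g2 = np.ones((204, 135))

# vstack joins along axis 0, so only the column counts (135) must agree.
stacked = np.vstack((g1, g2))
print(stacked.shape)

# hstack joins along axis 1 and needs equal row counts, so this fails.
try:
    np.hstack((g1, g2))
    hstack_failed = False
except ValueError:
    hstack_failed = True
print("hstack failed:", hstack_failed)
```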

import glob
import numpy as np
import os
from pyhdf.SD import SD,SDC


files = glob.glob('MOD04*')
files.sort()
for n, f in enumerate(files):
    product = f[0:5]+ '-Atmospheric Product'
    year = f[10:14]
    jdn = f[14:17] # julian day number

    # Read dataset.
    hdf = SD(f, SDC.READ)
    data3D = hdf.select('Deep_Blue_Aerosol_Optical_Depth_550_Land')
    data = data3D[:,:].astype(np.double)

    # Read geolocation dataset
    lat = hdf.select('Latitude')
    latitude = lat[:,:]
    lon = hdf.select('Longitude')
    longitude = lon[:,:]

    if n != 0 and jdn != old_jdn:
        #do analysis; write to file for later analysis; etc.
        pass

    if n == 0 or jdn != old_jdn:
        data_timeseries = data
        latitude_timeseries = latitude
        longitude_timeseries = longitude
    else:
        data_timeseries = np.vstack((data_timeseries, data))
        latitude_timeseries = np.vstack((latitude_timeseries, latitude))
        longitude_timeseries = np.vstack((longitude_timeseries, longitude))

    print(data_timeseries.shape)
    print(latitude_timeseries.shape)
    print(longitude_timeseries.shape)

    old_jdn = jdn
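
An alternative to tracking old_jdn inside the loop is to collect the arrays per day first and stack each day once. A minimal sketch, assuming the granules have already been read into (jdn, array) pairs; the helper name `stack_by_day` and the dummy arrays are hypothetical:

```python
from collections import defaultdict

import numpy as np


def stack_by_day(granules):
    """Take (jdn, 2D array) pairs and return {jdn: vertically stacked array}."""
    per_day = defaultdict(list)
    for jdn, arr in granules:
        per_day[jdn].append(arr)
    # One vstack per day avoids re-copying a growing array for every file.
    return {jdn: np.vstack(arrs) for jdn, arrs in per_day.items()}


# Dummy granules standing in for the arrays read from the HDF files.
granules = [
    ("001", np.zeros((203, 135))),
    ("001", np.ones((204, 135))),
    ("002", np.full((203, 135), 2.0)),
]
daily = stack_by_day(granules)
for jdn, arr in daily.items():
    print(jdn, arr.shape)
```

The same helper could be applied to the latitude and longitude arrays so that all three fields stay aligned row for row.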

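【Discussion】:

I am looking for something like dstack, but I could not get it to join the data as a mosaic.
What do you mean by mosaic?
Since this is gridded data, the x and y dimensions are longitude and latitude respectively. I want to join the datasets that have the same Julian day number. Simply put, data with the same Julian day number are like two tiles (images) from the same day, so I want to join these two tiles into a single tile, as the gdal mosaic function does.
I have worked a lot with MODIS aerosol products, but not with gdal, so I am not sure what that function does. I use vstack to stitch data granules together for analysis, but I end up with a separate stitched array for each data field (Deep Blue AOD, lat, lon, etc.). If you are trying to get data + lat + lon for each Julian day into a single array, hopefully someone else has a better solution! Good luck!
Hi Heather QC, thanks for your comments and suggestions... Actually I am also working with the AOD dataset and tried to stitch it, but could not succeed. If you are willing, could you share just the stitching part of your code? It would be very helpful for me and my research... I will gladly acknowledge it :)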
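
One possible way to approach the mosaic idea discussed above is to average all pixels of a day into cells of a regular latitude/longitude grid. This is only a sketch and not what gdal does internally; the function name `mosaic_to_grid`, the grid edges and the tiny tiles are hypothetical, and it assumes fill values have already been replaced by NaN:

```python
import numpy as np


def mosaic_to_grid(lat, lon, data, lat_edges, lon_edges):
    """Average pixels into regular lat/lon cells; empty cells become NaN."""
    lat, lon, data = (np.ravel(x) for x in (lat, lon, data))
    valid = np.isfinite(data)  # assumes fill values were already set to NaN
    bins = [lat_edges, lon_edges]
    # Sum of values and number of valid pixels falling into each grid cell.
    total, _, _ = np.histogram2d(lat[valid], lon[valid], bins=bins,
                                 weights=data[valid])
    count, _, _ = np.histogram2d(lat[valid], lon[valid], bins=bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)


# Two tiny dummy tiles from the "same day", side by side in longitude.
lat1 = np.array([[0.5, 0.5], [1.5, 1.5]])
lon1 = np.array([[0.5, 1.5], [0.5, 1.5]])
aod1 = np.array([[1.0, 2.0], [3.0, 4.0]])
lat2 = lat1.copy()
lon2 = lon1 + 2.0
aod2 = np.array([[5.0, np.nan], [7.0, 8.0]])

grid = mosaic_to_grid(
    np.hstack((lat1, lat2)), np.hstack((lon1, lon2)), np.hstack((aod1, aod2)),
    lat_edges=np.array([0.0, 1.0, 2.0]),
    lon_edges=np.array([0.0, 1.0, 2.0, 3.0, 4.0]),
)
print(grid)
```

Because every pixel is placed by its own coordinates, the granules do not need matching shapes, and overlapping pixels are simply averaged.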