【Question title】: Slice 2d array into smaller 2d arrays
【Posted】: 2018-05-06 16:46:42
【Question】:

Is there a way to slice a 2d array in numpy into smaller 2d arrays?

Example

[[1,2,3,4],   ->    [[1,2] [3,4]   
 [5,6,7,8]]          [5,6] [7,8]]

So I basically want to cut down a 2x4 array into 2 2x2 arrays. Looking for a generic solution to be used on images.

【Comments】:

    Tags: python numpy


    【Solution 1】:

    For now this only works when the big 2D array can be perfectly sliced into equally sized subarrays.

    The code below slices

    a ->array([[ 0,  1,  2,  3,  4,  5],
               [ 6,  7,  8,  9, 10, 11],
               [12, 13, 14, 15, 16, 17],
               [18, 19, 20, 21, 22, 23]])
    

    into this

    block_array->
        array([[[ 0,  1,  2],
                [ 6,  7,  8]],
    
               [[ 3,  4,  5],
                [ 9, 10, 11]],
    
               [[12, 13, 14],
                [18, 19, 20]],
    
               [[15, 16, 17],
                [21, 22, 23]]])
    

    p and q determine the block size

    Code

    import numpy as np

    a = np.arange(24)
    a = a.reshape((4,6))
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size
    
    p = 2     #block row size
    q = 3     #block column size
    
    # number of blocks along each axis (m and n must be divisible by p and q)
    blocks_per_row = m // p      #block rows, used by the outer loop
    blocks_per_column = n // q   #block columns, used by the inner loop
    
    block_array = []
    for row_block in range(blocks_per_row):
        previous_row = row_block * p   
        for column_block in range(blocks_per_column):
            previous_column = column_block * q
            block = a[previous_row:previous_row+p,previous_column:previous_column+q]
            block_array.append(block)
    
    block_array = np.array(block_array)
    

    【Discussion】:

      【Solution 2】:

      In my opinion, this is a task for numpy.split or one of its variants.

      For example

      a = np.arange(30).reshape([5,6])  #a.shape = (5,6)
      a1 = np.split(a,3,axis=1) 
      #'a1' is a list of 3 arrays of shape (5,2)
      a2 = np.split(a, [2,4])
      #'a2' is a list of three arrays of shape (2,6), (2,6), (1,6)
      

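      If you have an NxN image, you can create, for example, a list of 2 sub-images of shape NxN/2, and then divide each of them along the other axis.

      numpy.hsplit and numpy.vsplit are also available.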
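
A minimal sketch of the two-step idea described above (split along one axis, then split each piece along the other), applied to the 2x4 array from the question. The helper name `split_blocks` is made up for this illustration; only `np.vsplit` and `np.hsplit` come from NumPy.

```python
import numpy as np

def split_blocks(a, n_row_blocks, n_col_blocks):
    """Hypothetical helper: split a 2D array into a flat list of blocks.

    Both counts must divide the corresponding axis length exactly,
    otherwise np.vsplit / np.hsplit raise a ValueError.
    """
    blocks = []
    for band in np.vsplit(a, n_row_blocks):            # split the rows first
        blocks.extend(np.hsplit(band, n_col_blocks))   # then split the columns
    return blocks

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
left, right = split_blocks(a, 1, 2)
print(left)   # the 2x2 block holding columns 0-1
print(right)  # the 2x2 block holding columns 2-3
```

Note that the arguments here are block counts, not block sizes; for a fixed block size you would pass `a.shape[0] // p` and `a.shape[1] // q`.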

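      【Discussion】:

        【Solution 3】:

        There was another question a couple of months ago which clued me in to the idea of using reshape and swapaxes. The h//nrows makes sense since this keeps the first block's rows together. It also makes sense that you'll need nrows and ncols to be part of the shape. -1 tells reshape to fill in whatever number is necessary to make the reshape valid. Armed with the form of the solution, I just tried things until I found the formula that works.

        You should be able to break your array into "blocks" using some combination of reshape and swapaxes: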

        def blockshaped(arr, nrows, ncols):
            """
            Return an array of shape (n, nrows, ncols) where
            n * nrows * ncols = arr.size
        
            If arr is a 2D array, the returned array should look like n subblocks with
            each subblock preserving the "physical" layout of arr.
            """
            h, w = arr.shape
            assert h % nrows == 0, f"{h} rows is not evenly divisible by {nrows}"
            assert w % ncols == 0, f"{w} cols is not evenly divisible by {ncols}"
            return (arr.reshape(h//nrows, nrows, -1, ncols)
                       .swapaxes(1,2)
                       .reshape(-1, nrows, ncols))
        

        This turns c

        np.random.seed(365)
        c = np.arange(24).reshape((4, 6))
        print(c)
        
        [out]:
        [[ 0  1  2  3  4  5]
         [ 6  7  8  9 10 11]
         [12 13 14 15 16 17]
         [18 19 20 21 22 23]]
        

        into

        print(blockshaped(c, 2, 3))
        
        [out]:
        [[[ 0  1  2]
          [ 6  7  8]]
        
         [[ 3  4  5]
          [ 9 10 11]]
        
         [[12 13 14]
          [18 19 20]]
        
         [[15 16 17]
          [21 22 23]]]
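
To make the reasoning about `h//nrows`, `-1`, and `swapaxes` more concrete, here is a small walk-through of the intermediate shapes for the same 4x6 array. The names `step1`, `step2`, and `step3` are only labels for this illustration.

```python
import numpy as np

c = np.arange(24).reshape(4, 6)
nrows, ncols = 2, 3
h, w = c.shape

# axes: (block_row, row_in_block, block_col, col_in_block)
step1 = c.reshape(h // nrows, nrows, -1, ncols)
# axes: (block_row, block_col, row_in_block, col_in_block)
step2 = step1.swapaxes(1, 2)
# axes: (block, row_in_block, col_in_block)
step3 = step2.reshape(-1, nrows, ncols)

print(step1.shape, step2.shape, step3.shape)
print(step2[0, 1])   # the block in block-row 0, block-column 1
```

Without the `swapaxes` call, the final reshape would group rows of the original array rather than rectangular blocks.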
        

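        I've posted an inverse function, unblockshaped, here, and an N-dimensional generalization here. The generalization gives a little more insight into the reasoning behind this algorithm.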
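
The linked inverse function is not reproduced on this page. As a rough sketch of what such an inverse could look like (the name `unblockshaped_sketch` and its signature are assumptions for this illustration, not necessarily the linked code), the same reshape/swapaxes steps can be run in reverse:

```python
import numpy as np

def blockshaped(arr, nrows, ncols):
    h, w = arr.shape
    return (arr.reshape(h // nrows, nrows, -1, ncols)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols))

def unblockshaped_sketch(blocks, h, w):
    """Illustrative inverse: reassemble (n, nrows, ncols) blocks into an (h, w) array."""
    n, nrows, ncols = blocks.shape
    return (blocks.reshape(h // nrows, -1, nrows, ncols)  # (block_row, block_col, r, c)
                  .swapaxes(1, 2)                         # (block_row, r, block_col, c)
                  .reshape(h, w))

c = np.arange(24).reshape(4, 6)
restored = unblockshaped_sketch(blockshaped(c, 2, 3), 4, 6)
print(np.array_equal(restored, c))
```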


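        Note that there is also superbatfish's blockwise_view. It arranges the blocks in a different format (using more axes), but it has the advantages of (1) always returning a view and (2) being capable of handling arrays of any dimension.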
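
The following is not superbatfish's blockwise_view, only a minimal sketch of the "more axes, no copy" idea for the 2D, evenly divisible case. The helper name `blocks_as_4d_view` is hypothetical, and the result is a view only when `reshape` can avoid copying, which holds for a C-contiguous input such as the one below.

```python
import numpy as np

def blocks_as_4d_view(arr, nrows, ncols):
    """Hypothetical helper: out[i, j] is the block at block-row i, block-column j."""
    h, w = arr.shape
    assert h % nrows == 0 and w % ncols == 0
    return arr.reshape(h // nrows, nrows, w // ncols, ncols).swapaxes(1, 2)

c = np.arange(24).reshape(4, 6)
view = blocks_as_4d_view(c, 2, 3)
print(view.shape)                  # (block rows, block columns, nrows, ncols)
print(np.shares_memory(view, c))   # no data was copied for this input

view[0, 1] = -1                    # writing through the view changes c as well
print(c[:2, 3:])
```

Skipping the final `reshape(-1, nrows, ncols)` is what keeps this a view: flattening the two block axes after `swapaxes` generally forces NumPy to copy.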

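        【Discussion】:

        • Brilliant solution!
        【Solution 4】:

        There are some other answers that already seem well suited to your specific case, but your question piqued my interest in a memory-efficient solution usable up to the maximum number of dimensions that numpy supports, and I ended up spending most of the afternoon coming up with a possible method. (The method itself is relatively simple; it's just that I still haven't used most of the really fancy features that numpy supports, so most of the time was spent researching what numpy had available and how much it could do so that I didn't have to do it myself.)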

        def blockgen(array, bpa):
            """Creates a generator that yields multidimensional blocks from the given
        array(_like); bpa is an array_like consisting of the number of blocks per axis
        (minimum of 1, must be a divisor of the corresponding axis size of array). As
        the blocks are selected using normal numpy slicing, they will be views rather
        than copies; this is good for very large multidimensional arrays that are being
        blocked, and for very large blocks, but it also means that the result must be
        copied if it is to be modified (unless modifying the original data as well is
        intended)."""
            bpa = np.asarray(bpa) # in case bpa wasn't already an ndarray
        
            # parameter checking
            if array.ndim != bpa.size:         # bpa doesn't match array dimensionality
                raise ValueError("Size of bpa must be equal to the array dimensionality.")
            if (not np.issubdtype(bpa.dtype, np.integer)  # bpa must be all integers (np.int was removed from NumPy)
                or (bpa < 1).any()             # all values in bpa must be >= 1
                or (array.shape % bpa).any()): # % != 0 means not evenly divisible
                raise ValueError("bpa ({0}) must consist of nonzero positive integers "
                                 "that evenly divide the corresponding array axis "
                                 "size".format(bpa))
        
        
            # generate block edge indices
            rgen = (np.r_[:array.shape[i]+1:array.shape[i]//blk_n]
                    for i, blk_n in enumerate(bpa))
        
            # build slice sequences for each axis (unfortunately broadcasting
            # can't be used to make the items easy to operate over)
            c = [[np.s_[i:j] for i, j in zip(r[:-1], r[1:])] for r in rgen]
        
            # Now to get the blocks; this is slightly less efficient than it could be
            # because numpy doesn't like jagged arrays and I didn't feel like writing
            # a ufunc for it.
            for idxs in np.ndindex(*bpa):
                blockbounds = tuple(c[j][idxs[j]] for j in range(bpa.size))
        
                yield array[blockbounds]
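
As a compact illustration of the same idea (per-axis slice lists combined over all axes, yielding views), here is a simplified sketch. It is not the function above: the name `iter_blocks` is hypothetical and `itertools.product` stands in for the `np.ndindex` bookkeeping.

```python
import itertools
import numpy as np

def iter_blocks(array, bpa):
    """Illustrative generator: yield views of `array`, cut into bpa[k] blocks along axis k."""
    if len(bpa) != array.ndim:
        raise ValueError("need one block count per axis")
    per_axis = []
    for size, n in zip(array.shape, bpa):
        if n < 1 or size % n:
            raise ValueError("block counts must evenly divide the axis sizes")
        step = size // n
        per_axis.append([slice(i, i + step) for i in range(0, size, step)])
    # Cartesian product of the per-axis slices, last axis varying fastest
    for idx in itertools.product(*per_axis):
        yield array[idx]

a = np.arange(24).reshape(2, 3, 4)
blocks = list(iter_blocks(a, (2, 1, 2)))
print(len(blocks), blocks[0].shape)

safe = blocks[0].copy()   # copy before modifying if `a` must stay untouched
safe[...] = 0
blocks[1][...] = -1       # modifying a yielded block writes into `a`
print(a[0, :, 2:])
```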
        

        【Discussion】:

          【Solution 5】:

          Your question is practically the same as this one. You can use a one-liner with np.ndindex() and reshape():

          def cutter(a, r, c):
              lenr = a.shape[0]//r   # number of block rows (integer division)
              lenc = a.shape[1]//c   # number of block columns
              return np.array([a[i*r:(i+1)*r,j*c:(j+1)*c] for (i,j) in np.ndindex(lenr,lenc)]).reshape(lenr,lenc,r,c)
          

          which creates the result you want:

          a = np.arange(1,9).reshape(2,4)
          #array([[1, 2, 3, 4],
          #       [5, 6, 7, 8]])
          
          cutter( a, 1, 2 )
          #array([[[[1, 2]],
          #        [[3, 4]]],
          #       [[[5, 6]],
          #        [[7, 8]]]])
          

          【Discussion】:

            【Solution 6】:

            If you want a solution that also handles the cases when the matrix is not equally divided, you can use this:

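            from functools import reduce  # reduce is no longer a builtin in Python 3
            from operator import add
            import numpy as np
            
            # `input` stands for the 2D array to split
            half_split = np.array_split(input, 2)
            
            res = map(lambda x: np.array_split(x, 2, axis=1), half_split)
            res = reduce(add, res)  # concatenates the per-half lists into one flat list of blocks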
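
To show what the uneven case produces, here is a small sketch of the same two-step `np.array_split` approach on a 5x5 array, written as a list comprehension. The array `a` and the 2-by-2 split are arbitrary choices for illustration.

```python
import numpy as np

a = np.arange(25).reshape(5, 5)

# Split into 2 row bands, then each band into 2 column groups.
# np.array_split allows uneven pieces: the earlier pieces get the extra row/column.
blocks = [block
          for band in np.array_split(a, 2, axis=0)
          for block in np.array_split(band, 2, axis=1)]

print([b.shape for b in blocks])
```

The blocks therefore differ in shape, so they stay in a list and cannot be stacked into a single array.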
            

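            【Discussion】:

              【Solution 7】:

              Here is a solution based on unutbu's answer that handles the case where the matrix cannot be equally divided. In that case, it first resizes the matrix using some interpolation. You need OpenCV for this. Note that I had to swap ncols and nrows to make it work, and I haven't figured out why.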

              import numpy as np
              import cv2
              import math 
              
              def blockshaped(arr, r_nbrs, c_nbrs, interp=cv2.INTER_LINEAR):
                  """
                  arr      a 2D array, typically an image
                  r_nbrs   numbers of rows
                  c_nbrs   numbers of cols
                  """
              
                  arr_h, arr_w = arr.shape
              
                  size_w = int( math.floor(arr_w // c_nbrs) * c_nbrs )
                  size_h = int( math.floor(arr_h // r_nbrs) * r_nbrs )
              
                  if size_w != arr_w or size_h != arr_h:
                      arr = cv2.resize(arr, (size_w, size_h), interpolation=interp)
              
                  nrows = int(size_w // r_nbrs)
                  ncols = int(size_h // c_nbrs)
              
                  return (arr.reshape(r_nbrs, ncols, -1, nrows) 
                             .swapaxes(1,2)
                             .reshape(-1, ncols, nrows))
              

              【Discussion】:

                【Solution 8】:

                A small improvement to TheMeaningfulEngineer's answer, to handle the case where the big 2D array cannot be perfectly sliced into equally sized subarrays

                def blockfy(a, p, q):
                    '''
                    Divides array a into subarrays of size p-by-q
                    p: block row size
                    q: block column size
                    '''
                    m = a.shape[0]  #image row size
                    n = a.shape[1]  #image column size
                
                    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
                    bpr = ((m-1)//p + 1) #blocks per row
                    bpc = ((n-1)//q + 1) #blocks per column
                    M = p * bpr
                    N = q * bpc
                
                    A = np.nan* np.ones([M,N])
                    A[:a.shape[0],:a.shape[1]] = a
                
                    block_list = []
                    previous_row = 0
                    for row_block in range(bpr):     # bpr counts block rows (axis 0)
                        previous_row = row_block * p   
                        previous_column = 0
                        for column_block in range(bpc):  # bpc counts block columns (axis 1)
                            previous_column = column_block * q
                            block = A[previous_row:previous_row+p, previous_column:previous_column+q]
                
                            # remove nan columns and nan rows
                            nan_cols = np.all(np.isnan(block), axis=0)
                            block = block[:, ~nan_cols]
                            nan_rows = np.all(np.isnan(block), axis=1)
                            block = block[~nan_rows, :]
                
                            ## append
                            if block.size:
                                block_list.append(block)
                
                    return block_list
                

                Example:

                a = np.arange(25)
                a = a.reshape((5,5))
                out = blockfy(a, 2, 3)
                
                a->
                array([[ 0,  1,  2,  3,  4],
                       [ 5,  6,  7,  8,  9],
                       [10, 11, 12, 13, 14],
                       [15, 16, 17, 18, 19],
                       [20, 21, 22, 23, 24]])
                
                out[0] ->
                array([[0., 1., 2.],
                       [5., 6., 7.]])
                
                out[1]->
                array([[3., 4.],
                       [8., 9.]])
                
                out[-1]->
                array([[23., 24.]])
                

                【Discussion】:

                  【Solution 9】:
                  a = np.random.randint(1, 9, size=(9,9))
                  out = [np.hsplit(x, 3) for x in np.vsplit(a,3)]
                  print(a)
                  print(out)
                  

                  yields

                  [[7 6 2 4 4 2 5 2 3]
                   [2 3 7 6 8 8 2 6 2]
                   [4 1 3 1 3 8 1 3 7]
                   [6 1 1 5 7 2 1 5 8]
                   [8 8 7 6 6 1 8 8 4]
                   [6 1 8 2 1 4 5 1 8]
                   [7 3 4 2 5 6 1 2 7]
                   [4 6 7 5 8 2 8 2 8]
                   [6 6 5 5 6 1 2 6 4]]
                  [[array([[7, 6, 2],
                         [2, 3, 7],
                         [4, 1, 3]]), array([[4, 4, 2],
                         [6, 8, 8],
                         [1, 3, 8]]), array([[5, 2, 3],
                         [2, 6, 2],
                         [1, 3, 7]])], [array([[6, 1, 1],
                         [8, 8, 7],
                         [6, 1, 8]]), array([[5, 7, 2],
                         [6, 6, 1],
                         [2, 1, 4]]), array([[1, 5, 8],
                         [8, 8, 4],
                         [5, 1, 8]])], [array([[7, 3, 4],
                         [4, 6, 7],
                         [6, 6, 5]]), array([[2, 5, 6],
                         [5, 8, 2],
                         [5, 6, 1]]), array([[1, 2, 7],
                         [8, 2, 8],
                         [2, 6, 4]])]]
                  

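                  【Discussion】:

                    【Solution 10】:

                    I'm posting my solution. Note that this code doesn't actually create copies of the original array, so it works well with big data. Moreover, it doesn't crash if the array cannot be divided evenly (but you can easily add a condition for that by deleting ceil and checking whether v_slices and h_slices are divided without remainder).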

                    import numpy as np
                    from math import ceil
                    
                    a = np.arange(9).reshape(3, 3)
                    
                    p, q = 2, 2
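                    width, height = a.shape  # note: a.shape is (rows, columns)
                    
                    v_slices = ceil(width / p)   # number of blocks along axis 0 (rows)
                    h_slices = ceil(height / q)  # number of blocks along axis 1 (columns)
                    
                    for v in range(v_slices):
                        for h in range(h_slices):
                            block = a[v * p : v * p + p, h * q : h * q + q]
                            # do something with a block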
                    

                    This code turns (or, more precisely, gives you direct access to parts of) this array:

                    [[0 1 2]
                     [3 4 5]
                     [6 7 8]]
                    

                    into this:

                    [[0 1]
                     [3 4]]
                    [[2]
                     [5]]
                    [[6 7]]
                    [[8]]
                    

                    If you need actual copies, Aenaon's code is what you are looking for.

                    If you are sure that the big array can be divided evenly, you can use the numpy splitting tools.
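
The stricter variant mentioned above (dropping ceil and rejecting shapes that leave a remainder) could be sketched roughly as follows. The function name `iter_blocks_strict` and the choice of `ValueError` are assumptions made for this illustration.

```python
import numpy as np

def iter_blocks_strict(a, p, q):
    """Hypothetical strict variant: yield p-by-q slices, refusing uneven splits."""
    rows, cols = a.shape
    if rows % p or cols % q:
        raise ValueError(f"shape {a.shape} is not divisible into {p}x{q} blocks")
    for i in range(rows // p):          # block index along axis 0
        for j in range(cols // q):      # block index along axis 1
            yield a[i * p:(i + 1) * p, j * q:(j + 1) * q]

a = np.arange(24).reshape(4, 6)
blocks = list(iter_blocks_strict(a, 2, 3))
print(len(blocks))

try:
    list(iter_blocks_strict(np.arange(9).reshape(3, 3), 2, 2))
except ValueError as exc:
    print("rejected:", exc)
```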

                    【Discussion】:

                      【Solution 11】:

                      Adding to @Aenaon's answer and his blockfy function: if you are working with COLOR IMAGES/3D ARRAYS, here is my pipeline to create 224 x 224 crops for 3-channel input

                      def blockfy(a, p, q):
                          '''
                          Divides array a into subarrays of size p-by-q
                          p: block row size
                          q: block column size
                          '''
                          m = a.shape[0]  #image row size
                          n = a.shape[1]  #image column size
                      
                          # pad array with NaNs so it can be divided by p row-wise and by q column-wise
                          bpr = ((m-1)//p + 1) #blocks per row
                          bpc = ((n-1)//q + 1) #blocks per column
                          M = p * bpr
                          N = q * bpc
                      
                          A = np.nan* np.ones([M,N])
                          A[:a.shape[0],:a.shape[1]] = a
                      
                          block_list = []
                          previous_row = 0
                          for row_block in range(bpr):     # bpr counts block rows (axis 0)
                              previous_row = row_block * p   
                              previous_column = 0
                              for column_block in range(bpc):  # bpc counts block columns (axis 1)
                                  previous_column = column_block * q
                                  block = A[previous_row:previous_row+p, previous_column:previous_column+q]
                      
                                  # remove nan columns and nan rows
                                  nan_cols = np.all(np.isnan(block), axis=0)
                                  block = block[:, ~nan_cols]
                                  nan_rows = np.all(np.isnan(block), axis=1)
                                  block = block[~nan_rows, :]
                      
                                  ## append
                                  if block.size:
                                      block_list.append(block)
                      
                          return block_list
                      

                      Then, extending the above:

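                      # imports assumed by this snippet (not shown in the original answer)
                      import os
                      import numpy as np
                      from skimage import io
                      from PIL import Image
                      
                      for file in os.listdir(path_to_crop):   ### list files in your folder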
                         img = io.imread(path_to_crop + file, as_gray=False) ### open image 
                      
                         r = blockfy(img[:,:,0],224,224)  ### crop blocks of 224 x 224 for red channel
                         g = blockfy(img[:,:,1],224,224)  ### crop blocks of 224 x 224 for green channel
                         b = blockfy(img[:,:,2],224,224)  ### crop blocks of 224 x 224 for blue channel
                      
                         for x in range(0,len(r)):
                             img = np.array((r[x],g[x],b[x])) ### combine each channel into one patch by patch
                      
                             img = img.astype(np.uint8) ### cast back to proper integers
                      
                             img_swap = img.swapaxes(0, 2) ### need to swap axes due to the way things were processed
                             
                             img_swap_2 = img_swap.swapaxes(0, 1) ### do it again
                      
                             Image.fromarray(img_swap_2).save(path_save_crop+str(x)+"bounding" + file,
                                                              format = 'jpeg',
                                                              subsampling=0,
                                                              quality=100) ### save patch with new name etc 
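
As an alternative sketch (not the pipeline above), the channel axis can simply be carried along when slicing, which avoids the per-channel calls, the NaN padding, and the axis swaps, and keeps the original dtype. The helper name `crop_patches` and the synthetic image are assumptions for this illustration; edge patches come out smaller when the image size is not a multiple of the patch size.

```python
import numpy as np

def crop_patches(img, p, q):
    """Illustrative helper: cut an (H, W, C) array into patches of at most p x q pixels."""
    rows, cols = img.shape[:2]
    patches = []
    for top in range(0, rows, p):
        for left in range(0, cols, q):
            # Slicing clips at the border, so edge patches may be smaller.
            patches.append(img[top:top + p, left:left + q, :])
    return patches

# Synthetic stand-in for a loaded RGB image (values are arbitrary).
img = np.zeros((500, 300, 3), dtype=np.uint8)
patches = crop_patches(img, 224, 224)
print(len(patches), patches[0].shape, patches[-1].shape)
```

Each patch is already in (height, width, channel) order, so it could be handed to an image writer without further reshaping.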
                      

                      【Discussion】:
