【Question Title】: How to manually implement padding for PyTorch convolutions
【Posted】: 2020-12-10 00:16:20
【Question】:

I'm trying to port some PyTorch code to TensorFlow 2.0 and am having a hard time figuring out how to translate the convolution functions between the two. The way the two libraries handle padding is the sticking point. Basically, I want to understand how to manually produce the padding that PyTorch does under the hood, so that I can translate it to TensorFlow.

The code below works if I don't do any padding, but as soon as any padding is added I can't figure out how to make the two implementations match.

output_padding = SOME NUMBER
padding = SOME OTHER NUMBER
strides = 128

tensor = np.random.rand(2, 258, 249)
filters = np.random.rand(258, 1, 256)

out_torch = F.conv_transpose1d(
    torch.from_numpy(tensor).float(),
    torch.from_numpy(filters).float(),
    stride=strides,
    padding=padding,
    output_padding=output_padding)

def pytorch_transpose_conv1d(inputs, filters, strides, padding, output_padding):
    N, L_in = inputs.shape[0], inputs.shape[2]
    out_channels, kernel_size = filters.shape[1], filters.shape[2]
    time_out = (L_in - 1) * strides - 2 * padding + (kernel_size - 1) + output_padding + 1
    padW = (kernel_size - 1) - padding
    
    # HOW DO I PAD HERE TO GET THE SAME OUTPUT AS IN PYTORCH
    inputs = tf.pad(inputs, [(?, ?), (?, ?), (?, ?)])

    return tf.nn.conv1d_transpose(
        inputs,
        tf.transpose(filters, perm=(2, 1, 0)),
        output_shape=(N, out_channels, time_out),
        strides=strides,
        padding="VALID",
        data_format="NCW")

out_tf = pytorch_transpose_conv1d(tensor, filters, strides, padding, output_padding)
assert np.allclose(out_tf.numpy(), out_torch.numpy())

【Question Comments】:

    Tags: python tensorflow pytorch tensorflow2.0


    【Solution 1】:

    Padding


    To translate the convolution and transposed convolution functions (with padding) between PyTorch and TensorFlow, we first need to understand F.pad() and tf.pad().

    torch.nn.functional.pad(input, pad, mode='constant', value=0):

    • pad: describes the padding sizes for some dimensions of input, starting from the last dimension.
    • To pad only the last dimension of the input tensor, pad has the form (padding_left, padding_right).
    • To pad the last 3 dimensions, use (padding_left, padding_right, padding_top, padding_bottom, padding_front, padding_back).

    tensorflow.pad(input, paddings, mode='CONSTANT', constant_values=0, name=None)

    • paddings: an integer tensor of shape [n, 2], where n is the rank of input. For each dimension D of input, paddings[D, 0] indicates how many values to add before the contents of the tensor in that dimension, and paddings[D, 1] indicates how many values to add after the contents of the tensor in that dimension.

    Here is a table (an image in the original answer) of F.pad and tf.pad equivalents, along with the output tensors, for the input tensor
    [[[1, 1], [1, 1]]] of shape (1, 2, 2).
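    Since the table above was an image, here is a small NumPy sketch of the same mapping. `torch_pad_to_tf` is an illustrative helper (not part of either library) that converts a torch-style flat pad list into the `[rank, 2]` pair format that `tf.pad` expects; `np.pad` happens to accept the same pair format, so we can demonstrate the conversion without TensorFlow installed:

```python
import numpy as np

def torch_pad_to_tf(pad, rank):
    # torch's F.pad takes a flat list starting from the LAST dimension:
    # (left, right, top, bottom, ...). tf.pad takes [rank, 2] pairs in
    # FIRST-to-last dimension order, so pair up and reverse, then pad
    # the leading (untouched) dimensions with [0, 0].
    pairs = [list(p) for p in zip(pad[::2], pad[1::2])]  # last-dim first
    pairs = pairs[::-1]                                  # first-dim first
    return [[0, 0]] * (rank - len(pairs)) + pairs

x = np.array([[[1, 1], [1, 1]]])        # shape (1, 2, 2), as in the table
spec = torch_pad_to_tf([1, 1], rank=3)  # equivalent of F.pad(x, [1, 1])
print(spec)                             # [[0, 0], [0, 0], [1, 1]]
padded = np.pad(x, spec)                # same [n, 2] format as tf.pad
print(padded.shape)                     # (1, 2, 4)
```

The same helper handles the last-3-dimensions case: `torch_pad_to_tf([1, 2, 3, 4], rank=4)` gives `[[0, 0], [0, 0], [3, 4], [1, 2]]`.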


    Padding in convolutions


    Now let's move on to PyTorch padding in convolution layers.

    1. F.conv1d(input, ..., padding, ...):

      • padding controls the amount of implicit zero padding on both sides, given as a number of points.
      • padding=(size) applies F.pad(input, [size, size]), i.e. it pads the last dimension with (size, size); equivalent to tf.pad(input, [[0, 0], [0, 0], [size, size]])
    2. F.conv2d(input, ..., padding, ...):

      • padding=(size) applies F.pad(input, [size, size, size, size]), i.e. it pads the last 2 dimensions with (size, size); for an NCHW input this is equivalent to tf.pad(input, [[0, 0], [0, 0], [size, size], [size, size]])
      • padding=(size1, size2) applies F.pad(input, [size2, size2, size1, size1]); equivalent to tf.pad(input, [[0, 0], [0, 0], [size1, size1], [size2, size2]])
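    As a sanity check on the conv1d rule above, here is a minimal NumPy sketch (no PyTorch or TensorFlow required): a naive VALID cross-correlation, where padding=p is reproduced by zero-padding the input on both sides first. The helper names are illustrative, not library APIs:

```python
import numpy as np

def conv1d_valid(x, w):
    # naive VALID cross-correlation; x: (W,), w: (K,)
    K = len(w)
    return np.array([np.dot(x[i:i + K], w) for i in range(len(x) - K + 1)])

def conv1d_padded(x, w, padding):
    # padding=p is equivalent to zero-padding p points on both sides
    # of the last dimension, then running a VALID convolution
    return conv1d_valid(np.pad(x, (padding, padding)), w)

x = np.arange(5.0)               # [0, 1, 2, 3, 4]
w = np.ones(3)
print(conv1d_padded(x, w, 1))    # [1. 3. 6. 9. 7.]
```

The output length is (5 + 2*1) - 3 + 1 = 5, matching the usual conv output-size formula.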

    Padding in transposed convolutions


    PyTorch padding in transposed convolution layers:

    1. F.conv_transpose1d(input, ..., padding, output_padding, ...):
      • dilation * (kernel_size - 1) - padding zeros of padding will be added to both sides of each dimension in the input.
      • Padding in a transposed convolution can be seen as allocating fake outputs that are then removed.
      • output_padding controls the additional size added to one side of the output shape.
      • Check this to understand what exactly happens during a transposed convolution in PyTorch.
      • Here is the formula to compute the output size of a transposed convolution:

    output_size = (input_size - 1) * stride + (kernel_size - 1) + 1 + output_padding - 2 * padding
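    That formula can be checked directly in plain Python. `tconv_out_size` is just an illustrative helper encoding it (with dilation defaulting to 1, as assumed in the formula above):

```python
def tconv_out_size(input_size, kernel_size, stride=1, padding=0,
                   output_padding=0, dilation=1):
    # output_size = (input_size - 1)*stride + dilation*(kernel_size - 1)
    #               + 1 + output_padding - 2*padding
    return ((input_size - 1) * stride + dilation * (kernel_size - 1)
            + 1 + output_padding - 2 * padding)

# shapes used in the results section below: W=249, kernel 7,
# stride=6, padding=9, output_padding=4
print(tconv_out_size(249, 7, stride=6, padding=9, output_padding=4))  # 1481
```

With padding=0 and output_padding=0 this reduces to the "valid" output size (input_size - 1)*stride + (kernel_size - 1) + 1 used in the code below.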


    Code


    Transposed convolution

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import tensorflow as tf
    import numpy as np
    
    # to stop a tf check-failed error; not relevant to the actual code
    import os
    os.environ["CUDA_DEVICE_ORDER"]    = "PCI_BUS_ID"   
    os.environ["CUDA_VISIBLE_DEVICES"] = "1"
    
    
    
    
    def tconv(tensor, filters, output_padding=0, padding=0, strides=1):
        '''
        tensor         : input tensor of shape (batch_size, channels, W) i.e (NCW)
        filters        : input kernel of shape (in_ch, out_ch, kernel_size)
        output_padding : single number must be smaller than either stride or dilation
        padding        : single number should be less or equal to ((valid output size + output padding) // 2)
        strides        : single number
        '''
        bs, in_ch, W = tensor.shape
        in_ch, out_ch, k_sz = filters.shape
        
        out_torch = F.conv_transpose1d(torch.from_numpy(tensor).float(), 
                                       torch.from_numpy(filters).float(),
                                       stride=strides, padding=padding, 
                                       output_padding=output_padding)
        out_torch = out_torch.numpy()
     
    # output_size = (input_size - 1)*stride + (kernel_size - 1) + 1 + output_padding - 2*padding
        # valid out size -> padding=0, output_padding=0 
        # -> valid_out_size =  (input_size - 1)*stride + (kerenel_size - 1) + 1
        out_size  = (W - 1)*strides + (k_sz - 1) + 1 
    
        # input shape -> (batch_size, W, in_ch) and filters shape -> (kernel_size, out_ch, in_ch) for tf conv
        valid_tf  = tf.nn.conv1d_transpose(np.transpose(tensor, axes=(0, 2, 1)), 
                                           np.transpose(filters, axes=(2, 1, 0)), 
                                           output_shape=(bs, out_size, out_ch), 
                                           strides=strides, padding='VALID', 
                                           data_format='NWC')
        # output padding
        tf_outpad = tf.pad(valid_tf, [[0, 0], [0, output_padding], [0, 0]])
        # NWC to NCW
        tf_outpad = np.transpose(tf_outpad, (0, 2, 1))
    
        # padding -> input, begin, shape -> remove `padding` elements on both side
        out_tf    = tf.slice(tf_outpad, [0, 0, padding], [bs, out_ch, tf_outpad.shape[2]-2*padding])
    
        out_tf    = np.array(out_tf)
    
        print('output size(tf, torch):', out_tf.shape, out_torch.shape)
        # print('out_torch:\n', out_torch)
        # print('out_tf:\n', out_tf)
        print('outputs are close:', np.allclose(out_tf, out_torch))
    
    
    
    tensor  = np.random.rand(2, 1, 7)
    filters = np.random.rand(1, 2, 3)
    tconv(tensor, filters, output_padding=2, padding=5, strides=3)
    

    Results

    >>> tensor  = np.random.rand(2, 258, 249)
    >>> filters = np.random.rand(258, 1, 7)
    >>> tconv(tensor, filters, output_padding=4, padding=9, strides=6)
    output size(tf, torch): (2, 1, 1481) (2, 1, 1481)
    outputs are close: True
    

    Some useful links:

    1. PyTorch 'SAME' convolution

    2. How PyTorch transposed convolution works

    【Discussion】:

    • Great answer!