【Question title】: How to optimize this Tensorflow weighted average custom op
【Posted】: 2019-05-10 04:47:18
【Question description】:

I'm trying to implement a Tensorflow op that performs the weighted average described in this paper: https://thijsvogels.nl/kpcn/bako2017kpcn.pdf

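The op computes a weighted average for every output pixel of the image: each weight is multiplied by the value of one neighboring pixel, and the products are summed.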

I'd appreciate any suggestions for optimizing this code, because the current implementation is slow.

inputs.shape is [1, 740, 1300, 3]

weights.shape is [1, 720, 1280, 441]
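For reference, the kernel geometry implied by these two shapes can be checked with the same arithmetic the function uses. This is only a small sketch; the helper name `kernel_geometry` is made up for illustration and is not part of the original code.

```python
def kernel_geometry(in_shape, w_shape):
    """Return (pad, kernel_size, taps) implied by NHWC input and weight shapes."""
    # Same formula as in weighted_average: the border lost on each side.
    pad = (in_shape[1] - w_shape[1]) // 2
    kernel_size = 2 * pad + 1
    # One weight per position in the kernel_size x kernel_size neighborhood.
    return pad, kernel_size, kernel_size * kernel_size


pad, kernel_size, taps = kernel_geometry([1, 740, 1300, 3], [1, 720, 1280, 441])
print(pad, kernel_size, taps)  # 10 21 441
```

So each output pixel reads a 21 x 21 neighborhood, and the 441 entries on the last axis of weights are one weight per neighbor.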

def weighted_average(inputs, weights):
    with tf.name_scope("weighted_average", "weighted_average", [inputs, weights]) as scope:
        in_shape = inputs.get_shape().as_list()
        w_shape = weights.get_shape().as_list()

        n_channels = in_shape[3]
        xs = tf.split(inputs, n_channels, axis=3)

        pad = (in_shape[1] - w_shape[1]) // 2

        kernel_size = pad * 2 + 1

        for index in range(n_channels):
            x = xs[index]

            x_stack = []
            for i in range(kernel_size):
                for j in range(kernel_size):
                    x_stack.append( x[:, i:x.shape[1] - 2 * pad + i, j:x.shape[2] - 2 * pad + j, :] )

            x_stack = tf.concat(x_stack, axis=3)
            x = tf.reduce_sum(tf.multiply(x_stack, weights), axis=3, keepdims=True)

            xs[index] = x

        return tf.concat(xs, axis=3)
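One observation about the function above: the same weights are applied to every channel, so the stack of shifted slices does not have to be rebuilt per channel. The sketch below expresses the same computation in NumPy rather than Tensorflow, as a reference for the semantics. It assumes NumPy 1.20 or newer (for `sliding_window_view`), and the name `weighted_average_np` is hypothetical. It has not been benchmarked against the Tensorflow version.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def weighted_average_np(inputs, weights):
    """NumPy reference: inputs [N, H, W, C], weights [N, H-2p, W-2p, k*k]."""
    # kernel_size = 2 * pad + 1, recovered from the height difference.
    k = inputs.shape[1] - weights.shape[1] + 1
    # windows[n, y, x, c, i, j] == inputs[n, y + i, x + j, c]; this is a
    # strided view, so no k*k copies of the image are materialized here.
    windows = sliding_window_view(inputs, (k, k), axis=(1, 2))
    # Weight index i * k + j matches the (i outer, j inner) stacking order
    # used by the loops in weighted_average.
    w = weights.reshape(weights.shape[:3] + (k, k))
    # Sum over the neighborhood for all channels at once.
    return np.einsum("nhwcij,nhwij->nhwc", windows, w)


rng = np.random.default_rng(0)
small_inputs = rng.standard_normal((1, 8, 9, 3))
small_weights = rng.standard_normal((1, 4, 5, 25))
print(weighted_average_np(small_inputs, small_weights).shape)  # (1, 4, 5, 3)
```

Whether the equivalent restructuring helps in Tensorflow depends on how the patches are extracted there, so treat this as a statement of what the op computes rather than as a drop-in replacement.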

【Comments on the question】:

    Tags: python tensorflow


    【Solution 1】:

    Using tf.device('/cpu:0') to force the op to be computed on the CPU, where it runs through the Eigen library, makes it faster.

    When it is computed on the GPU, I think the slowdown may be related to all the tensor transformations.

    def weighted_averagex(inputs, weights):
        with tf.name_scope("weighted_average", "weighted_average", [inputs, weights]) as scope:
          with tf.device('/cpu:0'):
            in_shape = inputs.get_shape().as_list()
            w_shape = weights.get_shape().as_list()
    
            n_channels = in_shape[3]
            xs = tf.split(inputs, n_channels, axis=3)
    
            pad = (in_shape[1] - w_shape[1]) // 2
    
            kernel_size = pad * 2 + 1
    
            for index in range(n_channels):
                x = xs[index]
    
                x_stack = []
                for i in range(kernel_size):
                    for j in range(kernel_size):
                        x_stack.append( x[:, i:x.shape[1] - 2 * pad + i, j:x.shape[2] - 2 * pad + j, :] )
    
                x_stack = tf.concat(x_stack, axis=3)
                x = tf.reduce_sum(tf.multiply(x_stack, weights), axis=3, keepdims=True)
    
                xs[index] = x
    
          return tf.concat(xs, axis=3)
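To give a sense of how much data the tensor transformations move, the size of one x_stack can be estimated from the shapes in the question: it has the same shape as weights, because each of the 441 slices is [1, 720, 1280, 1]. The sketch below assumes float32 (4 bytes per element); the dtype is not stated in the question, and `stacked_bytes` is a hypothetical helper name.

```python
def stacked_bytes(shape, bytes_per_element=4):
    """Bytes needed for a dense tensor of the given shape (float32 assumed)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


size = stacked_bytes([1, 720, 1280, 441])
print(size)                     # 1625702400
print(round(size / 2**30, 2))   # 1.51 (GiB)
```

Under that assumption, each channel iteration concatenates about 1.5 GiB, and the elementwise product with weights is another tensor of the same size before the reduction, so memory traffic is a plausible contributor to the cost on either device.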
    

    【Comments】:
