What is the difference between conv2d and Conv2D in Keras?

【Question title】: What is the difference between conv2d and Conv2D in Keras?
【Posted】: 2019-09-19 08:55:51
【Question】:

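I am confused about Conv2D and conv2d in Keras. What is the difference between them? I think the first is a layer and the second is a backend function, but what does that mean? With Conv2D we pass the number of filters, the filter size, and the strides (Conv2D(64, (3, 3), strides=(8, 8))(input)), but with conv2d we write conv2d(input, kernel, strides=(8, 8)). What is the kernel here: is it (64, 3, 3), with the number of filters and their size combined? Where should I specify the number of kernels? Could you please help me with this? Thank you.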

The code in PyTorch:

def apply_conv(self, image, filter_type: str):

        if filter_type == 'dct':
            filters = self.dct_conv_weights
        elif filter_type == 'idct':
            filters = self.idct_conv_weights
        else:
            raise ValueError('Unknown filter_type value.')

        image_conv_channels = []
        for channel in range(image.shape[1]):
            image_yuv_ch = image[:, channel, :, :].unsqueeze_(1)
            image_conv = F.conv2d(image_yuv_ch, filters, stride=8)
            image_conv = image_conv.permute(0, 2, 3, 1)
            image_conv = image_conv.view(image_conv.shape[0], image_conv.shape[1], image_conv.shape[2], 8, 8)
            image_conv = image_conv.permute(0, 1, 3, 2, 4)
            image_conv = image_conv.contiguous().view(image_conv.shape[0],
                                                  image_conv.shape[1]*image_conv.shape[2],
                                                  image_conv.shape[3]*image_conv.shape[4])

            image_conv.unsqueeze_(1)

            # image_conv = F.conv2d()
            image_conv_channels.append(image_conv)

        image_conv_stacked = torch.cat(image_conv_channels, dim=1)

        return image_conv_stacked

The code converted to Keras:

def apply_conv(self, image, filter_type: str):

        if filter_type == 'dct':
            filters = self.dct_conv_weights
        elif filter_type == 'idct':
            filters = self.idct_conv_weights
        else:
            raise ValueError('Unknown filter_type value.')
        print(image.shape)

        image_conv_channels = []
        for channel in range(image.shape[1]):
            print(image.shape)
            print(channel)
            image_yuv_ch = K.expand_dims(image[:, channel, :, :],1)
            print( image_yuv_ch.shape)
            print(filters.shape)
            image_conv = Kr.backend.conv2d(image_yuv_ch,filters,strides=(8,8),data_format='channels_first')
            image_conv = Kr.backend.permute_dimensions(image_conv,(0, 2, 3, 1))
            image_conv = Kr.backend.reshape(image_conv,(image_conv.shape[0], image_conv.shape[1], image_conv.shape[2], 8, 8))
            image_conv =  Kr.backend.permute_dimensions(image_conv,(0, 1, 3, 2, 4))
            image_conv = Kr.backend.reshape(image_conv,(image_conv.shape[0],
                                                  image_conv.shape[1]*image_conv.shape[2],
                                                  image_conv.shape[3]*image_conv.shape[4]))

            image_conv = Kr.backend.expand_dims(image_conv, 1)

            # image_conv = F.conv2d()
            image_conv_channels.append(image_conv)

        image_conv_stacked = Kr.backend.concatenate(image_conv_channels, axis=1)

        return image_conv_stacked

But when I run the code, it produces the following error:

Traceback (most recent call last):

  File "", line 383, in <module>
    decoded_noise=JpegCompression()(act11)#16
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\keras\engine\base_layer.py", line 457, in __call__
    output = self.call(inputs, **kwargs)
  File "", line 169, in call
    image_dct = self.apply_conv(noised_image, 'dct')
  File "", line 132, in apply_conv
    image_conv = Kr.backend.conv2d(image_yuv_ch,filters,strides=(8,8),data_format='channels_first')
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\keras\backend\tensorflow_backend.py", line 3650, in conv2d
    data_format=tf_data_format)
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 779, in convolution
    data_format=data_format)
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 839, in __init__
    filter_shape[num_spatial_dims]))

ValueError: number of input channels does not match corresponding dimension of filter, 1 != 8

New code:

for channel in range(image.shape[1]):
            image_yuv_ch = K.expand_dims(image[:, channel, :, :],axis=1)
            image_yuv_ch = K.permute_dimensions(image_yuv_ch, (0, 2, 3, 1))
            image_conv = tf.keras.backend.conv2d(image_yuv_ch,kernel=filters,strides=(8,8),padding='same')
            image_conv = tf.keras.backend.reshape(image_conv,(image_conv.shape[0],image_conv.shape[1], image_conv.shape[2],8,8))

Error:

Traceback (most recent call last):

  File "", line 263, in <module>
    decoded_noise=JpegCompression()(act11)#16
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\keras\engine\base_layer.py", line 457, in __call__
    output = self.call(inputs, **kwargs)
  File "", line 166, in call
    image_dct = self.apply_conv(noised_image, 'dct')
  File "", line 128, in apply_conv
    image_conv = tf.keras.backend.reshape(image_conv,(image_conv.shape[0],image_conv.shape[1], image_conv.shape[2],8,8))
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\tensorflow\python\keras\backend.py", line 2281, in reshape
    return array_ops.reshape(x, shape)
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 6482, in reshape
    "Reshape", tensor=tensor, shape=shape, name=name)
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 513, in _apply_op_helper
    raise err
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 510, in _apply_op_helper
    preferred_dtype=default_dtype)
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1146, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\constant_op.py", line 229, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\constant_op.py", line 208, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "D:\software\Anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\tensor_util.py", line 531, in make_tensor_proto
    "supported type." % (type(values), values))

TypeError: Failed to convert object of type <class 'tuple'> to Tensor. Contents: (Dimension(None), Dimension(4), Dimension(4), 8, 8). Consider casting elements to a supported type.

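【Question comments】:

See "Merge" versus "merge", what is the difference?. Names that start with a lowercase letter are functions that take one or more tensors plus parameters and return another tensor. Names that start with an uppercase letter are layers: they do not take an input tensor directly, but instead produce a callable that takes a tensor and returns a new one.
Thank you. Now I have a tensor of shape (:,1,32,32) and filters of shape (64,1,8,8). If I use conv2d(image, filters), will that work, or do the filter and image shapes have to match in some way? I need Keras to treat this as 64 filters of size 8x8, and I am not sure conv2d(image, filters) does that. Could you please help me?
If you already have an image tensor and a filter tensor, then use tf.nn.conv2d. With the Keras layers you only specify the filter size, and Keras creates the filters for you internally. In any case, your data does not seem to be in the default format (I suppose the image is (batch, channels, height, width) and the filters are (out_channels, in_channels, height, width)?). See the data_format parameter of these functions and, if needed, use tf.transpose.
Yes, the image shape is (batch, 3,32,32), and I need to convolve the image with special filters that I built beforehand: 64 filters of size 8x8. What should I do? Is it possible to pass the filters to conv2d?
Sorry, I use Keras, so should I use keras.backend.conv2d instead of tf.nn.conv2d? I have code in PyTorch that I need to convert to Keras. In the PyTorch code the filters start with shape (64,8,8) and then unsqueeze(1) is applied, so I think the shape becomes (64,1,8,8). That is why I said the filter shape is (64,1,8,8). I have added the PyTorch code and my Keras version above.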

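Tags: python tensorflow keras

【Solution 1】:

TensorFlow and Keras now use the channels_last convention by default. So first you should move the channel dimension to the last position with K.permute_dimensions. You can try the code below at colab.research.google.com to figure it out yourself.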
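A rough sketch of why the layout matters, in pure Python and on shapes only. The shapes (10, 1, 32, 32) and (64, 1, 8, 8) follow the ones mentioned in the question comments; the helper names permute and kernel_in_channels are made up for this illustration. Assuming the backend reads a kernel as (height, width, in_channels, out_channels), an unpermuted PyTorch-style (64, 1, 8, 8) kernel would look like it has 8 input channels, which would be consistent with the "1 != 8" in the error message. The same axis orders could be passed to K.permute_dimensions or tf.transpose on real tensors.

```python
def permute(shape, order):
    """Return `shape` with its axes rearranged according to `order`."""
    return tuple(shape[axis] for axis in order)


# Axis orders for the assumed layouts (see the lead-in):
NCHW_TO_NHWC = (0, 2, 3, 1)   # (batch, ch, h, w) -> (batch, h, w, ch)
OIHW_TO_HWIO = (2, 3, 1, 0)   # (out, in, h, w)   -> (h, w, in, out)


def kernel_in_channels(kernel_shape):
    """in_channels as seen by a backend that expects (h, w, in, out)."""
    return kernel_shape[2]


image_nchw = (10, 1, 32, 32)   # PyTorch-style image batch, one channel
kernel_oihw = (64, 1, 8, 8)    # PyTorch-style kernel: 64 filters of 8x8

image_nhwc = permute(image_nchw, NCHW_TO_NHWC)
kernel_hwio = permute(kernel_oihw, OIHW_TO_HWIO)

print(image_nhwc)    # (10, 32, 32, 1)
print(kernel_hwio)   # (8, 8, 1, 64)

# Unpermuted kernel: axis 2 holds 8, so it does not match the 1 image channel.
print(kernel_in_channels(kernel_oihw), "vs image channels", image_nhwc[3])
# Permuted kernel: axis 2 holds 1, which matches.
print(kernel_in_channels(kernel_hwio), "vs image channels", image_nhwc[3])
```

Working on shapes first is a cheap way to check the permutation before touching the real tensors.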

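First question:

conv2d is a function that performs a 2D convolution (see the docs).
keras.layers.Conv2D() returns an instance of the Conv2D class, which performs the convolution when called. See more here.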

# The second 
import keras
conv_layer = keras.layers.Conv2D(filters=64, kernel_size=8, strides=(4, 4), padding='same')

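Basically, they differ in how they are defined and how they are used. K.conv2d is what keras.layers.Conv2D uses internally when conv_layer applies the convolution to some input x, e.g. conv_layer(x).

The example below may help you understand the difference between say_hello and SayHello more easily.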

def say_hello(word, name):
    print(word, name)


class SayHello():

    def __init__(self, word='Hello'):
        self.word = word
        pass

    def __call__(self, name):
        say_hello(self.word, name)


say_hello('Hello', 'Nadia') #Hello Nadia

sayhello = SayHello(word='Hello') # you will get an instance `sayhello` from class SayHello

sayhello('Nadia') # Hello Nadia
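Carrying the same analogy over to convolution, here is a minimal pure-Python sketch, not the Keras implementation. The lowercase conv2d takes the kernels as an argument, while the uppercase Conv2D is configured with a filter count and size and creates its own kernels. Everything here (single-channel nested lists, 'valid' padding, random initial weights, the seed argument) is an assumption made for illustration.

```python
import random


def conv2d(image, kernels, stride=1):
    """Functional form: the caller supplies the kernels explicitly.

    image   -- H x W nested list (single channel)
    kernels -- list of kh x kw nested lists, one per output channel
    Returns one output map per kernel ('valid' padding, no kernel flip).
    """
    height, width = len(image), len(image[0])
    outputs = []
    for kernel in kernels:
        kh, kw = len(kernel), len(kernel[0])
        out = []
        for top in range(0, height - kh + 1, stride):
            row = []
            for left in range(0, width - kw + 1, stride):
                row.append(sum(image[top + i][left + j] * kernel[i][j]
                               for i in range(kh) for j in range(kw)))
            out.append(row)
        outputs.append(out)
    return outputs


class Conv2D:
    """Layer form: configured with a count and a size, owns its kernels."""

    def __init__(self, filters, kernel_size, stride=1, seed=0):
        rng = random.Random(seed)
        self.stride = stride
        # The layer creates its own weights; the caller never passes them in.
        self.kernels = [[[rng.uniform(-1, 1) for _ in range(kernel_size)]
                         for _ in range(kernel_size)]
                        for _ in range(filters)]

    def __call__(self, image):
        # Internally the layer delegates to the functional form.
        return conv2d(image, self.kernels, stride=self.stride)


image = [[1.0] * 4 for _ in range(4)]
ones_kernel = [[1.0, 1.0], [1.0, 1.0]]

maps = conv2d(image, [ones_kernel], stride=2)       # kernel passed explicitly
layer = Conv2D(filters=3, kernel_size=2, stride=2)  # kernels created by the layer
layer_maps = layer(image)

print(maps)             # [[[4.0, 4.0], [4.0, 4.0]]]
print(len(layer_maps))  # 3
```

So if you already have your own filters (as with the DCT weights in the question), the functional form is the natural fit; the layer form is for weights that the framework creates and trains.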

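Second question:

kernel here is a tensor of shape (kernel_size, kernel_size, in_channels, out_channels).
If you want image_conv to have shape (8, 8, 64) per image, then use strides=(4, 4): with padding='same', a 32x32 input and a stride of 4 give 32 / 4 = 8 positions along each axis.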

import tensorflow as tf
import tensorflow.keras.backend as K

image = tf.random.normal((10,3, 32, 32))
print(image.shape) # shape=(10, 3, 32, 32)

channel = 1
image_yuv_ch = K.expand_dims(image[:, channel,:,:], axis=1) # shape=(10, 1, 32, 32)
image_yuv_ch = K.permute_dimensions(image_yuv_ch, (0, 2, 3, 1)) # shape=(10, 32, 32, 1)

# The first K.conv2d
in_channels = 1
out_channels = 64 # same as filters
kernel = tf.random.normal((8, 8, in_channels, out_channels)) # shape=(8, 8, 1, 64)

image_conv = tf.keras.backend.conv2d(image_yuv_ch, kernel=kernel, strides=(4, 4), padding='same')
print(image_conv.shape) #shape=(10, 8, 8, 64)


# The second 
import keras
conv_layer = keras.layers.Conv2D(filters=64, kernel_size=8, strides=(4, 4), padding='same')
image_conv = conv_layer(image_yuv_ch)
print(image_conv.shape) #shape=(10, 8, 8, 64)
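The 8 x 8 spatial size printed above follows from the usual output-size rules for 'same' and 'valid' padding. A small sketch of that arithmetic, assuming those standard rules; the function name conv_output_size is made up for illustration:

```python
import math


def conv_output_size(input_size, kernel_size, stride, padding):
    """Spatial output size of a convolution along one axis."""
    if padding == "same":
        # The input is padded so that only the stride decides the size.
        return math.ceil(input_size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'same' or 'valid'")


print(conv_output_size(32, 8, 4, "same"))   # 8, as in the example above
print(conv_output_size(32, 8, 8, "same"))   # 4, as with strides=(8, 8)
print(conv_output_size(32, 8, 8, "valid"))  # 4
print(conv_output_size(32, 8, 4, "valid"))  # 7
```

With strides=(8, 8), as in the question, each 32x32 channel therefore gives a 4x4 grid of 64 values, which matches the Dimension(4), Dimension(4) in the reshape error.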

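【Comments】:

Thank you very much, David. I changed the code based on your suggestion, but when I try to reshape image_conv it produces the error I mentioned above. Do you know why this happens?
That is a different problem. You get the error because the new shape is inconsistent with the old one. I can't reshape (10, 2) to (100,5), but I can reshape it to (2,5,2) :) You should print the old shape and work out a suitable new one.
When I test the code with some test data, such as image=K.random.randint(4,shape=(2,32,32,1)) and filters= K.random.randint(4 ,shape(8,8,1,64)), it works fine and does not produce this error. But when I test my network and pass the output tensor of a network layer to this function, it produces this error :(((
Your code looks too complicated to me :) You should simplify it first to help yourself understand what is going on behind the scenes :)
OK, I found it :) When I used (image_conv.shape[0],image_conv.shape[1],image_conv.shape[2],8,8), it produced this error because the first three entries are of type Dimension while the two 8s are plain numbers. When I changed the shape to (-1,4,4,8,8), the error was solved.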
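The two points from these comments (element counts must agree, and the target shape should contain plain integers, with -1 for the unknown batch size) can be sketched in pure Python. The helpers same_size and split_last_axis are hypothetical names; in TensorFlow the static shape list could come from tensor.shape.as_list(), which returns ints and None.

```python
from math import prod


def same_size(old_shape, new_shape):
    """A reshape is only possible if the element counts agree."""
    return prod(old_shape) == prod(new_shape)


def split_last_axis(static_shape, tail):
    """Build a reshape target that splits the last axis into `tail`.

    static_shape -- list of ints or None (None = unknown, e.g. batch size)
    tail         -- tuple of plain ints that replaces the last axis
    Unknown dimensions become -1 so that the backend can infer them.
    """
    *leading, last = static_shape
    if last is not None and prod(tail) != last:
        raise ValueError(f"cannot split axis of size {last} into {tail}")
    target = tuple(-1 if dim is None else int(dim) for dim in leading)
    target += tuple(tail)
    if target.count(-1) > 1:
        raise ValueError("reshape can infer at most one unknown dimension")
    return target


print(same_size((10, 2), (2, 5, 2)))   # True
print(same_size((10, 2), (100, 5)))    # False

# Conv output with an unknown batch size and 64 channels, split into 8 x 8:
print(split_last_axis([None, 4, 4, 64], (8, 8)))   # (-1, 4, 4, 8, 8)
```

The resulting tuple is the kind of value that could be passed as the target shape to K.reshape, as in the fix described in the last comment.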