Understanding convolution in TensorFlow
Points to note
The convolution ops sweep a 2-D filter over a batch of images, applying the filter to each window of each image of the appropriate size. The different ops trade off between generic vs. specific filters:
conv2d: Arbitrary filters that can mix channels together.
depthwise_conv2d: Filters that operate on each channel independently.
separable_conv2d: A depthwise spatial filter followed by a pointwise filter.
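To make the trade-off between the three ops more concrete, here is a small sketch that counts the filter weights each one needs. It relies on the filter shapes given in the TensorFlow docs: `[filter_height, filter_width, in_channels, out_channels]` for conv2d, `[filter_height, filter_width, in_channels, channel_multiplier]` for depthwise_conv2d, and a depthwise filter plus a `[1, 1, in_channels * channel_multiplier, out_channels]` pointwise filter for separable_conv2d. The function names and the example sizes are my own, chosen only for illustration.

```python
def conv2d_params(fh, fw, in_c, out_c):
    # Filter shape [fh, fw, in_c, out_c]:
    # every output channel mixes all input channels.
    return fh * fw * in_c * out_c


def depthwise_conv2d_params(fh, fw, in_c, channel_multiplier):
    # Filter shape [fh, fw, in_c, channel_multiplier]:
    # each input channel is filtered on its own, no mixing.
    return fh * fw * in_c * channel_multiplier


def separable_conv2d_params(fh, fw, in_c, channel_multiplier, out_c):
    # Depthwise spatial filter first ...
    depthwise = depthwise_conv2d_params(fh, fw, in_c, channel_multiplier)
    # ... then a 1x1 pointwise filter that mixes the channels.
    pointwise = 1 * 1 * (in_c * channel_multiplier) * out_c
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3x3 filter, 32 input channels, 64 output channels.
    print("conv2d          :", conv2d_params(3, 3, 32, 64))
    print("depthwise_conv2d:", depthwise_conv2d_params(3, 3, 32, 1))
    print("separable_conv2d:", separable_conv2d_params(3, 3, 32, 1, 64))
```

For this example the generic conv2d filter holds 18432 weights, while the separable version needs 2336, which is the sense in which the ops trade generality for specificity.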
from:
Neural Network | TensorFlow
https://www.tensorflow.org/api_guides/python/nn#Convolution
Explanation of the equation in API of tf.nn.conv2d
b, highlighted in yellow, traverses the images in the batch.
The variables highlighted in green locate a patch within an image.
In the equation, di and dj are loop variables that traverse the height and width of the patch within one image.
q traverses the input channels of the image. No stride applies to it: the documentation requires strides[0] = strides[3] = 1, so q simply visits every channel in turn.
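To check my reading of the index variables, here is a deliberately naive pure-Python sketch of the equation from the tf.nn.conv2d docs, `output[b, i, j, k] = sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] * filter[di, dj, q, k]`. It is not how TensorFlow computes the result. It assumes nested lists in NHWC layout and VALID padding only, and the name `conv2d_naive` is my own.

```python
def conv2d_naive(inputs, filters, strides=(1, 1, 1, 1)):
    """Naive conv2d with VALID padding.

    inputs:  nested lists indexed [batch][height][width][in_channels]
    filters: nested lists indexed [fh][fw][in_channels][out_channels]
    """
    batch = len(inputs)
    in_h, in_w = len(inputs[0]), len(inputs[0][0])
    fh, fw = len(filters), len(filters[0])
    in_c, out_c = len(filters[0][0]), len(filters[0][0][0])

    # Number of patch positions that fit along each spatial axis.
    out_h = (in_h - fh) // strides[1] + 1
    out_w = (in_w - fw) // strides[2] + 1

    output = [[[[0.0] * out_c for _ in range(out_w)]
               for _ in range(out_h)]
              for _ in range(batch)]

    for b in range(batch):                  # which image in the batch
        for i in range(out_h):              # patch row
            for j in range(out_w):          # patch column
                for k in range(out_c):      # output channel
                    total = 0.0
                    for di in range(fh):            # row inside the patch
                        for dj in range(fw):        # column inside the patch
                            for q in range(in_c):   # input channel, no stride
                                total += (
                                    inputs[b][strides[1] * i + di][strides[2] * j + dj][q]
                                    * filters[di][dj][q][k]
                                )
                    output[b][i][j][k] = total
    return output


if __name__ == "__main__":
    # One 3x3 single-channel image holding the values 1..9.
    image = [[[[1.0], [2.0], [3.0]],
              [[4.0], [5.0], [6.0]],
              [[7.0], [8.0], [9.0]]]]
    # One 2x2 filter of ones: each output is the sum of a 2x2 patch.
    kernel = [[[[1.0]], [[1.0]]],
              [[[1.0]], [[1.0]]]]
    print(conv2d_naive(image, kernel))
```

The loops over b, i and j pick the image and the patch, di and dj walk inside the patch, and q sums over the input channels, so each output value is one patch multiplied elementwise with the filter and added up.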
ref:
tf.nn.conv2d | TensorFlow
https://www.tensorflow.org/api_docs/python/tf/nn/conv2d
tf.nn.conv[1d, 2d, 3d]
There are tf.nn.conv1d, tf.nn.conv2d and tf.nn.conv3d. The questions are which one to choose and how they differ.
I have not studied this in depth. Since I am using TensorFlow to process images, tf.nn.conv2d seems to be the one for me. From my perspective, the differences are as follows.
- tf.nn.conv1d is for data with a linear (1-D) structure, such as text or sound.
- tf.nn.conv2d is for data with a 2-D structure, such as images.
- tf.nn.conv3d is for data with a 3-D structure, such as video or volumetric scans. I have not used it myself.
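One way to see the difference is through the tensor shapes each op expects. The sketch below lists the default channels-last layouts as I understand them from the docs, and computes the output shape under VALID padding. The helper `valid_output_shape` and the example sizes are hypothetical and not part of TensorFlow; the helper also assumes the same stride on every spatial axis.

```python
# Default (batch first, channels last) layouts for each op.
LAYOUTS = {
    "conv1d": {
        "input": ["batch", "in_width", "in_channels"],
        "filter": ["filter_width", "in_channels", "out_channels"],
    },
    "conv2d": {
        "input": ["batch", "in_height", "in_width", "in_channels"],
        "filter": ["filter_height", "filter_width",
                   "in_channels", "out_channels"],
    },
    "conv3d": {
        "input": ["batch", "in_depth", "in_height", "in_width",
                  "in_channels"],
        "filter": ["filter_depth", "filter_height", "filter_width",
                   "in_channels", "out_channels"],
    },
}


def valid_output_shape(input_shape, filter_shape, stride=1):
    """Output shape for VALID padding, same stride on each spatial axis."""
    spatial_in = input_shape[1:-1]      # drop batch and channels
    spatial_filter = filter_shape[:-2]  # drop in/out channels
    if len(spatial_in) != len(spatial_filter):
        raise ValueError("input and filter ranks do not match")
    if input_shape[-1] != filter_shape[-2]:
        raise ValueError("in_channels of input and filter differ")
    spatial_out = [(n - f) // stride + 1
                   for n, f in zip(spatial_in, spatial_filter)]
    return [input_shape[0]] + spatial_out + [filter_shape[-1]]


if __name__ == "__main__":
    # Illustrative sizes only.
    print(valid_output_shape([8, 100, 16], [5, 16, 32]))              # conv1d
    print(valid_output_shape([8, 28, 28, 3], [3, 3, 3, 16]))          # conv2d
    print(valid_output_shape([8, 10, 28, 28, 1], [3, 3, 3, 1, 4]))    # conv3d
```

The only thing that changes between the three ops is the number of spatial axes between the batch and channel dimensions: one for conv1d, two for conv2d, three for conv3d.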