Application of CNN in image processing.

24-07-2022
chuong xuan

Features of images on a computer.

Color image: RGB stands for red, green, and blue, the three primary colors of light that appear when white light is split by a prism. Mixing these three colors in different proportions produces other colors.

Adding red to green produces yellow; adding yellow to blue produces white. Source: Wikipedia.

Each set of three integers r, g, b in the range [0, 255] produces a different color. Because there are 256 ways to choose r, 256 ways to choose g, and 256 ways to choose b, the total number of colors that can be created in the RGB color system is 256 * 256 * 256 = 16,777,216. A color image stores three such values for every pixel, so even storing the data of a single photo takes a lot of space, let alone processing it.
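The arithmetic above can be checked with a few lines of Python. This is only a sketch: the 64 x 64 image size is an arbitrary assumption for illustration, and the image is stored as plain nested lists rather than in any particular image library.

```python
# Number of distinct colors with values 0..255 for each of r, g, b.
levels = 256
num_colors = levels ** 3

# A color image stores one (r, g, b) triple per pixel.
# The 64 x 64 size is an arbitrary example, not a value from the article.
height, width = 64, 64
image = [[(255, 0, 0) for _ in range(width)] for _ in range(height)]  # all red

# Three integer values (one per channel) for every pixel.
values_per_image = height * width * 3

print(num_colors)        # 16777216
print(values_per_image)  # 12288
```

Even this tiny image already needs more than twelve thousand numbers, and the count grows with the product of height and width.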

In a plain neural network model, every hidden layer is fully connected. With an input as large as the image data outlined above, the number of parameters becomes very large. Convolution addresses this problem: it needs far fewer parameters and can still extract the features of the image.
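To get a feeling for the difference in parameter counts, here is a rough sketch. All sizes (a 64 x 64 x 3 input, 1000 hidden units, 32 kernels of size 3 x 3) and the helper names `dense_params` and `conv_params` are hypothetical values chosen for illustration, not figures from this article.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units


def conv_params(kernel_size, in_channels, n_kernels):
    """Weights plus biases of one convolutional layer with square kernels."""
    return kernel_size * kernel_size * in_channels * n_kernels + n_kernels


# Hypothetical sizes, for illustration only.
height, width, channels = 64, 64, 3
fc = dense_params(height * width * channels, 1000)  # 12288 * 1000 + 1000
conv = conv_params(3, channels, 32)                 # 3 * 3 * 3 * 32 + 32

print(fc)    # 12289000
print(conv)  # 896
```

The convolutional layer reuses the same small kernels at every position of the image, which is why its parameter count does not depend on the image size.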

Grayscale image

Anyone who has studied for the TOEIC will find pictures like these familiar. In a grayscale image, each pixel is represented by a single integer value in the range [0, 255] instead of an (r, g, b) triple as in a color image. Therefore, a single matrix is enough to represent a grayscale image on a computer.

Convolution magic

To make it easier to picture, I will use a grayscale image as the example, that is, an image represented as a matrix X of size m * n.

We define the kernel W as a square matrix of size k*k, where k is an odd number: k can be 1, 3, 5, 7, 9, and so on. For example, a kernel of size 3*3.

The convolution operation is denoted by ⊗, and we write Y = X ⊗ W.

For each element x_ij of the matrix X, take the submatrix A that has the same size as the kernel W and has x_ij at its center (this is why the kernel size is usually odd). Multiply A and W element-wise, sum the products, and write the result to the corresponding position of the output matrix Y.

Elements near the border of X have no complete k*k neighborhood, so matrix Y is smaller than matrix X. The size of matrix Y is (m-k+1) * (n-k+1).
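The procedure can be sketched in plain Python as follows. The function name `conv2d_valid` and the sample matrices are made up for illustration. The loop indexes each window by its top-left corner instead of its center, which visits exactly the same windows, and the kernel is applied as described here, without flipping it.

```python
def conv2d_valid(x, w):
    """Slide kernel w over image x with no padding and stride 1.

    x is an m x n list of lists, w is a k x k list of lists.
    Returns the (m-k+1) x (n-k+1) matrix Y.
    """
    m, n, k = len(x), len(x[0]), len(w)
    y = []
    for i in range(m - k + 1):
        row = []
        for j in range(n - k + 1):
            # Element-wise product of the k x k window with W, then sum.
            total = 0
            for a in range(k):
                for b in range(k):
                    total += x[i + a][j + b] * w[a][b]
            row.append(total)
        y.append(row)
    return y


X = [[1, 2, 3, 0],
     [4, 5, 6, 1],
     [7, 8, 9, 2],
     [1, 0, 1, 3]]
W = [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1]]

Y = conv2d_valid(X, W)
print(Y)  # [[25, 18], [20, 18]]
```

With m = n = 4 and k = 3, the result has size (4-3+1) * (4-3+1) = 2 * 2, as the size formula predicts.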

Padding

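Simply put, padding is a border of extra values added around the matrix X before the convolution; it can be zero padding (a border of zeros), one padding, and so on. It solves the problem of the shrinking output: with suitable padding, matrix Y has the same size as matrix X.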
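A minimal sketch of zero padding, assuming stride 1 and an odd kernel size k, in which case a border of p = (k-1)/2 zeros keeps the output the same size as the input. The helper name `zero_pad` is hypothetical.

```python
def zero_pad(x, p):
    """Surround matrix x with p rows and columns of zeros on every side."""
    n = len(x[0])
    blank = [0] * (n + 2 * p)
    top = [blank[:] for _ in range(p)]
    bottom = [blank[:] for _ in range(p)]
    middle = [[0] * p + list(row) + [0] * p for row in x]
    return top + middle + bottom


X = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
k = 3
p = (k - 1) // 2  # padding that keeps the output the same size as X

Xp = zero_pad(X, p)

# Output size of a stride-1 convolution on the padded matrix.
out_rows = len(Xp) - k + 1
out_cols = len(Xp[0]) - k + 1

print(Xp[1])               # [0, 1, 2, 3, 0]
print(out_rows, out_cols)  # 3 3
```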

Stride

So far the kernel moves across the matrix one element at a time, from left to right and from top to bottom. We can change the stride to make it jump further at each step. With stride = s (s > 1), we only perform convolution on the elements x_{1+i*s, 1+j*s}, for i, j = 0, 1, 2, and so on.

To summarize, we have the following general formula for the size of the feature map:

For a matrix X of size m*n with a kernel of size k*k, stride = s, and padding = p, the matrix Y has size (⌊(m-k+2p)/s⌋ + 1) * (⌊(n-k+2p)/s⌋ + 1).
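Stride and padding can be combined in one small sketch. It assumes zero padding and the usual output-size rule floor((m - k + 2p) / s) + 1 along each axis; the function names `output_size` and `conv2d` are made up for this example.

```python
def output_size(m, k, s, p):
    """Number of kernel positions along one axis of length m."""
    return (m - k + 2 * p) // s + 1


def conv2d(x, w, stride=1, padding=0):
    """Convolution with zero padding and a stride, on lists of lists."""
    m, n, k = len(x), len(x[0]), len(w)
    # Zero-pad: `padding` extra rows and columns on every side.
    padded = [[0] * (n + 2 * padding) for _ in range(m + 2 * padding)]
    for i in range(m):
        for j in range(n):
            padded[i + padding][j + padding] = x[i][j]
    rows = output_size(m, k, stride, padding)
    cols = output_size(n, k, stride, padding)
    y = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # The top-left corner of the window moves in jumps of `stride`.
            r, c = i * stride, j * stride
            y[i][j] = sum(padded[r + a][c + b] * w[a][b]
                          for a in range(k) for b in range(k))
    return y


X = [[1] * 5 for _ in range(5)]  # 5 x 5 image of ones
W = [[1] * 3 for _ in range(3)]  # 3 x 3 kernel of ones

Y = conv2d(X, W, stride=2, padding=1)
print(Y)  # [[4, 6, 4], [6, 9, 6], [4, 6, 4]]
```

Here m = n = 5, k = 3, s = 2, p = 1, so each side of Y has floor((5-3+2)/2) + 1 = 3 elements. The corner values are smaller because part of each corner window falls on the zero border.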

Meaning of convolution.

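Most of us have used photo-editing software at some point; Photoshop (pts), for example, supports a lot of photo editing functions. Many of these functions, such as blurring, sharpening, and edge detection, are based on convolution. Thus, the convolution operation edits an image, turning the input image into another image, and the kernel determines which effect is applied.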
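As a small illustration of how the kernel decides the effect, the sketch below applies two kernels to the same tiny grayscale image. Both kernels (a Prewitt-style vertical-edge kernel and an identity kernel) and the 5 x 6 test image are common textbook examples chosen here as assumptions, not taken from this article.

```python
def apply_kernel(x, w):
    """Stride-1, no-padding convolution of image x with kernel w."""
    k = len(w)
    return [[sum(x[i + a][j + b] * w[a][b]
                 for a in range(k) for b in range(k))
             for j in range(len(x[0]) - k + 1)]
            for i in range(len(x) - k + 1)]


# Tiny gray image: dark left half (0), bright right half (255).
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]

edge = [[-1, 0, 1],
        [-1, 0, 1],
        [-1, 0, 1]]      # responds to vertical edges
identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]   # copies the center pixel, leaving the image unchanged

edges = apply_kernel(image, edge)
same = apply_kernel(image, identity)

print(edges[0])  # [0, 765, 765, 0]
print(same[0])   # [0, 0, 255, 255]
```

The edge kernel gives a large response only where dark meets bright and zero in the flat regions, while the identity kernel returns the interior of the image as it was. A real editor would also clip the results back into the range [0, 255].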