AI Free Basic Course | Lecture 27 | Building AutoEncoder

 


Look at the image above: it shows a simple vanilla autoencoder network. Whenever the word "vanilla" is used, it means the simplest form of an autoencoder.

When an image is passed to the encoder, it performs two jobs: first, it extracts the features of the image; second, it reduces the image's size. The image is progressively compressed until its dimensions are reduced to the code, also called the bottleneck or latent representation. This is then passed to the decoder, where the reverse process starts: the compressed representation is expanded and the image is regenerated.

We also discussed the concept of denoising in our last lecture, where we gave an input image and generated another image as output. Suppose we add noise to our image by drawing lines over it and then give it to the autoencoder. The burden on the autoencoder has now increased: it must not only extract the features of the image but also deal with the noise separately. In other words, when the model is given a noisy image, it puts in more effort and learns the features better. If the autoencoder does not put in this extra effort to learn the image's features, it will fail to separate out the noise and will reproduce a noisy image at the output.
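The noise-adding step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the lecture: the function name `add_noise` and the `noise_factor` value are illustrative choices, and Gaussian noise stands in for the "drawing lines" example.

```python
import numpy as np

# Hypothetical sketch: corrupt clean images with random noise so a
# denoising autoencoder must learn robust features to recover them.
def add_noise(images, noise_factor=0.5, seed=0):
    rng = np.random.default_rng(seed)
    noisy = images + noise_factor * rng.normal(0.0, 1.0, images.shape)
    # Keep pixel values in the valid [0, 1] range after adding noise.
    return np.clip(noisy, 0.0, 1.0)

clean = np.zeros((4, 28, 28))   # stand-in batch of 28x28 images
noisy = add_noise(clean)
```

During training, the noisy images become the input and the clean images the target, forcing the network to learn features rather than copy pixels.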

To gain command over the coding, we first need to understand the process we are going to implement and the steps we will follow.

For example:

1. We will first select the dataset.

2. Then we will load that dataset.

3. Then we will feed it to the encoder as input.

4. We will define the encoder, bottleneck, and decoder.

5. At the decoder stage, we will follow exactly the reverse of the steps we took at the encoder stage.

6. Next, we have to choose whether to use an ANN or a CNN for the encoder and decoder architecture.
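The first two steps (selecting and loading the dataset) typically also involve basic preprocessing. The sketch below, a hedged illustration rather than lecture code, scales pixel values to [0, 1] and adds a channel axis so a convolution layer can accept the images; in practice the data would come from `tf.keras.datasets.mnist.load_data()`, but a dummy batch stands in here.

```python
import numpy as np

# Minimal preprocessing sketch: normalize 0-255 pixel values to [0, 1]
# and reshape to (batch, height, width, channels) for Conv2D layers.
def preprocess(images):
    images = images.astype("float32") / 255.0
    return images.reshape((-1, 28, 28, 1))

# Dummy batch standing in for real MNIST images.
raw = np.random.randint(0, 256, size=(8, 28, 28), dtype=np.uint8)
x_train = preprocess(raw)
```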

As we have seen, an ANN cannot learn the features of a normal, clean image as well as a CNN can. When we use noisy data in an autoencoder, the network must learn the features of the noisy image even more precisely to handle the noise. This means that if we use an ANN in the autoencoder, its performance will not be satisfactory: it will try to generate images, but they will not be as good as those a CNN would generate. So we will build the autoencoder's architecture with a CNN. In this architecture, we pass an image of size 28 x 28 to the first input layer.

After the input layer we apply a convolution layer, fix the number of filters, and set the size of the kernel that slides over the input image. We also have to choose the activation function. After the convolution layer, a pooling layer is applied.

We repeat this practice of applying a convolution layer (choosing the number of filters and the kernel size) followed by a pooling layer until we reach the bottleneck, where the compressed image is flattened. The information at the bottleneck is the output of the encoder.
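The encoder described above can be sketched with Keras layers as follows. This is a minimal sketch, not the lecture's exact model: the filter counts (16 and 8) and the 3x3 kernel size are illustrative choices.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Encoder sketch: Conv2D extracts features, MaxPooling2D halves the
# spatial size, and Flatten produces the bottleneck vector.
encoder = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2), padding="same"),   # 28x28 -> 14x14
    layers.Conv2D(8, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2), padding="same"),   # 14x14 -> 7x7
    layers.Flatten(),                              # bottleneck: 7*7*8 = 392 values
])
```

Each pooling step halves the spatial dimensions, so two poolings take the 28 x 28 input down to 7 x 7 before flattening.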

At the decoder, we reshape this flattened representation so it can be passed to a convolution layer.

The convolution layer extracts features from the compressed image. At the encoder stage we used max pooling after each convolution layer to reduce the size, so at the decoder stage we must use a layer that increases the size of the image. The layer used for increasing the size of the image is called upsampling. The process is now repeated, applying an upsampling layer after each convolution layer, until we reach the output layer, where our image is regenerated.
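The decoder mirrors the encoder, as sketched below under the same illustrative assumptions (a 392-value bottleneck from a 7x7x8 feature map; filter counts are not fixed by the lecture). `Reshape` undoes the flattening, and `UpSampling2D` reverses what max pooling did.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Decoder sketch: reshape the flattened bottleneck back into a feature
# map, then alternate Conv2D and UpSampling2D until 28x28 is restored.
decoder = keras.Sequential([
    layers.Input(shape=(392,)),          # flattened 7x7x8 bottleneck
    layers.Reshape((7, 7, 8)),
    layers.Conv2D(8, (3, 3), activation="relu", padding="same"),
    layers.UpSampling2D((2, 2)),         # 7x7 -> 14x14
    layers.Conv2D(16, (3, 3), activation="relu", padding="same"),
    layers.UpSampling2D((2, 2)),         # 14x14 -> 28x28
    layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same"),  # regenerated image
])
```

The final sigmoid-activated layer maps each output pixel back into the [0, 1] range of the normalized input images.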

Now we have familiarized ourselves with the process of encoding and decoding, the types of layers to add at each stage, and the sequence in which these layers are added.

After this we just have to decide which functions we are going to use from the Keras/TensorFlow library.
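Putting the pieces together, a complete autoencoder can be built and compiled with the standard Keras calls. This is a hedged end-to-end sketch: the layer sizes, the Adam optimizer, and the binary cross-entropy loss are common conventions for MNIST-style autoencoders, not choices mandated by the lecture.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Full convolutional autoencoder sketch: encoder (Conv2D + MaxPooling2D)
# followed by decoder (Conv2D + UpSampling2D) in one Sequential model.
autoencoder = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    # --- encoder ---
    layers.Conv2D(16, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2), padding="same"),   # 28x28 -> 14x14
    # --- decoder ---
    layers.Conv2D(8, (3, 3), activation="relu", padding="same"),
    layers.UpSampling2D((2, 2)),                   # 14x14 -> 28x28
    layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same"),
])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# For denoising, training pairs noisy inputs with clean targets, e.g.:
# autoencoder.fit(x_noisy, x_clean, epochs=10, batch_size=128)
```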

The autoencoder is a flexible type of architecture. When we use convolution layers, it becomes a convolutional autoencoder. Similarly, if the input to be processed is a time series and we use recurrent layers, it becomes a recurrent autoencoder. So it is an architecture that can be adapted to our requirements to learn a representation. We have now seen that the decoder regenerates the image from highly compressed data; in denoising, the autoencoder recreates the actual image from the latent representation. So the information the model needed was available in that compressed representation, enabling it to regenerate the original image.

This concept of representation will come up again and again in CNNs. In a fully connected neural network, the second-to-last layer creates a representation, and on the basis of this representation a neuron in the last layer classifies the input. A representation is, in effect, the language the computer understands.

The concept of representation can be understood through a daily-life example. Suppose we want to teach a child the concept of a car. One way is to describe the features of a car in a speech: how many tyres it has, how many headlights, steering wheels, and seats it contains. Despite these efforts, the child would not be able to form an idea of the car in his mind. The other approach is to show the child an image of a car; the child would understand immediately.

Similarly, computers learn differently from different representations. Some representations, like the speech about the car above, are very difficult for a computer to work with, and some, like the image of the car, are very easy. In the same way, a computer struggles to produce the desired output when a large quantity of raw data is given as input. However, when the data is presented in compressed form after dimensionality reduction, the computer can understand it and produce the desired results easily.


