AI Free Basic Course | Lecture 27 | Building AutoEncoder
Look at the image above: it shows a simple vanilla autoencoder network. Whenever the word "vanilla" is used, it means the simplest form of autoencoder.
When an image is passed to the encoder, the encoder performs two jobs: first, it extracts the features of the image, and second, it reduces the image's size. The image is progressively compressed until its dimensions are reduced to the code, also called the bottleneck or latent representation. This compressed representation is then passed to the decoder, where the reverse process takes place: the representation is expanded, and the image is regenerated.
We also discussed the concept of denoising in our last lecture, where we gave an input image and generated another image as output. Suppose we add noise to our image by drawing lines over it and then give it to the autoencoder. The burden on the autoencoder has now increased: it must not only extract the features of the image but also deal with the noise separately. In other words, when the model is given a noisy image, it puts in more effort and learns the features more thoroughly. If the autoencoder does not put in this extra effort to learn the features of the image, it will fail to separate out the noise and will reproduce it in the output image.
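The noisy input described above can be created synthetically. Below is a minimal sketch in NumPy: it assumes a batch of 28 x 28 grayscale images scaled to [0, 1] (a stand-in for a real dataset), and the noise factor of 0.5 is an illustrative choice, not a value given in the lecture.

```python
import numpy as np

# Stand-in for real image data: a batch of 28x28 grayscale images in [0, 1].
images = np.random.rand(16, 28, 28).astype("float32")

# Add Gaussian noise, then clip back to the valid [0, 1] pixel range.
noise_factor = 0.5  # illustrative strength; tune per dataset
noisy = images + noise_factor * np.random.normal(size=images.shape)
noisy = np.clip(noisy, 0.0, 1.0).astype("float32")
```

The denoising autoencoder would then be trained with `noisy` as input and the clean `images` as the reconstruction target.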
To gain command over the coding, we first need to understand the process we are going to implement; the steps we will follow must be clear before we write any code.
For example:
1. We will first select the dataset.
2. Then we will load that dataset.
3. Then we will feed it to the encoder as input.
4. We will define the encoder, the bottleneck, and the decoder.
5. At the decoder stage, we will follow exactly the reverse of the steps we took at the encoder stage.
6. Next, we have to choose whether to use an ANN or a CNN for the encoder and decoder architecture.
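The first three steps above amount to preparing image data in the shape a CNN encoder expects. A minimal sketch, assuming 28 x 28 grayscale images (random arrays stand in here for a real dataset such as MNIST):

```python
import numpy as np

# Stand-in for a loaded dataset: 8 raw 28x28 images with uint8 pixels.
x_train = (np.random.rand(8, 28, 28) * 255).astype("uint8")

# Scale pixels to [0, 1] and add the channel axis that CNN layers expect.
x_train = x_train.astype("float32") / 255.0
x_train = x_train.reshape(-1, 28, 28, 1)
```

After this reshaping, each sample has shape (28, 28, 1) and can be fed directly to the encoder's input layer.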
As we have seen, an ANN cannot learn the features of a normal, clean image as well as a CNN can. When we use noisy data in an autoencoder, the network must learn the features of the noisy image even more precisely in order to handle the noise. This means that if we use an ANN in the autoencoder, its performance will not be satisfactory: it will try to generate images, but they will not be as good as those generated with a CNN. So we will build the autoencoder architecture with a CNN. While building this architecture, we will pass an image of size 28 x 28 to our first input layer.
At the input layer we apply a convolution layer: we fix the number of filters, choose the size of the kernel that will slide over the input image, and select an activation function. After the convolution layer, a pooling layer is applied.
We repeat this pattern of a convolution layer (choosing the number of filters and the kernel size) followed by a pooling layer until we reach the bottleneck, where the compressed image is flattened. The information at the bottleneck is the output of the encoder.
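The encoder just described can be sketched in Keras as below. The filter counts (32 and 16), the 3x3 kernels, and the ReLU activation are illustrative choices, not values fixed by the lecture; any similar stack of convolution and pooling layers follows the same pattern.

```python
from tensorflow.keras import layers, models

# Sketch of the encoder: Conv2D extracts features, MaxPooling2D halves the
# spatial size, and the pattern repeats until the flattened bottleneck.
encoder = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),   # 28x28 -> 14x14
    layers.Conv2D(16, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),   # 14x14 -> 7x7
    layers.Flatten(),              # bottleneck: 7 * 7 * 16 = 784 values
])
```

Each pooling step halves the spatial dimensions, so two rounds take the 28 x 28 input down to 7 x 7 before flattening.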
At the decoder, we reshape this flattened representation so it can be passed to a convolution layer. The convolution layer extracts features from the compressed image. At the encoder stage we used max pooling after each convolution layer to reduce the size, so at the decoder stage we must use a layer that increases the size of the image. The layer used for increasing the size of the image is called up-sampling. The process is now repeated, applying an up-sampling layer after each convolution layer, until we reach the output layer, where our image is regenerated.
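Mirroring the encoder, a decoder sketch in Keras might look like this. It assumes a 784-value flattened bottleneck reshaped to 7 x 7 x 16 (matching the illustrative encoder choices above); the final sigmoid layer produces pixel values in [0, 1] for the regenerated image.

```python
from tensorflow.keras import layers, models

# Sketch of the decoder: reshape the bottleneck, then alternate Conv2D and
# UpSampling2D (the reverse of pooling) until the 28x28 image is rebuilt.
decoder = models.Sequential([
    layers.Input(shape=(784,)),    # flattened bottleneck from the encoder
    layers.Reshape((7, 7, 16)),    # back to a spatial feature map
    layers.Conv2D(16, (3, 3), activation="relu", padding="same"),
    layers.UpSampling2D((2, 2)),   # 7x7 -> 14x14
    layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
    layers.UpSampling2D((2, 2)),   # 14x14 -> 28x28
    layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same"),
])
```

Note that `UpSampling2D` simply repeats rows and columns to double the spatial size; it is the size-increasing counterpart of `MaxPooling2D`.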
Now we have familiarized ourselves with the process of encoding and decoding, the types of layers to be added at the encoding and decoding stages, and the sequence in which these layers are added.
After this, we just have to decide which functions we are going to use from the Keras API in the TensorFlow library.
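Putting the pieces together, a complete convolutional autoencoder can be wired end to end with Keras. This is a sketch under assumptions: the filter counts (16 and 8) are illustrative, and the Adam optimizer with binary cross-entropy loss is a common choice for [0, 1] pixel reconstruction, not a setting prescribed by the lecture.

```python
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(28, 28, 1))

# Encoder: convolution + pooling down to the bottleneck.
x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D((2, 2))(x)   # bottleneck: 7x7x8

# Decoder: convolution + up-sampling back to the original size.
x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(x)
x = layers.UpSampling2D((2, 2))(x)
outputs = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)

autoencoder = models.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
```

Training would then call `autoencoder.fit(x_train, x_train, ...)` for a plain autoencoder, or `fit(noisy, clean, ...)` for the denoising variant, since the target is the (clean) image itself.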
The autoencoder is a flexible type of architecture. The special thing about it is that when we use convolutional layers, it becomes a convolutional autoencoder. Similarly, if the input to be processed is a time series and we use recurrent layers, it becomes a recurrent autoencoder. So it is an architecture that can be adapted to our requirements to learn a representation.
The concept of representation comes up again and again in CNNs. In a fully connected neural network, the second-to-last layer creates a representation, and on the basis of this representation a neuron in the last layer classifies the input. So a representation is the language that the computer understands.
The concept of representation can be understood from a daily-life example. Suppose we want to give a child the concept of a car. One way is to give the child a speech about the features of a car: how many tyres it has, how many headlights, steering wheels, and seats there are. In this case, despite all the effort spent describing the car's features, the child would not be able to form an idea of the car in his mind. The other approach is to show the child an image of a car; the child would then immediately understand what a car is.
Similarly, computers learn differently from different representations. Some representations, like the speech about the car above, are very difficult for a computer to understand, and some, like the image of the car, are very easy. In the same way, a computer gets confused about the output it is supposed to produce if a large quantity of raw data is given to it as input. However, when the data is presented in compressed form after dimensionality reduction, the computer understands it and can produce the desired results easily.