AI Free Basic Course | Lecture 26 | AutoEncoders | Live Session
Upwork's press release has been pinned on our social media platforms. Upwork is one of the biggest online freelancing platforms. If you want the opportunity to earn from million-dollar projects, to reinvest those earnings, and to generate and share ideas, Upwork plays a key role in promoting all of this.
When we started this online learning program, we predicted the direction of online earning. We understood what the world is demanding in the field of online earning and what our generation is actually doing. We are still focusing on Canva, WordPress, and the like. Working in these areas is not a bad thing, but working within a limited scope does not mean we should stop pursuing growth, especially when the trends are changing. It is our duty to plan and work hard, and leave the rest to ALLAH to reward that hard work.
If we don't adapt ourselves to the changing demands of the world, we will fail in the end. What if we don't qualify for the in-demand projects and cannot get work? Who is to blame then? Our fate, or ourselves?
According to Upwork's press release, the top 10 generative AI-related searches from companies (January 1 - June 30, 2023) are given below:
- ChatGPT
- BERT
- Stable Diffusion
- TensorFlow
- AI Chatbot
- Generative AI
- Image Processing
- PyTorch
- Natural Language Processing (NLP)
- Bard
At the top is ChatGPT. What is it about ChatGPT that is being demanded? There is a lot to it: content writing, coding, BART, and so on. Wherever BART is used, its link with ChatGPT is essential. BART has different layers, which will be taught in coming lectures.
BART, which stands for Bidirectional
and Auto-Regressive Transformers, is a language
model developed by Facebook. BART is known for its ability to generate
high-quality natural language text, as well as its ability to perform well on a
range of natural language processing tasks.
BERT is second and Stable Diffusion is third. Realizing its importance, we gave Stable Diffusion sufficient time, at least three days. TensorFlow, which was also given due importance in our lectures, is in fourth position.
AI Chatbot, at number 5, is the topic of tomorrow's lecture. Generative AI, at position 6, is the combination of all five discussed above. At position 7 is image processing. Do you remember our pink elephant on the road?
So, we have planned our lessons keeping these AI-related searches from companies in view. We taught neural networks along with Convolutional Neural Networks (ConvNet/CNN). Image processing has also been discussed in the lectures related to Hugging Face. PyTorch is part of our coming lectures. NLP, along with its complete pipeline, has also been discussed in the ChatGPT and Hugging Face lectures.
Bard is at position 10. We started our online lectures with Bard, and additional lectures on the topic were also uploaded.
Moreover, I will go so far as to claim that the queries that will appear in companies' generative AI-related searches in the future are also part of our study program. So there is a need to realize the importance of these concepts and learn them. If you fail to do this, what will happen eventually? Depression and anxiety will creep into our lives because we are not getting projects. We will lose hope and give up in the end. We will then look for cheap shortcuts, and as a result crime grows in society. All of this is the result of not planning, not understanding, not foreseeing, and not managing things at the proper time.
All the discussion so far shows that we are on track and our study plan is to the point. This answers the question constantly asked during the lectures: why have we chosen this course outline? Because we are in the industry and have planned the study program accordingly.
Now we move on to our topic. Today we are discussing autoencoders. An autoencoder is a type of neural network.
Artificial intelligence encompasses a wide range of technologies and techniques that enable computer systems to solve problems like data compression, which is used in computer vision, computer networks, computer architecture, and many other fields. Autoencoders are unsupervised neural networks that use machine learning to do this compression for us.
On a lighter note, the autoencoder is the grandfather of ChatGPT and BERT, as all these models are its descendants. Let us understand how.
The autoencoder structure is built from two parts: one part is called the encoder and the second part is called the decoder. BERT is a collection of encoders: transformer encoders are linked together and representations are learned through them, with many encoders working together. That is why I say BERT is actually a descendant of the autoencoder. Similarly, when we use ChatGPT, there are many decoders working together.
Let us understand this phenomenon with an everyday example. Suppose a special guest arrives at a hotel and a manager is assigned to take care of him. The guest is asked to provide a list of all the food items he needs at breakfast, lunch, and dinner so that arrangements can be made accordingly. What happened here? A large amount of data was given to the manager by the guest so that certain actions could be performed. The detailed instructions were given on the first day.
What happens the next day? The guest simply asks that the same list of food items be followed. The large input has been reduced in this case; however, the output remains the same as on the first day, when detailed instructions were given.
But remember: on day 2, only the person who was already given the detailed input on day 1 can understand the short input.
Similarly, suppose we have a large image. Instead of giving the model this image, which comprises a large number of values (pixels), we encode it into a small representation and give that to the model to generate the same output that would have been produced from the large, unencoded image. This increases the speed of processing and uses less computational power, which in turn increases overall efficiency.
In our childhood, when drawing a sketch by looking at an image, we learned the prominent features of that image and drew something that looked like the original. Now, if we want a computer to perform a similar job, the computer must be given a neural network with compressed input, enabling it to draw the image from the minimal features provided to it. Without a neural network, the computer would not be able to draw that image from the input image, as too much computational power would be required.
There is a minimum amount of information essential to regenerate the image from the input image; below this, the model would not be able to do so.
An autoencoder neural network is an unsupervised machine learning algorithm that
applies backpropagation, setting the target values to be equal to the inputs.
Autoencoders are used to reduce the size of our inputs into a smaller
representation. If anyone needs the original data, they can reconstruct it from
the compressed data.
We have a similar machine learning algorithm, i.e., PCA, which performs the same task. Principal Component Analysis
(PCA) is one of the most commonly used unsupervised machine
learning algorithms across a variety of applications: exploratory data
analysis, dimensionality reduction, information compression, data de-noising,
and plenty more.
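For comparison, here is a minimal PCA sketch, assuming scikit-learn is available; the data matrix X is hypothetical:

# PCA sketch: compress 784 features to 10 components, then reconstruct.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(1000, 784)          # hypothetical data: 1000 samples, 784 features

pca = PCA(n_components=10)
X_compressed = pca.fit_transform(X)    # (1000, 10) compressed representation
X_restored = pca.inverse_transform(X_compressed)  # lossy reconstruction: (1000, 784)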
So, you might be thinking: why do we need autoencoders then? Let's find out the reason behind using autoencoders.
Emergence of Autoencoders:
Autoencoders are preferred over
PCA because:
- An autoencoder can learn non-linear transformations with
a non-linear
activation function and multiple layers.
- It doesn't have to use dense layers. It can use convolutional layers, which are better for video, image, and time-series data.
- It is more efficient to learn several
layers with an autoencoder rather than learn one huge transformation with
PCA.
- An autoencoder provides a representation
of each layer as the output.
- It can make use of pre-trained layers from
another model to apply transfer learning to enhance the encoder/decoder.
The examples of CNN mentioned earlier were very basic applications.
Look at the picture above: you can see image segmentation. Through CNNs we have learned how to classify an image and detect a specific object in it. A step further, we started to get information about each and every pixel. Suppose I am looking for a car in the image; then we can differentiate only those pixels that relate to the car from the whole image. Similarly, we can differentiate only those pixels that relate specifically to the road or a tree. So, what are we doing here? We are segmenting, or separating, multiple objects in an image based on their pixels. All this became possible due to the CNNs that we have learned so far.
What is image segmentation? It is the pixel-level detection and classification of objects in an image, learned through CNNs.
Look at the image above: we can see multiple cars, trucks, and poles. The model is identifying each object according to its identity. It identifies cars as cars and trucks as trucks. It even identifies poles precisely, whether they are traffic-signal poles or electric poles. Autonomous cars are the real-life example.
These cars identify the boundary of the road by segmenting the road's pixels from the whole image. They adjust their lanes by identifying the pixels of road lanes, signal poles, and surrounding vehicles. CNNs have made us capable of doing all these tasks.
Similarly, the model identifies different hand gestures with the help of CNN.
A model can now recognize the faces of more than 5,000 people, something the human eye cannot do.
All this has happened due to the progress AI models are making day by day.
Besides this, there are other fields as well, such as medicine. Before AI, X-rays were examined by doctors, and diseases were diagnosed by reading them. Now the X-ray is passed to a model for prediction of different health-related problems. Similarly, CNNs are being used in many other healthcare applications.
After this, a technology evolved with which we are quite familiar nowadays: generative AI. With it, we generate images and text of our own choice.
Let us think about the inception of the generative AI model. At the start, our aim was to generate an image from a neural network. We gave an image to the neural network model; the model examined and understood that image and in return generated another image, similar to the input image. This was the first step in our journey towards generative AI.
The structure used in the development of this generative AI model is called the autoencoder.
Now let’s have a look at a few Industrial Applications of Autoencoders.
Image Coloring
Autoencoders are used for
converting any black and white picture into a colored image. Depending on what
is in the picture, it is possible to tell what the color should be.
Feature variation
It extracts only the required features
of an image and generates the output by removing any noise or unnecessary
interruption.
Dimensionality Reduction
The reconstructed image is the same as our input but with reduced dimensions. It helps in providing a similar image with a reduced pixel count.
Denoising Image
Let us generalize the idea. You have often seen an old, torn image shared on Facebook with a request to restore it. A Photoshop expert comes forward and rectifies the defects in that old picture with the help of Photoshop tools and techniques.
The above job can be performed easily with an autoencoder. Similarly, a black-and-white image can be changed into a color image with an autoencoder.
Now the question arises: if Photoshop can perform the same defect-rectification tasks, then why AI? Before answering, we must first understand the purpose of AI, which is to reduce dependence on human experts. Everyone can use AI as a tool to perform the same job that only Photoshop experts could do before. So the purpose of AI is to empower you and make you productive. AI is smart enough to detect what to extract from the image and what to discard in order to produce a clean image as output.
In the picture above we have a noisy image; we encoded it, decoded it, and got a clear picture at the end. The model actually regenerated the noisy image as a clear image. The applications of autoencoder denoisers will astonish you: sharpening blurry images, coloring black-and-white images, cleaning noisy images, and countless other applications in computer vision. Remember, we are only discussing computer vision here and have not yet touched its applications in the field of language.
The input seen by the
autoencoder is not the raw input but a stochastically corrupted version. A
denoising autoencoder is thus trained to reconstruct the original
input from the noisy version.
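A minimal sketch of that training setup, assuming TensorFlow/Keras and MNIST images scaled to [0, 1]; the noise level and layer sizes here are illustrative, not prescriptive:

# Denoising autoencoder sketch (assumes TensorFlow/Keras; dataset: MNIST).
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

(x_train, _), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

# Stochastically corrupt the inputs; the clean images remain the targets.
x_noisy = np.clip(x_train + 0.3 * np.random.normal(size=x_train.shape), 0.0, 1.0)
x_noisy = x_noisy.astype("float32")

autoencoder = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(64, activation="relu"),      # encoder
    layers.Dense(784, activation="sigmoid"),  # decoder
])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# Train to reconstruct the original input from the corrupted version.
autoencoder.fit(x_noisy, x_train, epochs=5, batch_size=256)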
Watermark Removal
It is also used for removing
watermarks from images or to remove any object while filming a video or a
movie.
Now that you have an idea of the different industrial applications of autoencoders, let's understand the architecture of autoencoders.
Remember, an autoencoder is different from a pipeline: in a pipeline there is a sequence of commands and actions, while in an autoencoder there is a single command.
ARCHITECTURE OF AUTOENCODERS
An autoencoder is an architecture that can be built from convolutional neural networks, fully connected neural networks, recurrent neural networks, or attention layers. So, we can use this architecture in many applications according to our requirements.
As discussed, Stable Diffusion is actually latent diffusion, and it is a strength of the autoencoder that it learns a latent representation. It is like converting 784 numbers into a 10-number representation. Latent means hidden: when we compress or encode the image, the result is called a latent.
An autoencoder consists of three layers:
- Encoder
- Code
- Decoder
- Encoder: This
part of the network compresses the input into a latent space representation.
The encoder layer encodes the
input image as a compressed representation in a reduced dimension. The
compressed image is a distorted version of the original image.
- Code: This part
of the network represents the compressed input which is fed to the
decoder.
- Decoder: This
layer decodes the
encoded image back to the original dimension. The decoded image is
a lossy reconstruction of the original image and it is
reconstructed from the latent space representation.
Wherever the concept of encoders and decoders is used in computer science, it means we compress the data in a meaningful way and assign a key to it so that nobody can understand it except the holder of that key. When the person who has the key reads that compressed data, the process of reading it is called decoding.
Our aim is to generate images and text. But before generating either, we must have a model that we have trained to perform that job.
We can say that an autoencoder is the type of neural network that reproduces its own input as output. Now the question arises: when the input is already present, why is there a need to predict it as output? Why is the input not passed through directly?
In the last lecture, we took an image from the MNIST data and converted the 28 x 28-pixel data into a single 784-dimensional vector, upon which the model was trained.
An autoencoder has two parts, an encoder and a decoder. In the same case, the encoder part will convert that 784-dimensional vector into just 10 values. The function of the encoder is to compress a larger amount of information into less. Our goal is not to lose any information during compression, so that nothing essential is lost in the process.
The decoder part expands that compressed representation back into a 784-dimensional vector. This means that at the encoder, all the information in the 784 numbers was encoded in a ten-number representation, and that ten-number representation is sufficient to represent those 784 numbers. The decoder recreates the 784 numbers from the ten-number input. So, we recreated the output without losing information. During this process of recreation, we learned a hidden representation that is not only compressed but also preserves the information. So, the larger image is compressed at the encoder stage and then transmitted to the output layer, which recreates the larger image from the compressed one.
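A minimal Keras sketch of this 784 -> 10 -> 784 structure, assuming TensorFlow; in practice such a tiny ten-number code gives only a rough reconstruction, so treat the sizes as the lecture's illustration rather than a recommendation:

# 784 -> 10 -> 784 autoencoder sketch in Keras (assumes TensorFlow).
import tensorflow as tf
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(784,))
code = layers.Dense(10, activation="relu")(inputs)        # encoder: 784 numbers -> 10
outputs = layers.Dense(784, activation="sigmoid")(code)   # decoder: 10 numbers -> 784

autoencoder = models.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

(x_train, _), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

# The target equals the input: the network learns to reproduce its own input.
autoencoder.fit(x_train, x_train, epochs=10, batch_size=256)

The key line is fit(x_train, x_train): the target values are set equal to the inputs, exactly as the definition given earlier states.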
Autoencoders are applied in denoising, image classification, variational autoencoders, and many other deep learning applications.
The task of researching the
applications of the autoencoders was assigned during the lecture.
The layer between the encoder and decoder, i.e., the code, is also known as the Bottleneck. This is a
well-designed approach to decide which aspects of observed data are relevant
information and what aspects can be discarded. It does this by balancing two
criteria:
- Compactness of representation, measured as the
compressibility.
- It retains some behaviourally
relevant variables from the input.
Convolution Autoencoders
Autoencoders in their traditional formulation do not take into account the fact that a signal can be seen as a sum of other signals. Convolutional autoencoders use the convolution operator to exploit this observation. They learn to encode the input as a set of simple signals and then try to reconstruct the input from them, modifying the geometry or the reflectance of the image.
At the first layer, on the left, we ignore the final output and simply transmit the processed input to the next layer. Similarly, at each subsequent layer the output of the previous layer is processed as input and transmitted onward until the encoder side of the model is complete. At the decoder layers, we take care of producing the output, as in a normal neural network.
What does processing the input data at the encoder stage mean? Processing here means extracting the features and reducing the size of the input data.
At the end of the encoder there is a flattened, one-dimensional vector layer of size 1152.
A one-dimensional vector of size 10 follows this layer, which shows that the image has been reduced to its minimum size.
Now this minimal data is passed to the next layer, which is the first layer of the decoding stage. Here we see it working in the opposite way: the FC (fully connected) layer is now the first layer, whereas it was the last in the encoding process.
At the end of the decoder, an output image is generated that is similar to the input image. The network has taken the features, or information, from the input image and generated another image similar to it.
We can summarize it as: from image to vector, and from vector to image.
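A sketch of that image-to-vector-to-image structure, assuming TensorFlow/Keras and 28 x 28 grayscale input; the filter counts are illustrative, so the flattened layer here is smaller than the 1152 figure mentioned above:

# Convolutional autoencoder sketch (assumes TensorFlow/Keras).
import tensorflow as tf
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(28, 28, 1))
# Encoder: extract features while shrinking the spatial size.
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inputs)  # 28x28 -> 14x14
x = layers.Conv2D(8, 3, strides=2, padding="same", activation="relu")(x)        # 14x14 -> 7x7
x = layers.Flatten()(x)                        # image to vector (7*7*8 = 392 values)
code = layers.Dense(10, activation="relu")(x)  # minimal 10-number code

# Decoder: the FC layer comes first, then upsampling back to 28x28.
x = layers.Dense(7 * 7 * 8, activation="relu")(code)
x = layers.Reshape((7, 7, 8))(x)               # vector back to a small feature map
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)          # 7x7 -> 14x14
outputs = layers.Conv2DTranspose(1, 3, strides=2, padding="same", activation="sigmoid")(x)  # 14x14 -> 28x28

autoencoder = models.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")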
Use cases of CAE:
- Image Reconstruction
- Image Colorization
- Latent space clustering
- Generating higher-resolution images
Now that you have an idea of the architecture of an autoencoder, let's continue our autoencoders tutorial and understand the different properties and hyperparameters involved in training autoencoders.
- Data-specific:
Autoencoders are only able to compress data similar to what they have been
trained on.
- Lossy: The
decompressed outputs will be degraded compared to the original inputs.
Hyperparameters of
Autoencoders:
There are 4 hyperparameters
that we need to set before training an autoencoder:
- Code size: It
represents the number of nodes in the middle layer. Smaller size results
in more compression.
- Number of layers: The
autoencoder can consist of as many layers as we want.
- Number of nodes per
layer: The number of nodes per layer decreases with each
subsequent layer of the encoder, and increases back in the decoder. The
decoder is symmetric to the encoder in terms of the layer structure.
- Loss function: We
either use mean squared error or binary cross-entropy. If the
input values are in the range [0, 1] then we typically use cross-entropy,
otherwise, we use the mean squared error.
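Mapping these four choices onto one concrete, illustrative Keras configuration (the values are examples, not recommendations):

# The four hyperparameters made concrete (assumes TensorFlow/Keras).
from tensorflow.keras import layers, models

code_size = 32  # 1. code size: nodes in the middle layer (smaller = more compression)

# 2. number of layers and 3. nodes per layer: 784 -> 128 -> 32 -> 128 -> 784,
#    decreasing into the code and symmetric on the way back out.
autoencoder = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(code_size, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(784, activation="sigmoid"),
])

# 4. loss function: binary cross-entropy, since inputs are assumed scaled to [0, 1].
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")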
Now that you know the properties and hyperparameters involved in training autoencoders, let's move forward and understand the different types of autoencoders and how they differ from each other.
Sparse Autoencoders
Sparse autoencoders offer us an alternative method
for introducing an information bottleneck without requiring a
reduction in the number of nodes at our hidden layers.
Instead, we’ll construct our loss function such that we penalize activations within
a layer.
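One common way to add such a penalty in Keras is an L1 activity regularizer on the hidden layer; this is a sketch, and the coefficient is illustrative:

# Sparse autoencoder sketch: penalize activations instead of shrinking the layer.
from tensorflow.keras import layers, models, regularizers

autoencoder = models.Sequential([
    layers.Input(shape=(784,)),
    # The L1 activity penalty pushes most hidden activations toward zero,
    # creating an information bottleneck without reducing the node count.
    layers.Dense(256, activation="relu",
                 activity_regularizer=regularizers.l1(1e-5)),
    layers.Dense(784, activation="sigmoid"),
])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")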
Deep Autoencoders
The extension of the simple Autoencoder is
the Deep Autoencoder. The first layer of the Deep
Autoencoder is used for first-order features in the raw input. The second layer
is used for second-order features corresponding to patterns in the
appearance of first-order features. Deeper layers of the Deep Autoencoder tend
to learn even higher-order features.
A deep autoencoder is composed of two symmetrical deep-belief networks:
1. First four or five shallow layers representing the encoding half
of the net.
2. The second set of four or five layers that make up the decoding
half.
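A sketch of that symmetric stack, using plain dense layers rather than the deep-belief networks mentioned above; the layer widths are illustrative:

# Deep autoencoder sketch: symmetric encoding and decoding halves (Keras).
from tensorflow.keras import layers, models

autoencoder = models.Sequential([
    layers.Input(shape=(784,)),
    # Encoding half: successive layers capture increasingly higher-order features.
    layers.Dense(512, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(32, activation="relu"),   # code
    # Decoding half mirrors the encoder.
    layers.Dense(128, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(512, activation="relu"),
    layers.Dense(784, activation="sigmoid"),
])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")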
Use cases of Deep Autoencoders
- Image Search
- Data Compression
- Topic Modeling & Information Retrieval (IR)
Contractive Autoencoders
A contractive autoencoder is an
unsupervised deep learning technique that helps a neural
network encode unlabeled training data. This is accomplished by
constructing a loss term which penalizes
large derivatives of our hidden layer activations with
respect to the input training examples, essentially penalizing instances
where a small change in the input leads to a large change in the encoding
space.
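A sketch of such a loss term, assuming TensorFlow and separate encoder and decoder models (hypothetical names); it would be called inside a custom training step, and the penalty weight lam is illustrative:

# Contractive loss sketch: reconstruction error plus a penalty on the
# derivatives of the hidden code with respect to the input (assumes TensorFlow).
import tensorflow as tf

def contractive_loss(encoder, decoder, x, lam=1e-4):
    # x: a batch of inputs as a float tensor of shape (batch, input_dim).
    with tf.GradientTape() as tape:
        tape.watch(x)
        h = encoder(x)                          # hidden code: (batch, code_dim)
    recon = decoder(h)                          # reconstruction of x
    jac = tape.batch_jacobian(h, x)             # dh/dx: (batch, code_dim, input_dim)
    penalty = tf.reduce_sum(tf.square(jac))     # squared Frobenius norm of the Jacobian
    mse = tf.reduce_mean(tf.square(x - recon))  # reconstruction error
    return mse + lam * penalty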
Now that you have an idea of what autoencoders are, their different types, and their properties, let's move ahead with our autoencoders tutorial and understand a simple implementation using TensorFlow in Python.