TOPIC-23 | Multiclass classification using Neural Network | Live Session
AI Free Basic Course | Lecture 23 | Multiclass classification using Neural Network | Live Session
· The mnist module provides the MNIST dataset, a collection of 60,000 28x28 grayscale images of handwritten digits for training, along with a test set of 10,000 images. MNIST is a very good dataset for beginners to practice building neural networks on; many other datasets are available and can also be used. Datasets of this type are used to train neural networks that recognize the digits on bank cheques.
· The Sequential module is used to create a linear stack of layers, which is the most common way to build a neural network in Keras.
· The Dense module creates a fully connected layer, i.e. a layer where each neuron is connected to every neuron in the previous layer.
· The to_categorical function converts a label vector to a one-hot encoded vector, which is required when training with the categorical cross-entropy loss. As in binary classification, the labels have to be converted into a Yes/No (0/1) form before training; the difference here is that instead of two classes we have 10 classes. (A sketch of these imports follows below.)
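For reference, a minimal sketch of the imports these bullets refer to, assuming the tf.keras API (the exact import paths may differ slightly depending on the Keras/TensorFlow version used in the lecture):

# Minimal import sketch (assumed tf.keras paths; adjust if using standalone Keras)
from tensorflow.keras.datasets import mnist          # the MNIST dataset loader
from tensorflow.keras.models import Sequential       # linear stack of layers
from tensorflow.keras.layers import Dense            # fully connected layer
from tensorflow.keras.utils import to_categorical    # one-hot encoding of labels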
# splitting the data into train and test sets
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
Let us start our work with this basic dataset.
· The line (X_train, Y_train), (X_test, Y_test) = mnist.load_data() splits the MNIST dataset into a training set and a test set.
· The mnist.load_data() function downloads the MNIST dataset if it is not already on your computer. It then splits the dataset into a training set and a test set, with 60,000 images in the training set and 10,000 images in the test set.
· The X_train and X_test variables contain the images of the digits, and the Y_train and Y_test variables contain the labels of the digits. The images are stored as 28x28 NumPy arrays, and the labels are stored as integers from 0 to 9.
· We have 10 classes, and the task is to identify which digit each image shows, whether it is a 2, an 8, a 9, etc.
· The mnist.load_data() function takes a single optional argument, path, which specifies where the downloaded dataset file is cached locally. The train/test split itself is fixed: the first 60,000 images always form the training set and the remaining 10,000 the test set. In this case we are not passing any argument, so the default cache location is used. (A short sketch of loading the data and inspecting its shapes follows below.)
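As an illustration (not from the original lecture code), loading the data and checking its shapes and label values might look like this:

# Load MNIST and inspect the arrays (illustrative sketch)
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
print(X_train.shape, Y_train.shape)   # (60000, 28, 28) (60000,)
print(X_test.shape, Y_test.shape)     # (10000, 28, 28) (10000,)
print(Y_train[:10])                   # first ten labels, integers from 0 to 9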
import matplotlib.pyplot as plt

# Number of digits to display
n = 10

# Create a figure to display the images
plt.figure(figsize=(20, 4))

# Loop through the first 'n' images
for i in range(n):
    # Create a subplot within the figure
    ax = plt.subplot(2, n, i + 1)
    # Display the original image
    plt.imshow(X_test[i].reshape(28, 28))
    # Set colormap to grayscale
    plt.gray()
    # Hide x-axis and y-axis labels and ticks
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

# Show the figure with the images
plt.show()

# Close the figure
plt.close()
- Line 1: Imports the matplotlib.pyplot module.
- Line 2: Defines the variable n to store the number of digits to display.
- Line 3: Creates a figure to display the images.
- Line 4: Starts a loop to iterate over the first n images in the X_test array.
- Line 5: Creates a subplot within the figure for the current image.
- Line 6: Displays the original image in the subplot.
- Line 7: Sets the colormap to grayscale.
- Line 8: Hides the x-axis and y-axis labels and ticks.
- Line 9: Ends the loop.
- Line 10: Shows the figure with the images.
- Line 11: Closes the figure.
It was mentioned in the last lecture that an image has a size. The image shown above is 28x28 pixels. If we multiply the height by the width, we get the total number of pixels in the picture. We cannot hand the whole 2-D image to a neuron as it is, so we adopt a strategy: we place the pixels of the whole image in a single one-dimensional row.
Look at the diagram above: we have an image of the digit 1. Each box in the 2-D matrix (having rows and columns) represents one pixel. We are converting the 2-dimensional image into a 1-dimensional row/column array. This process is called flattening the image. (A toy illustration follows below.)
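As a toy illustration (not part of the original code), flattening a small 3x3 array works the same way as flattening a 28x28 image:

import numpy as np

# Toy example: flatten a 3x3 "image" into a 1-D row of 9 values
tiny = np.array([[0, 1, 0],
                 [0, 1, 0],
                 [0, 1, 0]])
print(tiny.reshape(9))        # [0 1 0 0 1 0 0 1 0]
# A 28x28 MNIST image flattens to 28 * 28 = 784 values in exactly the same way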
Before the reshaping, we print the shapes of the original training data and labels using the code:

print("Previous X_train shape: {} \nPrevious Y_train shape:{}".format(X_train.shape, Y_train.shape))
In our training data above, we have 60,000 images of size 28x28 pixels, so X_train has the shape (60000, 28, 28). The target variable Y_train has only 60,000 entries, one label per image, so it is a one-dimensional array of shape (60000,).
The code provided is part of a process where we reshape the training and testing data into a flat format. The images are transformed from a 2-D shape (height and width; colour images would also have a channel dimension) to a 1-D shape (a flattened array of pixel values). This is a common preprocessing step before feeding data into machine learning models such as neural networks.
We are reshaping the training and testing data using the code:

X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
Here,
you're reshaping the X_train and X_test arrays so that each image is represented as a
flattened array of length 784 (28x28 pixels). The number of rows in these
reshaped arrays should match the number of samples you have in the dataset.
After
this reshaping, the data is ready to be fed into a machine learning model that
accepts flattened input features. Make sure to adjust your model architecture
accordingly to handle the flattened input shape.
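As a small aside (my own suggestion, not from the lecture), the hard-coded sample counts can be avoided by deriving them from the arrays themselves, which makes the same reshape work for any dataset size:

# Equivalent reshape without hard-coding 60000 / 10000 (illustrative sketch)
X_train = X_train.reshape(X_train.shape[0], -1)   # -1 lets NumPy infer the 784
X_test = X_test.reshape(X_test.shape[0], -1)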
Now we have flattened the data using the reshape function. The flattened data will be fed into the first layer of the neural network. The input dimension of the first layer must equal the number of values being fed into it; since each image supplies 784 pixel values, the first layer must accept 784 inputs. After flattening the images and feeding them into the first layer of the neural network, there is another challenge: making the digits easy to identify.
Min-Max Scaling
In our dataset we have different digits and each digit has its own shape. We cannot define the curves of a digit to suit our own requirements; everyone writes each digit in their own way. To meet this challenge, the boundaries of the digits in our dataset must be as clear as possible. The pixel values in our images range from 0 to 255, where 0 represents black and 255 represents white.
Some parts of an image are pure black, some are pure white, and some lie between black and white, i.e. in shades of grey. To handle these grey shades consistently we scale the values to the range 0 to 1, which puts every pixel, from the black-dominant greys to the white-dominant greys, on one common scale. (A general note on min-max scaling follows below.)
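As a general note (my own addition, not from the lecture code), min-max scaling maps any value range onto [0, 1]; for 8-bit images it reduces to a simple division:

# General min-max scaling: x_scaled = (x - x_min) / (x_max - x_min)
# For 8-bit grayscale images x_min = 0 and x_max = 255, so this reduces to x / 255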
In the image shown, the pixels with value 0 form the background, the pixels close to 255 form the main stroke of the digit, and the pixels at the edges of the stroke take mid-tone grey values.
We want the edges of the digit strokes to become as clear as possible, and scaling serves this purpose. Before scaling, we convert the values to float32 (decimal) format to improve performance.
# Convert the data type of the images to float32
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
In the provided code, we are converting the data type of the image data in the X_train and X_test arrays to float32. This type conversion is a common preprocessing step in machine learning to ensure that the data is in the appropriate format for numerical calculations.

The astype function is used to change the data type of the arrays. By converting the data type to float32, you're ensuring that the pixel values of the images are represented as floating-point numbers, which is the standard format for numerical computations in many machine learning frameworks.
This type conversion is important because some machine learning algorithms and neural network models require input data to be in a specific data type for accurate computations and training. float32 is a commonly used data type for this purpose as it provides a good balance between precision and memory usage.
# Normalize the pixel values to a range between 0 and 1 (0 is black, 1 is white)
X_train /= 255
X_test /= 255
In the provided code, you
are normalizing the pixel values of the image data in the X_train and X_test arrays to a range between
0 and 1. This normalization step is another important preprocessing technique
often used when working with image data in machine learning.
Normalizing the data helps in preventing issues
related to varying scales of input features. It can also improve the
convergence speed and stability of training processes, especially when using
optimization algorithms like gradient descent.
Keep in mind that normalization is a crucial step,
especially when dealing with neural networks, as it can have a significant
impact on the model's performance and training dynamics.
Processing the Target Variable
As we know, we have 10 classes, and the target variable can be any digit from 0 to 9. We now have to convert the labels into a 0-and-1 form that the model can understand. The challenge is that we now have 10 classes instead of the 2 classes that were easy to encode as 0 and 1. To overcome this challenge, first have a look at the pictorial representation below.
Suppose we have an image of the digit 0. In the row for that image, the first column is set to 1 and the remaining values are 0. So for every image, the target variable holds a vector of 10 values, exactly one of which is 1. The position of that 1 tells us the identity of the digit. For example, if the 1 sits in the column for digit 2, the image is a 2.
To perform this kind of encoding we imported the to_categorical function above. (A tiny illustration follows below.)
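As a tiny illustration (my own example, not from the lecture), one-hot encoding a single label with to_categorical looks like this:

# One-hot encoding a single label (illustrative)
print(to_categorical([2], 10))
# [[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]]  -> the 1 sits at index 2, so the label is the digit 2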
# Number of classes in the dataset
classes = 10

# Convert the labels to one-hot encoded format
Y_train = to_categorical(Y_train, classes)
Y_test = to_categorical(Y_test, classes)

# Print the shapes of the preprocessed training data and labels
print("New X_train shape: {} \nNew Y_train shape:{}".format(X_train.shape, Y_train.shape))
We are converting the labels of the dataset into
one-hot encoded format. One-hot encoding is a common technique used to
represent categorical variables (such as class labels) in a format that is
suitable for machine learning algorithms.
In this code:
classes = 10

We've defined the variable classes to store the number of classes in our dataset (which is 10).
Y_train = to_categorical(Y_train, classes)
Y_test = to_categorical(Y_test, classes)
We simply pass the target variable (Y_train) and the classes variable to the function, and it converts the labels into 0/1 form. This process is called one-hot encoding, in which the single 1 marks the desired digit. We've used the function to_categorical to convert the class labels in both the Y_train and Y_test arrays into one-hot encoded format. One-hot encoding represents each class label as a binary vector where a '1' appears at the index corresponding to the class and '0' in all other positions. This is often used when dealing with multi-class classification problems.
print("New X_train shape:
{} \nNew Y_train shape:{}".format(X_train.shape,
Y_train.shape)) |
Finally, we print the shapes of the pre-processed training data and the one-hot encoded labels to verify the changes.
New X_train shape: (60000, 784)
New Y_train shape: (60000, 10)

New X_train shape: (60000, 784)
We have converted our training data into an array of 60,000 rows, where each row is a flattened image of 784 pixel values.
New Y_train shape:(60000, 10)
It's important to note that the shape of our target variable Y_train, which was previously just (60000,), is now (num_samples, num_classes), where num_samples is the number of training samples and num_classes is the number of classes (10 in our case). The same applies to the shape of Y_test.
One-hot encoding is
commonly used for classification tasks and is essential when training models
like neural networks for multi-class classification, as it helps the model
interpret class labels correctly during training.
This is the required shape of our
data.
Setting up Hyper-parameters
As discussed earlier, hyper-parameters are set before training rather than learned from the dataset. We choose their values based on experience.
# Define the input size for each data sample (e.g., image pixels)
input_size = 784

# Specify the number of data samples to process in each batch
batch_size = 200

# Define the number of neurons in the first hidden layer
hidden1 = 400

# Define the number of neurons in the second hidden layer
hidden2 = 20

# Define the total number of classes/categories in the dataset
classes = 10

# Set the number of complete passes through the dataset during training
epochs = 5
We are defining various parameters and
hyperparameters that will be used for training a neural network model. Here's a
breakdown of each parameter:
1. input_size: This parameter represents the number of features in each data sample. In our context, it corresponds to the number of pixels in a flattened image (28x28 = 784), since we flattened our images earlier.
2. batch_size: This parameter determines the number of data samples that are processed in each iteration during training. It's a key factor in controlling memory usage and training efficiency. In our case, each training iteration processes 200 samples at a time, so one epoch is completed after all 60,000 images have been seen once (60,000 / 200 = 300 iterations).
3. hidden1: This parameter specifies the number of neurons in the first hidden layer of your neural network. The hidden layers are where the actual learning happens, as the model extracts features and representations from the input data.
4. hidden2: This parameter specifies the number of neurons in the second hidden layer. The architecture you're defining has two hidden layers: one with 400 neurons and the second with 20 neurons.
5. classes: This parameter indicates the total number of classes or categories in your dataset. In your case, you're working with a dataset containing 10 classes.
6. epochs: This parameter defines the number of complete passes through the entire training dataset during training. Each epoch consists of multiple iterations (mini-batches) where the model is updated based on the training data. In your case, you're training the model for 5 epochs.
These parameters are critical for configuring our neural network architecture and controlling the training process. The specific values chosen will affect the behaviour of the network and its performance on our task.
Nothing we have done so far builds the architecture of the neural network itself. Until now we have only examined the data, preprocessed it, and set the hyper-parameters.
Now we will define architecture of neural network.
Building the FCN Model
Before we proceed further, be aware that there are many types of neural networks. The network we are discussing here is one of the most fundamental kinds, called a standard or fully connected neural network, as it was among the first to be developed. The architecture shown here dates back to around 1988.
# Create a Sequential model, which allows us to build a neural network layer by layer
model = Sequential()

# Add the first hidden layer with 'hidden1' neurons, using the ReLU activation function
# The 'input_dim' specifies the input size for this layer
model.add(Dense(hidden1, input_dim=input_size, activation='relu'))
# output = relu(dot(W, input) + bias)

# Add the second hidden layer with 'hidden2' neurons, also using the ReLU activation function
model.add(Dense(hidden2, activation='relu'))

# Add the output layer with 'classes' neurons, using the softmax activation function
# Softmax activation ensures that the output values represent probabilities of each class
model.add(Dense(classes, activation='softmax'))

### Compilation ###
# Compile the model by specifying the loss function, optimizer, and evaluation metrics
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='sgd')

# Display a summary of the model architecture, showing the layers and parameter counts
model.summary()
We
are building and compiling a neural network model using the Keras API. Here's a
breakdown of each step:
In this part of the code:
model = Sequential()
We are creating a Sequential model. This type of model allows you to build a neural network layer by layer, sequentially adding one layer after another. A daily-life analogy: think of Sequential as a box in which we stack different layers in order. First the input layer goes in, then the first hidden layer, then the second hidden layer, and finally the output layer. The speciality of this sequential box is that TensorFlow wires the stacked layers together for us; we simply stack all the layers, and after running the compile command the model is configured and ready for the remaining tasks. We have named the Sequential object "model" because, once compiled, it is the model we will fit, train, evaluate, and use to make predictions. In short, TensorFlow gives us a sequential box, we stack neural network layers in it, and compiling turns that box into a trainable neural network.
model.add(Dense(hidden1, input_dim=input_size, activation='relu'))  # output = relu(dot(W, input) + bias)
As discussed above, there are many types of neural networks, and the one we are building is a fully connected network. A fully connected layer added to the model is called a Dense layer: a layer where each neuron is connected to every neuron in the previous layer. The Sequential container is the big box, and each layer is a smaller box inside it into which we place neurons. The type of neuron placed depends on the type of layer: hidden layers use Rectified Linear Unit (ReLU) neurons, while for the output layer we first check whether the problem is regression or classification and choose the neurons accordingly. Here we are adding the first hidden layer using a Dense layer. hidden1 specifies the number of neurons in this layer (400), and the input_dim parameter is set to input_size, which is the number of features in each input sample (784 in our case). The activation function used for this layer is ReLU.
So far we have placed 400 neurons in the first Dense layer of our Sequential container. (A small sketch of what each of these neurons computes follows below.)
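As a small illustration (my own sketch, not from the lecture), this is the computation that the comment output = relu(dot(W, input) + bias) describes, written out with NumPy for a single layer:

import numpy as np

def relu(x):
    # ReLU keeps positive values and sets negative values to zero
    return np.maximum(0, x)

# Illustrative sizes: 784 inputs feeding a layer of 400 neurons
x = np.random.rand(784)            # one flattened, scaled image
W = np.random.randn(400, 784)      # weight matrix of the layer
b = np.zeros(400)                  # one bias per neuron
output = relu(W.dot(x) + b)        # output = relu(dot(W, input) + bias)
print(output.shape)                # (400,)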
model.add(Dense(hidden2, activation='relu'))
We are adding the second hidden layer using another Dense layer. This layer contains hidden2 neurons and also uses the ReLU activation function.
model.add(Dense(classes, activation='softmax'))
We are adding the output layer using a final Dense layer. This layer has classes neurons, which corresponds to the number of classes in your dataset. The activation function used here is the softmax activation function, which converts the output values into probabilities, ensuring that they sum up to 1.
It must be remembered that softmax operates on the whole group of output neurons at once (it turns the vector of outputs into probabilities), while ReLU and Sigmoid are applied to each neuron individually. (A short softmax sketch follows below.)
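As an illustration (my own sketch, not from the lecture), softmax exponentiates each output and divides by the sum, so the results are positive and add up to 1:

import numpy as np

def softmax(z):
    # Subtracting the max is a standard trick for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # example raw outputs of three neurons
print(softmax(scores))               # roughly [0.659 0.242 0.099], which sums to 1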
So far we have placed 400 neurons in the first Dense layer, 20 neurons in the second Dense layer, and one softmax output neuron per class in the output layer of our Sequential container. The model has not yet been configured for training.
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='sgd')
After building the model architecture, we proceed to the compilation step. At this stage the model is given three things: a loss function, an evaluation metric, and an optimizer.
loss='categorical_crossentropy': This parameter helps us determine whether the neural network is working accurately or not; that is the job of the loss function. Categorical cross-entropy is the loss function used for training this model and is appropriate for multi-class classification tasks. It measures the distance between the actual labels and the predictions: the greater the distance, the greater the loss, and vice versa. (A short sketch of this loss follows below.)
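As a small sketch (my own example, not from the lecture), categorical cross-entropy for a single one-hot label reduces to the negative log of the probability the model assigned to the true class:

import numpy as np

# Categorical cross-entropy for one sample: loss = -sum(y_true * log(y_pred))
y_true = np.array([0, 0, 1, 0, 0, 0, 0, 0, 0, 0])        # one-hot label for digit 2
y_pred = np.array([0.05, 0.05, 0.7, 0.02, 0.03,
                   0.05, 0.02, 0.03, 0.03, 0.02])         # softmax output of the model
loss = -np.sum(y_true * np.log(y_pred))
print(loss)   # -log(0.7) ~= 0.357; a confident correct prediction gives a small loss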
metrics=['accuracy']: This specifies the evaluation metric you want to track during training. Here, you're using accuracy.
optimizer='sgd': To minimise the loss we use gradient descent; sgd stands for Stochastic Gradient Descent. Remember that whichever optimizer is used, some form of gradient descent is always at its core, possibly with extra features. This parameter specifies the optimization algorithm to use during training.
Finally, you display a summary of the model's architecture and parameter counts using model.summary(). Your model is now defined, compiled, and ready for training!
The following summary is generated after running the above code:
· There are 400 neurons in the first hidden layer.
· There are 20 neurons in the second hidden layer.
· There are 10 neurons in the output layer.
· There are 322,230 parameters in total. This means 322,230 numbers must be updated for the model to learn; after running the desired number of epochs, the value of the loss should have been reduced to a minimum.
As mentioned earlier, the input dimension of our first layer must equal the number of values in the 1-dimensional flattened image, which is 784. So we have defined it in our code as follows:
model.add(Dense(hidden1, input_dim=input_size, activation='relu'))
# output = relu(dot(W, input) + bias)
This means the input layer receives 784 values, the first hidden layer contains 400 neurons, the second hidden layer contains 20 neurons, and the output layer contains 10 neurons.
Parameter Calculations
Remember that these are parameters, not hyper-parameters. Parameters are continuously updated by gradient descent as the model learns from the data.
output = relu(dot(W, input) + bias)
(400 * 784) + 400 = 314,000, roughly 0.3 million parameters
· 784 values are fed into our input layer.
· 400 neurons form our first hidden layer.
· In a fully connected (Dense) network, every neuron in a layer is connected to every neuron in the previous layer, so each of the 400 neurons in the first hidden layer is connected to each of the 784 inputs. The total number of connections (weights) is found by multiplying 400 * 784.
· 400 bias terms, one per neuron, are added to those weights. These biases make it easier for the model to learn. (A worked calculation for all three layers follows below.)
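To check that these numbers match the 322,230 parameters reported by model.summary(), here is a worked calculation (my own sketch, using the layer sizes defined above):

# Parameters per Dense layer = (inputs * neurons) + neurons (one bias per neuron)
input_size, hidden1, hidden2, classes = 784, 400, 20, 10
layer1 = input_size * hidden1 + hidden1   # 313,600 + 400 = 314,000
layer2 = hidden1 * hidden2 + hidden2      # 8,000 + 20 = 8,020
layer3 = hidden2 * classes + classes      # 200 + 10 = 210
print(layer1 + layer2 + layer3)           # 322,230 -- matches model.summary()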
By comparison, GPT-4 is reported to have about 1,760,000,000,000 (1.76 trillion) parameters.
Training The Model
# Import necessary libraries
from time import time

# Record the current time to measure training time
tic = time()

# Fit the model on the training data
model.fit(X_train, Y_train, batch_size=batch_size, epochs=epochs, verbose=1)

# Record the time after model training
toc = time()

# Calculate and print the time taken for model training
print("Model training took {} secs".format(toc - tic))

# Testing the trained model
### 5. Test
# You can continue your code from here...
In
the provided code, we are fitting the compiled model to the training data,
measuring the time it takes for training, and preparing to test the trained
model on the testing data.
In this part of the code:
1. You import the time function from the time module to measure the training time.
2. You record the current time using tic before starting the training of the model.
3. You use the fit method to train the model on the training data (X_train and Y_train). The parameters used in fit are batch_size (200), epochs (5), and verbose (1). The verbose parameter controls the verbosity level during training. A value of 1 means progress updates will be printed for each epoch.
4. You record the time after the model training is complete using toc.
5. You calculate and print the time taken for model training by subtracting tic from toc.
The remaining part of the code indicates that we are about to test the trained model on the testing data: evaluating its performance on the testing dataset and making predictions with the trained model.
Testing The Model
# Import the necessary libraries
from sklearn.metrics import accuracy_score
import numpy as np
import matplotlib.pyplot as plt

# Predict probabilities for the test set using the trained model
y_pred_probs = model.predict(X_test, verbose=0)
y_pred = np.where(y_pred_probs > 0.5, 1, 0)
In this part of the code:
1. You import the accuracy_score function from sklearn.metrics to compute the accuracy of your model's predictions.
2. You import numpy as np for array manipulation.
3. You import matplotlib.pyplot as plt for visualization (it is used in the plotting step further below).
4. You use the trained model (model) to predict probabilities for the test set (X_test) using the predict method. The verbose parameter is set to 0 to suppress progress updates.
5. You threshold the predicted probabilities to convert them into binary predictions. Here, a threshold of 0.5 is used: if a predicted probability is greater than 0.5, the corresponding entry is set to 1; otherwise it is set to 0.
# Calculate and print the test accuracy using predicted and true labels
test_accuracy = accuracy_score(y_pred, Y_test)
print("\nTest accuracy: {}".format(test_accuracy))

Test accuracy: 0.9089
In this part of the code:
1. You calculate the test accuracy by comparing the predicted labels (y_pred) with the true labels (Y_test) using the accuracy_score function.
2. You print the test accuracy.
This code snippet essentially evaluates the
performance of your trained model on the testing dataset by calculating the
test accuracy. The accuracy score is a common metric used to assess how well a
classification model performs on unseen data. It represents the ratio of
correctly predicted instances to the total number of instances in the test set.
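As a side note (my own suggestion, not part of the lecture code), because the labels are one-hot encoded, the same accuracy can also be computed by taking the argmax of the probabilities instead of thresholding at 0.5; this avoids cases where no class exceeds 0.5:

# Alternative accuracy computation using argmax (illustrative sketch)
pred_digits = np.argmax(y_pred_probs, axis=1)   # predicted digit per test image
true_digits = np.argmax(Y_test, axis=1)         # true digit from the one-hot labels
print("Argmax accuracy:", accuracy_score(true_digits, pred_digits))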
# Define a mask for selecting a range of indices (20 to 49)
mask = range(20, 50)

# Select the first 20 samples from the test set for visualization
X_valid = X_test[0:20]
actual_labels = Y_test[0:20]
In this part of the code:
1. You define a mask variable using the range function. This mask is a sequence of indices from 20 to 49. It could be used to select a specific subset of data points from the test set.
2. You select the first 20 samples from the test set (X_test) and assign them to the variable X_valid. These samples will be used for visualization and prediction.
3. You also select the corresponding true labels for the selected samples from the test set and store them in the actual_labels variable.
# Predict probabilities for the selected validation samples
y_pred_probs_valid = model.predict(X_valid)
y_pred_valid = np.where(y_pred_probs_valid > 0.5, 1, 0)

1/1 [==============================] - 0s 88ms/step
In this part of the code:
1. You use the trained model (model) to predict probabilities for the selected validation samples (X_valid) using the predict method.
2. You threshold the predicted probabilities to obtain binary predictions for the selected validation samples. As before, a threshold of 0.5 is used: if a predicted probability is greater than 0.5, the entry is set to 1; otherwise it is set to 0.
At this point, you have predicted labels for a subset of the test data, and you can proceed to visualize the results or perform any further analysis.
# Set up a figure to display images
n = len(X_valid)
plt.figure(figsize=(20, 4))

for i in range(n):
    # Display the original image
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(X_valid[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Display the predicted digit
    predicted_digit = np.argmax(y_pred_probs_valid[i])
    ax = plt.subplot(2, n, i + 1 + n)
    plt.text(0.5, 0.5, str(predicted_digit), fontsize=12, ha='center', va='center')
    plt.axis('off')

# Show the plotted images
plt.show()

# Close the plot
plt.close()
In
this part of the code, we are setting up a figure to display original images
from the validation subset and their corresponding predicted digits.
In this part of the code:
- You set the variable n to the number of samples in your validation subset.
- You use plt.figure(figsize=(20, 4)) to set up a figure for plotting the images. This specifies the size of the figure.
- You iterate through each sample in the validation subset (X_valid), displaying the original image and the corresponding predicted digit.
- For the original image:
  - You use plt.subplot(2, n, i + 1) to create a subplot for the original image.
  - You display the image using plt.imshow after reshaping it to a 28x28 format.
  - You set the color map to grayscale using plt.gray() to display the image in grayscale.
  - You use ax.get_xaxis().set_visible(False) and ax.get_yaxis().set_visible(False) to hide the axes.
- For the predicted digit:
  - You calculate the predicted digit using np.argmax(y_pred_probs_valid[i]), which gets the index of the maximum predicted probability.
  - You create a subplot for the predicted digit using plt.subplot(2, n, i + 1 + n).
  - You use plt.text to display the predicted digit as text in the center of the subplot. The ha and va parameters are set to 'center' for horizontal and vertical alignment, respectively.
  - You use plt.axis('off') to turn off the axes.
- After plotting all images, you use plt.show() to display the plotted images.
- Finally, you use plt.close() to close the plot.
This code snippet creates a
visualization of the original images along with the predicted digits for each
image. It's a useful way to visually inspect how well your model is performing
on a subset of the validation data.
Suppose an image of the digit 9 is given to the model to predict. The softmax output assigns the highest probability to class 9, so the model predicts 9.