Sri ram_654
still learning...

Making a computer understand images💻

Making a basic image classification model with neural networks

Sri ram_654
·Jul 22, 2022·


What are Images in the digital world Anyway?


A bunch of red, green, and blue lights

These are what we call pixels

So now the questions are:

  • How many pixels do I have on my device?
  • Why use only 3 colors to represent an image?
  • Why do I even have to know this?

Feel free to skip these questions if you already know the answers 🐱‍🏍

Let's go through them one by one

How many pixels do I have on my device? 🐓

It's actually not that complicated to find out

Say your screen's resolution is HD, i.e. 1280×720. Here 1280 is the number of pixels along the longer side of the screen and 720 is the number of pixels along the shorter side

Then multiply the two numbers of the resolution

1280×720=921600 pixels
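If you want to try the same multiplication for other screens, here is a tiny sketch. The function name pixel_count and the extra resolutions are just my picks for illustration:

```python
def pixel_count(width: int, height: int) -> int:
    """Total number of pixels on a screen with the given resolution."""
    return width * height

# A few common resolutions as (width, height)
resolutions = {
    "HD (720p)": (1280, 720),
    "Full HD (1080p)": (1920, 1080),
    "4K UHD": (3840, 2160),
}

for name, (w, h) in resolutions.items():
    print(f"{name}: {w}×{h} = {pixel_count(w, h):,} pixels")
```

The HD line prints 921,600 pixels, the same number we got by hand.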

And wow, first of all, did you see that!! Maths is fun when you know why you are doing it in the first place

And another wow for the number of pixels we have on our devices 📱 (which in this case is close to 1 million pixels)

The important thing to notice is that this is just a 720p display, which nowadays you find on low-budget devices 😂

Why Use only three colors to represent an image? 🔴🟢🔵

In a single sentence:

Red, green, and blue light can be mixed to make most of the colors we can see

By just combining these three colors at different brightness levels, we get most of the colors we use in our day-to-day life

If you want to dive deeper into this topic, see @howstuffwork
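To make the mixing idea concrete, here is a minimal sketch that treats a color as three numbers (red, green, blue), each from 0 to 255. The helper mix is a made-up name, and simply adding the channels is a simplified picture of how lights combine on a screen:

```python
# Each color is (red, green, blue); every channel goes from 0 (off) to 255 (full brightness)
RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)

def mix(*colors):
    """Additively mix lights: add up each channel and cap it at 255."""
    return tuple(min(255, sum(channel)) for channel in zip(*colors))

print(mix(RED, GREEN))        # red + green light gives yellow
print(mix(RED, BLUE))         # red + blue light gives magenta
print(mix(RED, GREEN, BLUE))  # all three together give white
```

Changing the brightness of each channel (instead of only using 0 or 255) is what gives a pixel its many in-between colors.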

Why do you even have to know this? 🐱‍👓🤷‍♂️

To understand the problem we are going to cover in the upcoming sections

And also because it's really interesting to know how the technology we use in our day-to-day life works

The thing to note is:

This is how I approach a problem🐣

"tend to approach things from a physics framework"

you boil things down to the most fundamental truths you can imagine🌌 … and then reason up from there🙌

This line gives you a basic idea of how to learn, which is itself a big topic🤷‍♂️

The key takeaways from all these questions are:

  • Now you know how to calculate the number of pixels on a given screen from just its resolution
  • The basic use of RGB colors in our pixels
Now let's get into the most fun part 🐱‍👤(AI)


Teaching AI to learn patterns from basic images and predict them when new data comes in

We are going to cover a lot of things in this blog, so let's not waste any time and jump right into it 🦘

The first thing to understand: what is a multiclass classification problem?

A really great question ...

What is classification in the first place? In my earlier blog, @what an ai can learn from 1 and 0's, I wrote about binary classification, which is one type of classification (this or that, on or off, smoker or non-smoker) with 2 classes


Building upon that ... the word "multiclass" describes itself: multiple classes to predict on, i.e. more than 2 classes

Now let's go through the workflow for this problem

The things we are going to cover:

  • Loading our images into the environment where we are going to train our model 🔃
  • Splitting our data into two parts, training and testing 🪓
  • Normalizing the data 0️⃣1️⃣
  • Creating our Nemo AI 🤖 (the name of our AI model)
  • Plotting the confusion matrix 📊
  • Predicting an image with our Nemo 🤖

Don't worry if you don't know most of these things 👀. I am 90% sure that you will understand everything mentioned above after going through this blog

Loading the Data


First and foremost, we need data to teach/train our model (which means images for this problem)

We are going to make use of the tensorflow.keras.datasets module

What is TensorFlow? ...

TensorFlow is a free and open-source software library for machine learning and artificial intelligence

Next question ... what is Keras? ...

Keras is a high-level neural network library that runs on top of TensorFlow

From that library we are going to use the datasets module, which gives us a ready-made dataset for our problem

Code :

#Loading the data from tensorflow.keras
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist

What is Fashion-MNIST? ...

Damn, you're firing all the good questions into this blog 🎯 which is a great thing to do 🙌

Fashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28×28 grayscale image with a label from one of 10 classes.

Splitting the data 🪓


Our Nemo AI needs to work with two different sets of data

Training set: used to train Nemo (the model)

Testing set: used to test Nemo (the model)

Code :

(x_train,x_label),(y_train,y_label) = fashion_mnist.load_data()

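Here x_train, x_label are the training images and their labels, and y_train, y_label are the test images and their labels (so, despite its name, y_train holds test images)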

Classes we are going to train our Nemo (AI model) on

The classes are nothing but clothing and shoe categories, e.g. "Coat", "Sandal", "Bag", etc.


code :

class_names = ["T-shirt_or_top","Trouser","Pullover","Dress","Coat","Sandal","Shirt","Sneaker","Bag","Ankle_boot"]
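The labels in the dataset are just numbers from 0 to 9, and the position in class_names tells us what each number means. A small sketch of that lookup, where sample_labels is a made-up list standing in for a few entries of x_label:

```python
class_names = ["T-shirt_or_top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle_boot"]

# Hypothetical labels, standing in for a few entries of x_label
sample_labels = [9, 0, 3, 5]

# The label number is the index into class_names
for label in sample_labels:
    print(label, "->", class_names[label])
```

So a label of 9 means the image is an ankle boot, 0 means a T-shirt or top, and so on.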

Visualizing the images

code :

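import matplotlib.pyplot as plt
import numpy as np

index = 6  # number of images to show
for i in range(index):
  random = np.random.randint(0,1000)  # pick one of the first 1000 training images
  ax = plt.subplot(4,3,i+1)
  # The drawing calls are missing from this copy of the post; imshow and title are the likely ones
  plt.imshow(x_train[random], cmap="gray")
  plt.title(class_names[x_label[random]])
  plt.axis("off")
plt.show()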

With the help of the matplotlib and numpy libraries we can see what the images look like

Screenshot 2022-07-22 193212.png

Here we can see what our images look like

Remember, it's only 6 images out of a training set of 60,000

Our Nemo (model) 🤖 is going to learn from 60,000 images just like these

Let's normalize the data

What is normalization? ... 🐓

Assume that you have two kinds of data, x and y

x is made of ones 1️⃣ and zeros 0️⃣, whereas y ranges from 10,000 to 20,000 ... if we used this data as-is to train our model ...

our model might pick up a bias that the higher the number, the higher the priority. That is not good for our data, which is a bunch of pixel values ranging from 0 to 255

Images are stored in a computer as a matrix of numbers, and these numbers are known as pixel values.

These pixel values represent the intensity of each pixel.

In a grayscale image, 0 represents black and 255 represents white.

article about this

So let's change our range from 0 to 255 into 0 to 1

code :

x_train_norm = x_train/255
y_train_norm = y_train/255
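To see what that division does, here is a minimal sketch on a tiny made-up 3×3 "image" instead of the real dataset. The pixel values are invented for illustration:

```python
import numpy as np

# A made-up 3×3 grayscale "image" with pixel values from 0 to 255
image = np.array([[0, 64, 128],
                  [32, 255, 200],
                  [16, 8, 255]], dtype=np.uint8)

# Dividing by 255 rescales every pixel into the range 0 to 1
image_norm = image / 255

print(image_norm.min(), image_norm.max())  # smallest and largest pixel after scaling
print(image_norm.round(2))
```

Black stays 0, white becomes 1, and every gray lands somewhere in between, so no pixel looks "more important" just because its number is big.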

Creating our Nemo(AI) 🤖


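Nemo = tf.keras.Sequential([tf.keras.layers.Flatten(input_shape=(28,28)),
                            # ... the remaining layers and the compile step are missing from this copy of the post
                            ])

# The start of this call was cut off; it was presumably Nemo.fit on the normalized training images
history = Nemo.fit(x_train_norm,x_label,epochs=20,validation_split=0.2)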

With the help of the tensorflow.keras Sequential API, we can create a dense neural network model

The above code👨‍💻 is literally the brain of our AI 🤖

It is going to learn the patterns across all the images and use them to predict unseen images from the same classes
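If you want something you can run end to end, here is a rough sketch of what a small dense model for 28×28 images could look like. This is not necessarily Nemo's real architecture: the hidden layer size, the activations, the optimizer, and the loss are my assumptions, and the random fake_images and fake_labels only stand in for the real data so nothing has to be downloaded:

```python
import numpy as np
import tensorflow as tf

# Assumed architecture: the hidden layer size (64) and the activations are illustrative guesses
sketch_model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),                   # one 28×28 grayscale image
    tf.keras.layers.Flatten(),                        # 28×28 grid -> 784 numbers in a row
    tf.keras.layers.Dense(64, activation="relu"),     # hidden layer (size is an assumption)
    tf.keras.layers.Dense(10, activation="softmax"),  # one probability per class
])

# Integer labels (0 to 9) pair with sparse categorical cross-entropy
sketch_model.compile(optimizer="adam",
                     loss="sparse_categorical_crossentropy",
                     metrics=["accuracy"])

# Random stand-in data so the sketch runs without downloading anything
rng = np.random.default_rng(0)
fake_images = rng.random((32, 28, 28)).astype("float32")
fake_labels = rng.integers(0, 10, size=32)

history = sketch_model.fit(fake_images, fake_labels, epochs=1, verbose=0)
probabilities = sketch_model.predict(fake_images[:4], verbose=0)
print(probabilities.shape)  # 4 images, 10 class probabilities each
```

With the real data you would pass the normalized training images and their labels to fit instead of the random arrays.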

Screenshot 2022-07-22 200304.png

After training, our Nemo model reached 89% accuracy on the training images

Our model's confusion matrix

Screenshot 2022-07-22 200328.png

This is what we call the confusion matrix ... it shows which classes my model confused with each other

Screenshot 2022-07-22 201103.png

For example, we can see our Nemo mostly confuses shirts and t-shirts

Here the y-axis is the actual class and the x-axis is the predicted class

The highlighted diagonal shows the cases where the class Nemo predicted is the actual class

Which is the case most of the time, for all the classes
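Here is a minimal sketch of how such a matrix is counted, using made-up labels for only 3 classes instead of Nemo's real predictions. The function confusion_matrix below is my own small helper, not the plotting code behind the screenshots:

```python
def confusion_matrix(actual, predicted, num_classes):
    """Rows are actual classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

# Made-up labels for 3 classes: 0 = T-shirt, 1 = Shirt, 2 = Bag
actual    = [0, 0, 0, 1, 1, 1, 2, 2]
predicted = [0, 1, 0, 1, 0, 1, 2, 2]

matrix = confusion_matrix(actual, predicted, num_classes=3)
for row in matrix:
    print(row)

# Correct predictions sit on the diagonal
correct = sum(matrix[i][i] for i in range(3))
print("accuracy:", correct / len(actual))
```

In this toy example the T-shirt and Shirt rows each leak one image into the other's column, which is the same kind of confusion we saw above, while the Bag row is all on the diagonal.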

Predicting with the Nemo model

Screenshot 2022-07-22 201723.png

As you can see, our model is correct with a 100% confidence level, except for the one in the bottom-left corner ...

Our model's accuracy is 89%, which means that roughly 9 times out of 10 it predicts the correct class for an image

Which is great for our fun project🐿
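Where does a "confidence level" come from? A classifier like this typically outputs one probability per class, and the prediction is simply the class with the highest probability. A small sketch, assuming a made-up output for one image (the numbers and the helper top_prediction are invented for illustration):

```python
class_names = ["T-shirt_or_top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle_boot"]

def top_prediction(probabilities):
    """Return (class name, confidence) for the most likely class."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]

# Made-up model output for one image: one probability per class
probabilities = [0.02, 0.0, 0.01, 0.0, 0.03, 0.0, 0.04, 0.1, 0.0, 0.8]

name, confidence = top_prediction(probabilities)
print(f"{name} ({confidence:.0%} confidence)")
```

Here the model would say "Ankle_boot" and be fairly sure about it, with a little doubt left over for "Sneaker".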

To summarize :

We just created an AI model (Nemo🤖) which can tell whether an image shows a dress, a shoe, a handbag, and so on...

In other words, it can now classify images into multiple categories

The machine learning field is a wonderful thing 🔥 ...

We taught the AI to look at an image and put it into the category it belongs to...

Which was not possible 50 years ago 🤯.

Just think about that ... 🐓

We covered so many things in this blog all at once. If you couldn't understand everything 😵... don't worry.

I may explain this in even more depth in the future

Thanks for reading :)

Bye ☜(⌒▽⌒)☞
