Making a computer understand cartoons
A model that can classify Tom and Jerry from Warner Bros. Animation
Who Is This For?
Let's say you are bored and want something fun to do in your free time...
Then you are in the right place.
As the title says, we are going to build an AI model that can tell whether it's looking at Tom or Jerry
~ Like This :
Let's jump right in
📝 if you want to follow along with the code just click me (づ ̄ 3 ̄)づ
Loading Data
To teach a computer who Tom and Jerry are, we first need a bunch of pictures of them
Now, the easiest place to get a lot of pictures is Google, but collecting them from scratch is going to be a little tedious...
So here comes our angel
Changing the image data into numbers
Let's change our images into numbers so our AI model can learn their patterns
If you don't know what I'm talking about... no problemo, just click me (〃` 3′〃)
Code :
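import tensorflow as tf

# `dir` is the path to the image folder (one sub-folder per character).
# Same validation_split and seed in both calls, so the subsets never overlap.
train_data = tf.keras.utils.image_dataset_from_directory(
    dir, color_mode="rgb", image_size=(224, 224),
    validation_split=0.2, seed=42, subset="training")
valid_data = tf.keras.utils.image_dataset_from_directory(
    dir, color_mode="rgb", image_size=(224, 224),
    validation_split=0.2, seed=42, subset="validation")

# The later code expects these names: train_data, valid_data, class_names.
class_names = train_data.class_names  # folder names, used as labels
train_data, valid_data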
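To make "images into numbers" concrete, here is a tiny sketch in plain Python. The 2x2 picture and its pixel values are made up for illustration; a real photo resized to 224x224 follows the same layout, just with many more pixels.

```python
# A made-up 2x2 RGB "image": rows -> pixels -> [red, green, blue], each 0-255.
tiny_image = [
    [[255, 0, 0], [0, 255, 0]],      # red pixel, green pixel
    [[0, 0, 255], [128, 128, 128]],  # blue pixel, grey pixel
]

def image_shape(image):
    """Return (height, width, channels) of a nested-list image."""
    return (len(image), len(image[0]), len(image[0][0]))

def count_values(height, width, channels=3):
    """How many numbers describe one image of the given size."""
    return height * width * channels

print(image_shape(tiny_image))   # (2, 2, 3)
print(count_values(224, 224))    # 150528
```

So every picture the model sees is really a block of about 150 thousand numbers.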
Visualize Some Images
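A minimal sketch of how a few training images could be plotted, assuming matplotlib and NumPy are installed. The helper name `show_batch` is made up for this post; it expects one batch of images and integer labels, like the batches a Keras image dataset yields.

```python
import matplotlib.pyplot as plt
import numpy as np

def show_batch(images, labels, class_names, n=9):
    """Plot the first n images of a batch in a grid, titled with their class names."""
    images = np.asarray(images)
    labels = np.asarray(labels)
    n = min(n, len(images))
    cols = 3
    rows = (n + cols - 1) // cols
    fig = plt.figure(figsize=(3 * cols, 3 * rows))
    for i in range(n):
        ax = fig.add_subplot(rows, cols, i + 1)
        ax.imshow(images[i].astype("uint8"))  # pixels arrive as floats in 0-255
        ax.set_title(class_names[int(labels[i])])
        ax.axis("off")
    return fig
```

With a dataset loaded by `image_dataset_from_directory`, taking one batch, passing its images and labels to `show_batch` together with the class names, and then calling `plt.show()` should display a small grid of labelled pictures.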
Let's Create an AI model
Code :
# A small CNN trained from scratch: two conv/pool stages, then a dense classifier.
base_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(10, 5, activation="relu", input_shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(10, 5, activation="relu"),
    tf.keras.layers.MaxPool2D(pool_size=2, padding="valid"),
    tf.keras.layers.Conv2D(10, kernel_size=5, activation="relu"),
    tf.keras.layers.Conv2D(10, kernel_size=5, activation="relu"),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(100, activation="relu"),
    # One output probability per class
    tf.keras.layers.Dense(len(class_names), activation="softmax"),
])

# Labels are integer class indices, hence the sparse loss.
base_model.compile(loss="sparse_categorical_crossentropy",
                   optimizer=tf.keras.optimizers.Adam(),
                   metrics=["accuracy"])

base_model_history = base_model.fit(train_data, epochs=3, steps_per_epoch=len(train_data),
                                    validation_data=valid_data, validation_steps=len(valid_data))
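To get a feel for how big this base model is, the sketch below redoes the layer arithmetic by hand. It assumes the Keras defaults used above (stride 1 and "valid" padding for `Conv2D`, pool size 2 for `MaxPool2D`) and two classes; the helper names are made up for illustration.

```python
def conv_out(size, kernel=5):
    """Output width/height of a stride-1, 'valid'-padded convolution."""
    return size - kernel + 1

def conv_params(in_channels, filters=10, kernel=5):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

size = 224
size = conv_out(conv_out(size))   # two convs: 224 -> 220 -> 216
size //= 2                        # max pool: 108
size = conv_out(conv_out(size))   # two convs: 108 -> 104 -> 100
size //= 2                        # max pool: 50

flat = size * size * 10           # Flatten: 50 * 50 * 10 values
num_classes = 2                   # assumption: one class for Tom, one for Jerry

total = (conv_params(3) + 3 * conv_params(10)
         + dense_params(flat, 100) + dense_params(100, num_classes))
print(size, flat, total)          # 50 25000 2508592
```

Almost all of the roughly 2.5 million weights sit in the first `Dense` layer, right after `Flatten`.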
This base model has only ever learned from our images, so it starts with no idea of what a face might look like
To solve this problem we are going to use a pre-trained model which, as the name suggests, already knows what the parts of a face look like: eyes, ears, nose and so on...
How can we use one? With the help of Transfer Learning
Now, if you have no idea what that is, click me (@^0^@)/
Final Model
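The code in this section passes the input through `data_aug`, which is not shown in the post. Here is a minimal sketch of what such a pipeline could look like, assuming a recent TensorFlow 2 release; the choice of layers and their strengths is a guess, not the original notebook's setup.

```python
import tensorflow as tf

# Hypothetical augmentation pipeline: random flips, small rotations and zooms.
# These layers only alter images during training; at inference they pass images through unchanged.
data_aug = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
], name="data_aug")
```

Augmentation shows the model slightly different versions of each picture every epoch, which usually helps when the dataset is small.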
Code :
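# EfficientNetB0 without its classification head, frozen so its weights stay fixed.
# Note: this reuses the name base_model from the small CNN above.
base_model = tf.keras.applications.EfficientNetB0(include_top=False)
base_model.trainable = False

# `data_aug` is the data-augmentation step; it must be defined before this point.
inputs = tf.keras.layers.Input(shape=(224, 224, 3))
x = data_aug(inputs)
x = base_model(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(len(class_names), activation="softmax")(x)
model_1 = tf.keras.Model(inputs, outputs)
model_1.summary()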
Output :
Now, as you can see, you need to focus on two main things:
The number of layers
And how many of them are trainable versus non-trainable
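If reading these numbers off the summary is awkward, a few lines of Python can count them instead. This is a small sketch with a made-up helper name; it only relies on the `layers` list and the `trainable` flag that Keras models and layers expose.

```python
def count_layers(model):
    """Return (total, trainable, frozen) layer counts for a Keras model."""
    total = len(model.layers)
    trainable = sum(1 for layer in model.layers if layer.trainable)
    return total, trainable, total - trainable
```

Calling `count_layers(base_model)` right after freezing should report zero trainable layers, and calling it again after unfreezing shows how many were switched back on.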
Now let's unfreeze some of the pre-trained layers so they can be fine-tuned on our own images
Code :
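# Unfreeze the EfficientNet base, then re-freeze all but its last 10 layers.
# With data_aug in the model, model_1.layers[1] is the augmentation step, so set the flag on base_model directly.
base_model.trainable = True
for layer in base_model.layers[:-10]:
    layer.trainable = False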
Now let's train our model:
Code :
# Recompile so the new trainable settings take effect.
model_1.compile(loss="sparse_categorical_crossentropy",
                optimizer="adam",
                metrics=["accuracy"])
model_1_history = model_1.fit(train_data, epochs=5, steps_per_epoch=len(train_data),
                              validation_data=valid_data, validation_steps=len(valid_data))
Prediction (final round)
Let's check our final model predictions
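The model's output for one image is a list of probabilities, one per class. Here is a minimal sketch of turning that into a readable answer; the helper name, the class order and the example probabilities are all made up for illustration.

```python
def decode_prediction(probs, class_names):
    """Pick the most likely class and return (name, probability)."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return class_names[best], probs[best]

# Hypothetical softmax output for a single image, with a hypothetical class order.
name, confidence = decode_prediction([0.08, 0.92], ["jerry", "tom"])
print(f"{name} ({confidence:.0%})")   # tom (92%)
```

In practice `probs` would be one row of what `model_1.predict(...)` returns, and `class_names` the list taken from the training dataset.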
Conclusion🔥
As we can see, our AI model is really 🤯 good at finding out whether a given photo contains Tom or Jerry or even both
And that took just a small amount of work🐱🐉 ... just imagine programming this whole thing the traditional way ...
it would take a really long long long... time📝😑
But with the help of AI, we don't really have to do the hard work...🥳 it's mostly taken care of by our wonderful computers💻
Well, anyway, thank you for your time ☜(⌒▽⌒)☞
All the Code: Click me
Linkedin: ◑﹏◐