We all know that smoking is bad for our health 🚭, right? But how bad is it? And can we tell who is a smoker and who is not just by looking at data about the person's body?

Smoking Problem

Problems With Smoking

Let's take a look at 5 significant problems it can cause to our bodies 🚬:

coronary heart disease (the arteries of the heart cannot deliver enough oxygen-rich blood to the heart)
heart attack
stroke
peripheral vascular disease (damaged blood vessels)
cerebrovascular disease (damaged arteries that supply blood to your brain)

And More...

So yeah, as we can see, it affects some of the most important parts of our body, like 💓🧠.

It would be a big deal if we could help solve this issue in some way, or at least quickly identify who is smoking and who is not just from their body measurements.

Now how can we do that?

AI intro 👨‍💻

photo by yuyeung

Why do we need to use computers to solve this problem... and why not humans?

You see, computers are faster than humans, so why not make use of that?

But the catch is that we are teaching the computer to learn the pattern by itself...

It's really weird when you hear this for the first time. How can you teach a computer to learn by itself? Right?

But as it turns out, you can teach a computer to learn an awful lot of things... as long as you have examples to show it!

Then the next question will be HOW?

In one word

By using the Data

Now what is data?... Great question.

It's nothing but a collection of information, e.g.:

Online Tracking (GPS collecting data from you)
Social Media Monitoring

And so on

Well let's say you have data like this:

Here there are two dotted lines (yellow and purple)

What if I tell you to divide this data into two categories ...

Well, it's easy. As you can see, we can just do this:

And there you go, you just did a binary classification.

But the slight difference is that you did it with your brain.

But how can we teach a computer to do this? And what is binary classification?

Wow, I think a lot of questions are firing in your brain right now...

That's a good thing... Let's jump right in... 🦘

Binary Classification 0️⃣1️⃣

Have you ever chosen one thing over another?

That's what a binary decision looks like, and also feels like.

It's nothing but 1s and 0s: this or that, on or off, non-smoker or smoker.

In the machine learning field it's called binary classification: sorting things into two categories.
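To make the "1s and 0s" idea concrete, here is a minimal sketch of a binary classifier where the rule is written by hand. The points and the dividing line `y = x` are made up for illustration and are not from the data in this post. A trained model does the same job, except it learns the dividing rule from examples instead of being handed it.

```python
# Hypothetical toy data: (x, y) points, not taken from this post's dataset.
points = [(1.0, 4.0), (2.0, 5.5), (4.0, 1.0), (5.0, 2.5)]

def classify(point):
    """Return 1 if the point lies above the line y = x, otherwise 0."""
    x, y = point
    return 1 if y > x else 0

# Every point ends up in exactly one of two categories.
labels = [classify(p) for p in points]
print(labels)  # [1, 1, 0, 0]
```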

What would it look like if I taught a machine to solve this problem?

model predicted the classification.png

And Another Example:

Screenshot 2022-07-18 174130.png

Model Prediction:

Screenshot 2022-07-18 174151.png

As you can see, my model can classify this data easily with the help of a neural network.

Code:

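import tensorflow as tf

# x and y are assumed to be defined already: x holds the input features, y the 0/1 labels.
# Two hidden layers of 10 units each, then one sigmoid unit that outputs a probability.
model_1 = tf.keras.Sequential([tf.keras.layers.Dense(10, activation="relu"),
                               tf.keras.layers.Dense(10, activation="relu"),
                               tf.keras.layers.Dense(1, activation="sigmoid")])

model_1.compile(loss="binary_crossentropy",
                optimizer="adam",
                metrics=["accuracy"])

# Train for 50 epochs, holding out 20% of the data for validation.
history_1 = model_1.fit(x, y, epochs=50, validation_split=0.2)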
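If you are wondering what the last layer's sigmoid and the binary cross-entropy loss are doing, here is a rough sketch in plain Python. It is not how Keras implements them internally, and the raw scores and labels below are made up for illustration. The sigmoid turns any number into a probability between 0 and 1, and the loss gets bigger the more confidently wrong a prediction is.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Average binary cross-entropy over a batch of predictions."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical raw outputs of the last layer before the sigmoid is applied.
raw_scores = [2.0, -1.0, 0.0]
true_labels = [1, 0, 1]

probs = [sigmoid(z) for z in raw_scores]
# Probabilities of 0.5 or more are read as class 1, the rest as class 0.
predictions = [1 if p >= 0.5 else 0 for p in probs]
loss = binary_cross_entropy(true_labels, probs)
print(probs, predictions, loss)
```

Training is then just the optimizer nudging the weights so that this loss goes down over the epochs.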

As an image:

Screenshot 2022-07-18 175022.png

It may look complicated if you are seeing this for the first time.

Trust me, it gets easier once you understand the fundamentals of the code.

As Elon Musk put it:

"I tend to approach things from a physics framework ... you boil things down to the most fundamental truths … and then reason up from there."

If you want to learn more about the code and what's happening behind the scenes... just let me know 🐱‍🚀

Now let's solve our problem:

We need to find out whether a person smokes or not by using binary classification.

Let's Find the Smoker 🐱‍👤

The features (the data we are going to take into consideration) that we are going to use are:

Screenshot 2022-07-18 180901.png

Most of these we already know... but the less familiar features are:

HDL: a type of cholesterol (often called "good" cholesterol)
LDL: a type of cholesterol (often called "bad" cholesterol)
Triglycerides: the main constituents of body fat in humans
serum creatinine: the amount of creatinine in your blood, which should be relatively stable. An increased level may be a sign of poor kidney function
AST: aspartate aminotransferase (also known as glutamic oxaloacetic transaminase), an enzyme found in the liver and other tissues
ALT: alanine aminotransferase (also known as glutamic pyruvic transaminase), an enzyme found mainly in the liver
GTP: in health checkup data like this, most likely γ-GTP (gamma-glutamyl transferase), a liver enzyme, rather than guanosine triphosphate
oral: oral examination status

Now we have some reasonably useful features for finding out whether a person smokes or not.

Let's build a model and name him Robin

Code :

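from sklearn.ensemble import RandomForestClassifier

# x_train_pre and y_train are assumed to be the prepared training features and labels.
Robin = RandomForestClassifier(n_estimators=2000)
Robin.fit(x_train_pre, y_train)
Robin.score(x_train_pre, y_train)  # accuracy on the training data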

Output : 1.0

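And there we go, that's it, that's our Robin model.

And he has an accuracy of 1.0 (which is 100%) on our training data. That sounds perfect, but a random forest can more or less memorize the data it was trained on, so this number alone doesn't tell us much.

To get a real picture of our model's accuracy, let's check it on data that it has never seen before.

Code: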

Robin.score(x_test_pre, y_test)

Output: 0.8286201633898914

It correctly identified whether a person is a smoker or non-smoker about 8 times out of 10 (roughly 83%).
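The score here is plain accuracy: the number of correct predictions divided by the total number of predictions, so a score of about 0.83 means roughly 8 correct answers out of every 10. Here is a tiny sketch of that arithmetic. The labels and predictions are made up for illustration, they are not the model's real output.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical labels: 1 = smoker, 0 = non-smoker.
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 1, 0]

# 8 of the 10 predictions match, so the accuracy is 0.8.
print(accuracy(y_true, y_pred))  # 0.8
```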

Which is really good (not great but good).

And let's see which features our model relied on during its training process:

Screenshot 2022-07-18 184401.png

As you can see, it used most of our features to make its predictions.
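If you want to see how a chart like that can be produced, here is a minimal sketch using the `feature_importances_` attribute of scikit-learn's random forest. It runs on synthetic stand-in data with made-up feature names, and uses fewer trees so it finishes quickly. It is not the exact code behind the screenshot above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 500 rows, 6 made-up features (not the smoking dataset).
X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           random_state=42)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# Keep 20% of the rows aside so the model is scored on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

print("train accuracy:", forest.score(X_train, y_train))
print("test accuracy:", forest.score(X_test, y_test))

# feature_importances_ holds one score per feature; the scores sum to 1.
ranked = sorted(zip(feature_names, forest.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

A higher score means the forest leaned on that feature more when splitting the data, which is a handy sanity check on what the model is paying attention to.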

This is what a binary classification model looks like in a nutshell.

I say "in a nutshell" because I skipped over a lot of the steps that go on behind this.

Let me know if you want to know more about the coding part and what is happening in the background

Bye TakeCare (☞ﾟヮﾟ)☞

The above code is available on my GitHub: @justclickhere 👆

My LinkedIn profile: @justclickhere 👆

