[README Enhancement]: Advertisement Click Prediction #628

Merged: 1 commit, Jun 5, 2024
127 changes: 80 additions & 47 deletions Advertisement Click Prediction/Model/README.md
@@ -1,48 +1,81 @@
### Project Title
Advertisement Click Prediction
### Aim
Predict the clicks on the advertisement depending on different attributes and user inputs of the dataset.
### Dataset
https://www.kaggle.com/jahnveenarang/cvdcvd-vd
### Approach
Initially, Exploratory Data Analysis and Data Visualization are performed on the dataset. Then, by applying various algorithms to the dataset, we predict whether the user will click on the advertisement or not. Finally, the accuracies of all algorithms are compared to find the best-fitting model.
### Steps Involved
- All the necessary libraries are imported
- Performing EDA on the data to understand it
- Data Visualization to visualize the data and get meaningful insights
- Correlations between all features are computed to understand the relationships among them
- Categorical features are converted into numerical features using feature mapping
- The dataset is split into training and test data and scaled
- Model Building: we use four algorithms to build the models
  - XGBoost Classifier
  - Random Forest Classifier
  - Gradient Boosting
  - Multi Layer Perceptron
- After fitting these models, we analyze the confusion matrix and compare the accuracies of all algorithms.
### Data Visualization and Correlation

<img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/gender.png">
<img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/purchased.png">
<img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/age-purchased.png">
<img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/salary-purchased.png">
<img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/purchased-gender.png">
<img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/box-purchased-salary.png">
<img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/box-purchased-age.png">
<img src='https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/correlation.png'>

### Accuracies
- XGBoost Classifier - 92%
- RandomForest - 90%
- Gradient Boosting - 90%
- Multi-Layer Perceptron - 87%
<img src = 'https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/accuracy.png'>

### Language Used - Python
### Libraries Used - pandas, seaborn, numpy, matplotlib
### Conclusion
Among all the models, the XGBoost Classifier gave the highest accuracy, almost 92%, making it the best-fitting model.
<hr>

Code contributed by SNEGA S
## **Advertisement Click Prediction**

### 🎯 **Goal**
The objective is to predict whether a user will click on an advertisement based on various attributes and user inputs from the dataset. By analyzing these features, the aim is to develop a model that accurately forecasts user behavior in response to advertisements.

### 🧵 **Dataset**
Link for the dataset used in the project: [`https://www.kaggle.com/jahnveenarang/cvdcvd-vd`](https://www.kaggle.com/jahnveenarang/cvdcvd-vd)

### 🧾 **Description**
We start with *Exploratory Data Analysis (EDA)* and *Data Visualization* to gain insights from the dataset. Then, we apply various machine learning algorithms to predict whether a user will click on an advertisement. Finally, we compare the accuracies of these algorithms to identify the best-performing model.

### 🧮 **What I have done!**
- Imported essential libraries for data manipulation and machine learning.
- Conducted Exploratory Data Analysis (EDA) to comprehend the dataset.
- Visualized data to extract meaningful patterns and insights.
- Assessed feature correlations to understand interdependencies.
- Converted categorical features into numerical formats via feature mapping.
- Split the dataset into training and testing sets and applied scaling techniques.
- Implemented and trained four machine learning models: **XGBoost**, **Random Forest**, **Gradient Boosting**, and **Multi-Layer Perceptron**.
- Evaluated the models using confusion matrices and compared their accuracies to determine the best-performing model.
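
The preprocessing steps above (feature mapping, train/test split, scaling) can be sketched as follows. This is a minimal illustration, not the project's notebook code: the synthetic data, the column names `Gender`, `Age`, `EstimatedSalary`, and `Purchased` (guessed from the image file names), the 75/25 split, and the use of scikit-learn's `train_test_split` and `StandardScaler` are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset; all column names are assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Gender": rng.choice(["Male", "Female"], size=n),
    "Age": rng.integers(18, 60, size=n),
    "EstimatedSalary": rng.integers(15_000, 150_000, size=n),
})
# Made-up target so the example has something to predict.
df["Purchased"] = ((df["Age"] > 40) | (df["EstimatedSalary"] > 100_000)).astype(int)

# Feature mapping: replace each category with an integer code.
df["Gender"] = df["Gender"].map({"Male": 0, "Female": 1})

X = df[["Gender", "Age", "EstimatedSalary"]]
y = df["Purchased"]

# Hold out 25% of the rows for testing, keeping the class ratio similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Fit the scaler on the training split only, then reuse it on the test split
# so that no test-set statistics leak into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```

Fitting the scaler on the training split alone is a deliberate choice here: scaling before the split would let the test rows influence the mean and variance used during training.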

### 🚀 **Models Implemented**
Model Building: We implemented the following algorithms for their distinct advantages in handling various aspects of the dataset:

- XGBoost Classifier: Known for its high performance and efficiency in handling large datasets with complex patterns.
- Random Forest Classifier: Effective in reducing overfitting and providing reliable feature importance insights.
- Gradient Boosting: Powerful for capturing intricate data relationships and improving accuracy through boosting techniques.
- Multi-Layer Perceptron: A feed-forward neural network that captures non-linear relationships through its hidden layers and non-linear activation functions.
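
A rough sketch of how these models could be trained and compared on accuracy is shown below. It assumes scikit-learn for Random Forest, Gradient Boosting, and the Multi-Layer Perceptron, and adds XGBoost only if the `xgboost` package is installed. The data comes from `make_classification`, not from the project's dataset, and the hyperparameters are library defaults apart from `max_iter`, so the printed accuracies will not match the figures reported in this README.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data standing in for the real dataset.
X, y = make_classification(
    n_samples=400, n_features=3, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale features; tree models do not need this, but the MLP benefits from it.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Multi-Layer Perceptron": MLPClassifier(max_iter=1000, random_state=0),
}

# XGBoost is an optional extra dependency, so it is added only when available.
try:
    from xgboost import XGBClassifier
    models["XGBoost"] = XGBClassifier(random_state=0)
except ImportError:
    pass

# Fit each model on the training split and score it on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

best = max(accuracies, key=accuracies.get)
for name, acc in sorted(accuracies.items(), key=lambda item: -item[1]):
    print(f"{name}: {acc:.2%}")
print("Best model:", best)
```

All four classifiers share the same `fit`/`predict` interface, which is why a single loop over a dictionary is enough to compare them.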

### 📚 **Libraries Needed**
- Language Used
  - Python
- Libraries Used
  - Pandas
  - Seaborn
  - NumPy
  - Matplotlib

### 📊 **Exploratory Data Analysis Results**
<table>
<tr>
<td><img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/gender.png"></td>
<td><img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/purchased.png"></td>
</tr>
<tr>
<td><img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/age-purchased.png"></td>
<td><img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/salary-purchased.png"></td>
</tr>
<tr>
<td><img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/box-purchased-salary.png"></td>
<td><img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/box-purchased-age.png"></td>
</tr>
<tr>
<td><img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/purchased-gender.png"></td>
<td><img width=70% src='https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/correlation.png'></td>
</tr>
</table>
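
For reference, a correlation matrix like the one in the last plot can be computed with pandas alone. The snippet below is a sketch on synthetic data; the column names and the rule that generates `Purchased` are assumptions made for illustration, not properties of the real dataset.

```python
import numpy as np
import pandas as pd

# Synthetic data; column names and the target rule are illustrative assumptions.
rng = np.random.default_rng(1)
n = 300
age = rng.integers(18, 60, size=n)
salary = rng.integers(15_000, 150_000, size=n)
gender = rng.choice(["Male", "Female"], size=n)
purchased = (age + salary / 5_000 + rng.normal(0, 5, size=n) > 55).astype(int)

df = pd.DataFrame({
    "Gender": pd.Series(gender).map({"Male": 0, "Female": 1}),
    "Age": age,
    "EstimatedSalary": salary,
    "Purchased": purchased,
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()
print(corr.round(2))

# Features ranked by their correlation with the target.
print(corr["Purchased"].drop("Purchased").sort_values(ascending=False))
```

Passing `corr` to `seaborn.heatmap(corr, annot=True)` would render it as an annotated heatmap.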

### 📈 **Performance of the Models based on the Accuracy Scores**
<table>
<tr>
<td style="padding-right: 20px; vertical-align: top;">
<ul style="list-style-type: disc; margin: 0;">
<li>XGBoost Classifier - 92%</li>
<li>RandomForest - 90%</li>
<li>Gradient Boosting - 90%</li>
<li>Multi-Layer Perceptron - 87%</li>
</ul>
</td>
<td style="vertical-align: top;">
<img src="https://github.com/snega16/ML-Crate/blob/snega16/Advertisement%20Click%20Prediction/Images/accuracy.png" alt="Accuracy comparison of the models" style="max-width: 200px; max-height: 200px;">
</td>
</tr>
</table>
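
To make the link between a confusion matrix and an accuracy score concrete, here is a small worked example. It assumes scikit-learn's `confusion_matrix` and `accuracy_score`; the labels are hand-made for illustration and are not taken from the project's results.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hand-made labels: 1 = clicked, 0 = did not click (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes, in label order [0, 1].
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

# Accuracy is the share of correct predictions: (TP + TN) / total.
accuracy = accuracy_score(y_true, y_pred)

print(cm)                              # [[3 1]
                                       #  [1 3]]
print(f"accuracy = {accuracy:.2f}")    # accuracy = 0.75
```

Here 6 of the 8 predictions are correct (3 true negatives and 3 true positives), which gives the 0.75 accuracy.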


### 📢 **Conclusion**
Among all the models tested, the **XGBoost Classifier** achieved the highest accuracy, approximately **92%**, ahead of Random Forest and Gradient Boosting (90% each) and the Multi-Layer Perceptron (87%). On this dataset, it is the best-performing model for predicting advertisement clicks.

### ✒️ **Your Signature**
Created by [Suraj Kashyap](https://github.com/imsuraj675) as a part of SSOC'24.