This repository contains a machine learning pipeline for classifying the quality of sensorless drives based on various features. The pipeline is implemented in Python using the pandas, numpy, sklearn, and matplotlib libraries.
The data for this project is stored in the file Datasets/Sensorless_drive_diagnosis.txt. It contains X number of features for Y number of instances, with a class label for each instance indicating the quality of the sensorless drive.
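The feature and instance counts can be read directly from the file. The following is a minimal sketch, assuming the file is whitespace-delimited, has no header row, and stores the class label in the last column; none of these layout details are stated above. The helper names `load_drive_data` and `describe` and the column names are hypothetical.

```python
import pandas as pd


def load_drive_data(path):
    """Load the dataset. Assumes whitespace-delimited rows, no header, label last."""
    df = pd.read_csv(path, sep=r"\s+", header=None)
    # Hypothetical column names: f0..f{n-1} for features, "label" for the class.
    n_features = df.shape[1] - 1
    df.columns = [f"f{i}" for i in range(n_features)] + ["label"]
    return df


def describe(df):
    """Return (number of instances, number of features, per-class counts)."""
    n_instances, n_columns = df.shape
    class_counts = df["label"].value_counts().sort_index()
    return n_instances, n_columns - 1, class_counts
```

Calling `describe(load_drive_data("Datasets/Sensorless_drive_diagnosis.txt"))` would then report the instance count, the feature count, and how balanced the classes are.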
The pipeline performs the following steps:
- Read in the data from the file and do some basic data exploration (e.g. generate summary statistics, check for class balance, get information about the dataframe).
- Split the data into training and testing sets, using 80% of the data for training and 20% for testing.
- Train a decision tree classifier and use it to select important features from the training data.
- Standardize the features using statistics computed from the training data.
- Train and test a decision tree classifier, a multi-layer perceptron classifier, and a random forest classifier using the training and testing sets.
- Plot the performance of the models using a confusion matrix and generate a classification report for each model.
- Save the trained models to files using pickle.
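The steps above can be sketched roughly as follows. This is not the code in the repository: it runs on synthetic data from `make_classification` so that it works without the dataset file, and the stratified split, the random seed, the default hyperparameters, and the function name `run_pipeline` are all assumptions made for illustration. Plotting and saving to disk are left out.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def run_pipeline(X, y, seed=0):
    """Split, select features, standardize, then train and evaluate three models."""
    # 80/20 split; stratifying on y is an assumption, not stated in the README.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )

    # Keep features whose decision-tree importance is at least the mean
    # importance (the default threshold of SelectFromModel for tree models).
    selector = SelectFromModel(DecisionTreeClassifier(random_state=seed))
    selector.fit(X_train, y_train)
    X_train, X_test = selector.transform(X_train), selector.transform(X_test)

    # Fit the scaler on the training data only, then apply it to both sets.
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "mlp": MLPClassifier(max_iter=500, random_state=seed),
        "random_forest": RandomForestClassifier(random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "model": model,
            "confusion_matrix": confusion_matrix(y_test, y_pred),
            "report": classification_report(y_test, y_pred, zero_division=0),
        }
    return results


# Synthetic stand-in data (300 samples, 12 features, 3 classes).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, n_classes=3, random_state=0
)
results = run_pipeline(X, y)
for name, result in results.items():
    print(name)
    print(result["report"])
```

Fitting the selector and the scaler on the training split alone keeps information from the test split out of the preprocessing, which is the usual reason for splitting before these steps.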
To run this pipeline, you will need the following libraries:
- pandas
- numpy
- sklearn
- matplotlib

In addition, the VisualizeNN_mod module is provided in this repository and must be placed in the same directory as the other code files.
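A quick way to confirm that the libraries are available before running the pipeline is sketched below. The helper name `missing_packages` is hypothetical. Note that `sklearn` is the import name; the package is installed as scikit-learn.

```python
import importlib.util

# Import names of the libraries listed above.
REQUIRED = ["pandas", "numpy", "sklearn", "matplotlib"]


def missing_packages(names=REQUIRED):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
print("Missing:", ", ".join(missing) if missing else "none")
```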
The directory structure for this repository is as follows:
Sensorless Drive Diagnosis:
├── classification_main_v1.py
├── Datasets
│   ├── Sensorless_drive_diagnosis.txt
└── VisualizeNN_mod.py
To run the pipeline, run the script classification_main_v1.py with Python.
The performance of each model is printed to the console, including its confusion matrix and classification report. The trained models are saved to the Models directory.
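A saved model can later be loaded back with pickle and used for prediction. The sketch below shows one way to do the round trip; the file name `decision_tree.pkl`, the helper names `save_model` and `load_model`, and the tiny training set are hypothetical, since the names of the files written to the Models directory are not stated here.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.tree import DecisionTreeClassifier


def save_model(model, path):
    """Pickle a fitted model to `path`, creating the parent directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Load a pickled model. Only unpickle files from a trusted source."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Round trip on a toy model, written to a temporary directory.
toy_model = DecisionTreeClassifier(random_state=0).fit([[0.0], [1.0]], [0, 1])
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "Models" / "decision_tree.pkl"  # hypothetical name
    save_model(toy_model, model_path)
    restored = load_model(model_path)

print(restored.predict([[0.0], [1.0]]))
```

scikit-learn does not guarantee that a pickled model loads correctly under a different library version, so it is safest to load the files with the same scikit-learn version that produced them.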
The classification models were implemented using scikit-learn version 0.23.1. The VisualizeNN_mod module was modified from the original version by Milo Harper (https://github.com/miloharper/visualise-neural-network) for use in this project.