In this project, we have implemented a support vector classier to solve a binary class classication task. The task is distinguishing between images of handwritten digits from the MNIST database. Each sample is a 28x28 gray scale image, associated with a label. The project contains 3 different implementations grouped as Q1, Q2 and Q2 as show in the repository structure
**SVM_from_scratch**
│ README.md
│ file001.txt
| data.py
| train-images-idx3-ubyte
| train-labels-idx1-ubyte
└───Q1
│ │ SVM.py
│ │ run_1.py
|
└───Q2
│ │ SVM_decompose.py
│ │ run_2.py
|
└───Q3
│ │ MVP.py
│ │ run_3.py
In this implementation , we will only consider numbers the images and labels of numbers 3 and 8. To solve this task, we used "SLSQP" constraint optimizer from scipy library which solves the dual soft margin svm quadratic optimization problem with a linear/rbf/polynomial kernel. We have splitted data into train/test with fixed seed and 80%/20% proportions.
Here we have implemented the SVM light decomposition method. SVM light is an iterative method where in each step a sub set of data points are selected (which is called the working set) and SLSQP algorithm solves the quadratic optimization problem for the subset. Here the crucial part is selection of the subset at each iteration and the stopping criteria of the decomposition algorithm. The SVM light algorithm suggests choosing q/2 indices from and q/2 indices from where q cardinality of the working set and and are shown in formulas bellow.
. Based on that, the selection rule is defined as,
The ideal stopping criteria is satisfaction of the KKT conditions shown bellow.
But generally it takes a lot of iterations to satisfy this inequality exactly and this extremely increases the computation time. To avoid this, we let the violation of the KKT condition up to some small threshold which is our tolerance hyper-parameter equal to 0.0001
Here we have implemented the Sequential Minimal Optimization (SMO) algorithm to solve the same soft margin svm quadratic problem with an iterative method. SMO is a decomposition method. The value of q is fixed to 2, all the time. Fixing q=2 makes the sub problem a convex quadratic problem of two variables which can be solved analytically. In the selection of these 2 indices, we use the Most Violating (MVP) pairs. Two indices and are MVP pairs, if and only if equations bellow are satisfied and KKT condition not satisfied.
=======