This repository contains the code and data needed to reproduce the results of our research project on Self-Admitted Technical Debt in Pull Requests, carried out for the course Evidence-Based Software Engineering.
$ pip install -U requests lizard scipy scikit-learn
$ cd Preprocessing
$ python preprocess.py
$ cp new.csv ../QualitativeAnalysis/Round1/sampling_input.csv
$ cd ../QualitativeAnalysis/Round<x>
$ python ../subsample.py
$ cp nonsampled.csv ../Round<x+1>/sampling_input.csv
$ cd ../Round3
In subsample.py, set NUM_SAMPLES to 172 (one third of the total).
Then, sample for each of the researchers (Lars, Germán, Koen):
$ python ../subsample.py
Use the nonsampled.csv output as the input file for the next researcher.
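The chained sampling above, where each researcher samples from what the previous one left over, can be sketched roughly as follows. This is not the code of subsample.py: the function names `split_sample` and `chain_samples`, the `seed` parameter, and the representation of rows as lists are assumptions made for illustration.

```python
import random

NUM_SAMPLES = 172  # value this README gives for Round 3


def split_sample(rows, num_samples, seed=None):
    """Randomly pick num_samples rows; return (sampled, nonsampled).

    Hypothetical helper: the real subsample.py may work differently.
    """
    rng = random.Random(seed)
    picked = set(rng.sample(range(len(rows)), num_samples))
    sampled = [row for i, row in enumerate(rows) if i in picked]
    nonsampled = [row for i, row in enumerate(rows) if i not in picked]
    return sampled, nonsampled


def chain_samples(rows, names, num_samples, seed=None):
    """Sample once per researcher, feeding each leftover set to the next."""
    remaining = rows
    per_person = {}
    for offset, name in enumerate(names):
        # Vary the seed per researcher so runs stay reproducible (assumption).
        person_seed = None if seed is None else seed + offset
        per_person[name], remaining = split_sample(
            remaining, num_samples, person_seed
        )
    return per_person, remaining
```

Calling `chain_samples(rows, ["Lars", "German", "Koen"], NUM_SAMPLES)` on the rows of sampling_input.csv would give three disjoint samples plus whatever is left unsampled, which mirrors running the script three times by hand.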
Set NUM_SAMPLES to 17. Then sample from Round3/sampled_{Lars,German,Koen}.csv:
$ cd ../Round4
$ python ../subsample.py
$ cd Agreement
$ python kappa.py
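This README does not say which kappa statistic kappa.py computes. As an illustration of what an agreement score measures, here is a minimal sketch of Cohen's kappa for two raters, written with the standard library only. The function name `cohen_kappa` and the labels in the usage note are assumptions for illustration, not taken from the repository.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labelled the same items.

    Illustrative sketch; kappa.py may use a different statistic or library.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each rater's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)
```

For example, `cohen_kappa(["design", "test"], ["design", "test"])` is 1.0 (full agreement), while agreement at chance level gives 0. scikit-learn, which is installed above, provides the same statistic as `sklearn.metrics.cohen_kappa_score`.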
$ cd ../../QuantitativeAnalysis
$ python codechanges.py
$ cd Languages
$ python languages.py
$ cd ../Data_analysis
$ python merge_categories.py
$ python generate_metrics.py
$ cd ../Statistical_Significance
$ python levene.py
$ python ttest.py
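Running Levene's test before the t-test is a common way to decide whether equal variances can be assumed. The sketch below shows that workflow with SciPy; it is not the code of levene.py or ttest.py. The function name `compare_groups`, the 0.05 threshold, and the choice to fall back to Welch's t-test are assumptions.

```python
from scipy import stats

ALPHA = 0.05  # assumed significance level, not taken from the repository


def compare_groups(group_a, group_b, alpha=ALPHA):
    """Run Levene's test, then a t-test that matches its outcome.

    Illustrative sketch: the repository's scripts may differ.
    """
    # Levene's test: the null hypothesis is that both groups have equal variance.
    _, levene_p = stats.levene(group_a, group_b)
    equal_var = bool(levene_p >= alpha)
    # equal_var=False makes ttest_ind perform Welch's t-test instead.
    t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=equal_var)
    return {
        "levene_p": float(levene_p),
        "equal_var": equal_var,
        "t_stat": float(t_stat),
        "t_p": float(t_p),
    }
```

With two lists of per-pull-request metric values, `compare_groups(a, b)["t_p"]` gives the p-value of the comparison, and `"equal_var"` records which variant of the t-test was used.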