Collaborators:
Anthony F. Botelho, Ph.D.
Ashish Gurung, Ph.D.
Link to the data [NOTE: While the code is available under the MIT license, the dataset is provided under a different license, which can be found here.]
To replicate the results without going through the preprocessing, download the following three CSV files from the drive:
- RTD_data_randomsample_20K_new.csv
- hint_infos.csv
- assignment_problem_npc_infos_with_priors.csv
Once you have saved the CSV files in the data folder in your workspace, run
../analysis/paper_results_replication_file.py
and the results in the paper should be replicated.
[NOTE: As our analysis was exploratory in nature, paper_results_replication_file.py only replicates what we reported in the paper. The other files provide insight into the other aspects of user behavior that we explored.]
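Before running the replication script, it can help to confirm that the three CSV files are where the script expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_inputs` is hypothetical, and the location of the data folder relative to your working directory is an assumption you may need to adjust.

```python
from pathlib import Path

# File names as listed above.
REQUIRED_FILES = [
    "RTD_data_randomsample_20K_new.csv",
    "hint_infos.csv",
    "assignment_problem_npc_infos_with_priors.csv",
]


def missing_inputs(data_dir):
    """Return the required CSV names that are not present in data_dir.

    Hypothetical helper: data_dir is assumed to be the data folder
    in your workspace.
    """
    data_dir = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (data_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs("data")  # assumed folder location
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All input files found.")
```

An empty list means all three inputs were found in the folder you passed in.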
- libreoffice_prep.py
This is the first script. It sorts the data, ensures that everything is in order, and automates the generation of the additional features. It takes the ...random_sample_20K.csv data and outputs RTD_data_randomsample_20K.csv.
Once RTD_data_randomsample_20K.csv is generated, using LibreOffice to generate the feature values is the quicker option. The preprocessing in Python was taking very long, so we looked into whether it made more sense to do it in LibreOffice.
- Make sure to run the libreoffice_prep.py beforehand
Run the preprocessing to clean the PR and PS columns along with the pair features.
- Generate action_action_pairs
This pairs consecutive relevant actions made by a user on each problem to generate the action pairs the user made while solving the problem.
Formula:
=IF(M2 = -1, K2, CONCAT(K2, "_", K3))
column M : pr
column K : action_type
- Generate action_action_pairs_time_taken
This calculates the time taken by a user for each action pair while solving the problem.
Formula:
=ROUND(IF(OR(L2 <> L3, C2<>C3), 0, G3 - G2), 4)
column L : ps
column G : action_unix_time [Unix time; 1 unit = 1 second]
column C : user_xid
- Generate pr_answered_correctly_pair
This checks whether the action pair led to a correct answer to the pr.
Formula:
=IF(AND(K3="StudentResponseAction", M4=-1), 1, IF(AND(M3=-1, M2 <> -1),P1,0))
column N: action_action_pairs
- Generate attempts made per Problem:
This counts the attempts a student made in order to answer the pr.
Formula:
=IF(M2 <> -1, IF(K3="StudentResponseAction", Q1+1, Q1), 0)
column M: pr
column K: action_type
column Q: number_of_attempts made in the problem
- Generate hint requested per Problem:
This counts the hints a student requested while answering the pr.
Formula:
=IF(M2 <> -1, IF(K3="HintRequestedAction", R1+1, R1), 0)
column M: pr
column K: action_type
column R: number_of_hints accessed in the problem
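For readers who prefer to check the spreadsheet logic in code, the sketch below translates three of the formulas above (action pairs, time taken, and attempts) row by row into plain Python. It is an illustration only, not the script used for the paper: the function name `add_pair_features` is hypothetical, the rows are assumed to be sorted exactly as in the sheet, and the running attempt count is assumed to start at 0 above the first data row.

```python
def add_pair_features(rows):
    """Row-by-row translation of three of the spreadsheet formulas.

    rows: list of dicts, already sorted as in the sheet, with the keys
    action_type, pr, ps, user_xid and action_unix_time.
    """
    out = []
    attempts = 0  # assumption: the count starts at 0 before the first row
    for i, row in enumerate(rows):
        nxt = rows[i + 1] if i + 1 < len(rows) else None
        # An empty cell below the last row behaves like an empty string.
        next_action = nxt["action_type"] if nxt else ""

        # =IF(M2 = -1, K2, CONCAT(K2, "_", K3))
        if row["pr"] == -1:
            pair = row["action_type"]
        else:
            pair = row["action_type"] + "_" + next_action

        # =ROUND(IF(OR(L2 <> L3, C2 <> C3), 0, G3 - G2), 4)
        if nxt is None or row["ps"] != nxt["ps"] or row["user_xid"] != nxt["user_xid"]:
            time_taken = 0
        else:
            time_taken = round(nxt["action_unix_time"] - row["action_unix_time"], 4)

        # =IF(M2 <> -1, IF(K3="StudentResponseAction", Q1+1, Q1), 0)
        if row["pr"] != -1:
            if next_action == "StudentResponseAction":
                attempts += 1
        else:
            attempts = 0

        out.append({
            **row,
            "action_action_pairs": pair,
            "action_action_pairs_time_taken": time_taken,
            "number_of_attempts": attempts,
        })
    return out


# Illustrative rows only; the values are made up for the example.
sample = [
    {"user_xid": 1, "ps": 10, "pr": 0, "action_type": "HintRequestedAction", "action_unix_time": 100.0},
    {"user_xid": 1, "ps": 10, "pr": 1, "action_type": "StudentResponseAction", "action_unix_time": 105.5},
    {"user_xid": 1, "ps": 10, "pr": -1, "action_type": "StudentResponseAction", "action_unix_time": 110.0},
    {"user_xid": 2, "ps": 10, "pr": 0, "action_type": "StudentResponseAction", "action_unix_time": 200.0},
]
result = add_pair_features(sample)
```

The hint count follows the same pattern as the attempt count, with "HintRequestedAction" in place of "StudentResponseAction".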
- Make sure to run the libreoffice_prep.py beforehand