Ease IMPACT’s data and database officers woRk
impactR started as a simple project: mainly a reminder of far-from-perfect functions written and used on the go for the Burkina Faso team in 2021. It has since become broader and now aims to ease data teams' daily R work and to cover most of the research cycle's tasks.
It is based on three spreadsheets that need to be filled in and coordinated by either assessment officers, data officers or field officers:
- To monitor data collection and get a log to fill in: a spreadsheet of logical tests based on the questionnaire and the Kobo tool
- To clean data: a cleaning log that has been (properly) filled in
- To analyze data: a data analysis plan
Specs:
- mainly, it is aimed at data collection with Kobo
- it extensively uses the tidyverse, and srvyr for survey data analysis
- since version 0.7.8, it is considered robust enough and has been tested on 3 different
- it requires R 4.1+ (mostly for the native pipe |>)
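As a quick illustration of the R 4.1+ requirement, here is a minimal base R sketch of the native pipe. The `hh_size` vector is made up for the example and has nothing to do with impactR itself.

```r
# Fail early when the running R version predates the native pipe (R 4.1.0)
stopifnot(getRversion() >= "4.1.0")

# Toy household sizes, made up for illustration
hh_size <- c(3, 5, 4, 8)

# Nested call, read from the inside out
nested <- round(mean(sqrt(hh_size)), 2)

# Same computation, read left to right with the native pipe
piped <- hh_size |> sqrt() |> mean() |> round(2)

# Both forms give the same value
identical(nested, piped)
```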
You can install the latest version of impactR from GitHub with:
# install.packages("devtools")
devtools::install_github("gnoblet/impactR", build_vignettes = TRUE)
From version 0.6 onwards, contributions should come as minimal and complete commits, as a good practice. The dev branch will be used from there on.
Well, in practice, it isn’t much.
Roadmap is as follows:
- introduce tidy eval wherever it makes sense
- add (re)count columns post-cleaning for multiple-choice columns and for the "other" column of single-choice questions
- write more documentation
- bring tidy eval to cleaning functions
- dots not as the last arg, not always at least
- functions to create a small report of the values that were actually changed or removed when cleaning with a cleaning log
- more robust functions to check the cleaning log and the check list
- export clean (open)-xlsx files
- add a grouping arg to make_log_outlier()
- (ongoing) MSNA analysis tools: roster (education, demography, WGI), weighting functions, analysis functions
- (ongoing) Split this big mess into several consolidated small packages: a viz one, an analysis one and a cleaning one
There will be a Shiny app for cleaning and monitoring (in French for now) whose repo will be collectoR. It is experimental and based on older versions of impactR.
Yay! Some documentation:
- The main vignette for the main workflow (French version)
- The main vignette for the main workflow (English version)
In R, use:
vignette("base_de_travail", "impactR")
vignette("main_workflow", "impactR")
These are basic examples of daily use:
# Attach all functions, equivalent to library("impactR")
box::use(impactR[...])
## Basic example code and uses (not run!)
## Import a csv file with clean names and clean types, guessing types on the maximum number of lines
# import_csv("data.csv")
## Get colnames for sector foodsec whose variables start with "f_"
# tbl_col_start(data, "f_")
## Group split to a named list
# named_group_split(data, admin2)
## Left join many tibbles
# left_joints(tibble_list, id_col)
## Make an outlier log for all numeric variables in the data.frame/tibble
# make_log_outlier(rawdata, survey, id_col = uuid, i_enum_id)
## Make a log based on logical tests, outliers and "other" answers
# make_all_logs(rawdata,
#               survey,
#               check_list,
#               other = "other_",
#               id_col = uuid,
#               i_enum_id)
## Clean from log
# make_all_logs(rawdata,
#               log,
#               survey,
#               choices,
#               other = "other_",
#               id_col = uuid)
## Calculate weighted proportion for shelter type by group (e.g. administrative areas or population groups)
# svy_prop(design, s_shelter_type, c(admin1, group_pop), na.rm = TRUE, stat_name = "prop", level = 0.95)
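To make the last example more concrete, here is a minimal sketch of a weighted proportion by group written directly with srvyr, which impactR relies on for survey analysis. It is not impactR's implementation of svy_prop(). The toy data set and its `weight` column are assumptions made up for illustration; only `as_survey_design()`, `group_by()`, `summarise()` and `survey_prop()` from srvyr are used.

```r
library(srvyr)

# Toy data, made up for illustration: 2 areas, 20 households each,
# alternating shelter types; "tent" rows weigh 1 and "house" rows weigh 3
toy <- data.frame(
  admin1         = rep(c("A", "B"), each = 20),
  s_shelter_type = rep(c("tent", "house"), times = 20),
  weight         = rep(c(1, 3), times = 20)
)

# Declare the survey design (weights only, no strata or clusters here)
design <- toy |>
  as_survey_design(weights = weight)

# survey_prop() computes the share of the last grouping variable
# (shelter type) within the other grouping variables (admin1),
# here with a 95% confidence interval
shelter_prop <- design |>
  group_by(admin1, s_shelter_type) |>
  summarise(prop = survey_prop(vartype = "ci", level = 0.95))

shelter_prop
```

In each area, the weighted share of tents is 10 / (10 + 30), since the ten tent households weigh 1 each and the ten house households weigh 3 each. With real data, the design would typically also declare strata or clusters.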