diff --git a/DESCRIPTION b/DESCRIPTION old mode 100755 new mode 100644 index 75cbe9166..66d0d79be --- a/DESCRIPTION +++ b/DESCRIPTION @@ -21,10 +21,16 @@ Authors@R: c(person("Vincent T","van Hees",role=c("aut","cre"), comment = c(ORCID = "0000-0002-6033-960X")), person(given = "Taren",family = "Sanders", role = "ctb"), person("Chenxuan","Zhao",role="ctb"), + person("Ian","Meneghel Danilevicz", role="ctb", + email="ian.meneghel-danilevicz @ inserm.fr", + comment = c(ORCID = "0000-0003-4541-0524")), + person("Victor","Barreto Mesquita", role="ctb", + email="victormesquita40@hotmail.com"), person("Gaia","Segantin",role="ctb"), person("Medical Research Council UK", role = c("cph", "fnd")), person("Accelting", role = c("cph", "fnd")), - person("French National Research Agency", role = c("cph", "fnd"))) + person("French National Research Agency", role = c("cph", "fnd")) + ) Maintainer: Vincent T van Hees Description: A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' , binary (.gt3x) and .csv-export data from 'Actigraph' devices, and binary (.cwa) and .csv-export data from 'Axivity' . These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data file from any other sensor brand providing that the data is stored in csv format. Also the package allows for external function embedding. URL: https://github.com/wadpac/GGIR/, https://groups.google.com/forum/#!forum/RpackageGGIR, https://wadpac.github.io/GGIR/ diff --git a/NAMESPACE b/NAMESPACE index e99869c32..e6dd214e6 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -34,14 +34,15 @@ export(g.analyse, g.calibrate, CalcSleepRegularityIndex, load_params, check_params, extract_params, g.imputeTimegaps, g.part5.classifyNaps, - tidyup_df, cosinorAnalyses, - detect_nonwear_clipping, applyCosinorAnalyses, + tidyup_df, cosinor_IS_IV_Analyses, + detect_nonwear_clipping, apply_cosinor_IS_IV_Analyses, ShellDoc2Vignette, parametersVignette, correctOlderMilestoneData, convertEpochData, appendRecords, extractID, g.part5_analyseSegment, g.part5_initialise_ts, g.part5.analyseRest, part6AlignIndividuals, - part6PairwiseAggregation, g.part6, g.report.part6, check_log, g.report.part5_dictionary) + part6PairwiseAggregation, g.part6, g.report.part6, + check_log, g.report.part5_dictionary, DFA, ABI, SSP) importFrom("grDevices", "colors", "dev.off", "pdf", @@ -55,8 +56,8 @@ importFrom("utils", "read.csv", "sessionInfo", "write.csv", "memory.limit", "memory.size", "tail", "unzip", "lsf.str") importFrom("stats", "aggregate.data.frame", "weighted.mean", "rnorm","median","aggregate","C", "lm.wfit", - "quantile", "sd","coef", "lm", "embed", "na.pass", - "residuals", "fitted", "reshape", "cor") + "quantile", "sd","coef", "lm", "embed", "na.pass", "na.omit", + "residuals", "fitted", "reshape", "cor", "arima", "lm.fit") import(data.table) importFrom("methods", "is") importFrom("utils", "available.packages", "packageVersion", "help", "read.table") diff --git a/NEWS.md b/NEWS.md index 03f271c36..4b0a6a9bc 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,3 +1,30 @@ +# CHANGES IN GGIR VERSION 3.1-?? 
+
+- Part 6:
+
+  - Add DFA functionality #839 (credits: Ian Meneghel Danilevicz and Victor Barreto Mesquita)
+
+  - Add fragmentation metrics, same function as in part 5 but now applied at recording level #839
+
+  - Circadian rhythm analysis now ignores invalid nights, controlled with new parameter "includecrit.part6".
+
+  - Circadian rhythm analysis now provides consistent results regardless of whether time series are stored with or without the invalid windows (parameter save_ms5raw_without_invalid).
+
+- Part 2 + 6: Revised and simplified IV and IS calculation, which now ignores invalid timestamps and also comes with the phi statistic (credits: Ian Meneghel Danilevicz). In part 2 we used to have two calculations, which are now replaced by just one, applied to all valid data points in the recording. In part 6 this is repeated for the time window specified with parameter part6Window. Further, IVIS now uses argument threshold.lig as the threshold to distinguish inactivity from activity.
+
+- Part 2: Increase sensitivity of clipping detection from 80% to 30% of a window. If the long epoch has more than 30% of its values close to the dynamic range of the sensor then it will be labelled as clipping.
+
+- Part 4: Now also has a copy of part 5 variable ACC_spt_mg.
+
+- Part 5:
+
+  - Move LXMX from part 5 to part 6 as these are circadian rhythm analyses.
+
+  - Omit fragmentation metrics for the spt window as these are not meaningful given that part 5 only focusses on valid daytime behaviours.
+
+  - includenightcrit.part5 added to allow for controlling the inclusion of windows in part 5 based on their amount of valid data during the spt window. Default remains the same.
+
+  - Fix incorrect usage of part 5 inclusion criteria, which treated a fraction as a percentage.
+
 # CHANGES IN GGIR VERSION 3.1-5

 - Part 5:
diff --git a/R/ABI.R b/R/ABI.R
new file mode 100644
index 000000000..b534c0329
--- /dev/null
+++ b/R/ABI.R
@@ -0,0 +1,33 @@
+# NOTE ABOUT DOCUMENTATION:
+# GGIR does not use Roxygen. The documentation below is not used.
+# All function documentation can be found in the man/*.Rd files.
+# Please edit documentation there.
+#
+#- @title Activity balance index (ABI)
+#-
+#- @description This function estimates the Activity balance index (ABI), which is a transformation of the self-similarity parameter (SSP), also known as scaling exponent or alpha.
+#- @param x the estimated self-similarity parameter (SSP)
+#-
+#- @details ABI = exp(-abs(SSP - 1) / exp(-2))
+#-
+#- @return The estimated Activity balance index (ABI) is a real number between zero and one.
+#-
+#- @author Ian Meneghel Danilevicz
+#-
+#- @references C.-K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger Phys. Rev. E, 49 (1994), p. 1685
+#- Mesquita, Victor & Filho, Florencio & Rodrigues, Paulo. (2020). Detection of crossover points in detrended fluctuation analysis: An application to EEG signals of patients with epilepsy. Bioinformatics. 10.1093/bioinformatics/btaa955.
+#-
+#- @examples
+#- # Estimate the Activity balance index of a well-known time series available in base R: sunspot.year.
+#-
+#- ssp = SSP(sunspot.year, scale = 1.2)
+#- abi = ABI(ssp)
+#-
+ABI = function(x){
+  if (is.na(x)) {
+    y = NA
+  } else {
+    # equivalent to exp(-abs(x - 1) / exp(-2)) as given in the details above
+    y = exp(-abs(x - 1) * exp(2))
+  }
+  return(y)
+}
\ No newline at end of file
diff --git a/R/DFA.R b/R/DFA.R
new file mode 100644
index 000000000..06a7dff5a
--- /dev/null
+++ b/R/DFA.R
@@ -0,0 +1,86 @@
+# NOTE ABOUT DOCUMENTATION:
+# GGIR does not use Roxygen. The documentation below is not used.
+# All function documentation can be found in the man/*.Rd files.
+# Please edit documentation there.
+#
+#- @title Detrended Fluctuation Analysis
+#-
+#- @description ...
+#- @usage DFA(data, scale = 2^(1/8), box_size = 4, m = 1)
+#- @param data Univariate time series (must be a vector or data frame)
+#- @param scale Specifies the ratio between successive box sizes (by default scale = 2^(1/8))
+#- @param box_size Vector of box sizes (must be used in conjunction with scale = "F")
+#- @param m An integer of the polynomial order for the detrending (by default m=1)
+#-
+#- @details The DFA fluctuation can be computed in a geometric scale or for different choices of box sizes.
+#-
+#- @return Estimated alpha is a real number between zero and two.
+#-
+#- @note It is not possible to estimate alpha for multiple time series at once.
+#-
+#- @author Ian Meneghel Danilevicz and Victor Mesquita
+#-
+#- @references C.-K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger Phys. Rev. E, 49 (1994), p. 1685
+#- Mesquita, Victor & Filho, Florencio & Rodrigues, Paulo. (2020). Detection of crossover points in detrended fluctuation analysis: An application to EEG signals of patients with epilepsy. Bioinformatics. 10.1093/bioinformatics/btaa955.
+#-
+#- @examples
+#- # Estimate the self-similarity of a well-known time series available in base R: sunspot.year.
+#- # Then the time spent by each method can be compared.
+#-

DFA = function(data, scale = 2^(1/8), box_size = 4, m = 1){
+  if (inherits(x = data, "data.frame")) {
+    data = data[, 1]
+  }
+  N = length(data)
+  if (scale != "F") {
+    # build a geometric sequence of box sizes, starting at 4 and growing by
+    # factor scale, up to roughly a quarter of the series length
+    box_size <- NULL
+    n = 1
+    n_aux <- 0
+    box_size[1] <- 4
+    for (n in 1:N) {
+      while (n_aux < round(N/4)) {
+        n_aux <- box_size[1]
+        n = n + 1
+        box_size[n] <- ceiling(scale * box_size[n - 1])
+        n_aux <- box_size[n] + 4
+      }
+    }
+  }
+  # number of complete boxes available per box size
+  ninbox2 <- NULL
+  for (j in 1:length(box_size)) {
+    ninbox2[j] <- N %/% box_size[j]
+  }
+  aux_seq = seq_along(ninbox2)
+  aux_length = aux_seq[length(aux_seq)]
+  # integrated profile of the series
+  y_k = cumsum(data) - mean(data)
+  rm(data, n_aux, scale)
+  aux_mat = matrix(nrow = aux_length, ncol = 2)
+  for (j in seq_along(ninbox2)) {
+    W = box_size[j] * ninbox2[j]
+    aux_j = numeric(W)
+    fit = y_k
+    for (i in seq_len(W)) {
+      if (i == 1) {
+        aux_j[1] = box_size[j]
+        tmp1 = i:aux_j[i]
+      } else if (i >= 2) {
+        aux_j[i] = aux_j[i - 1] + box_size[j]
+        tmp1 = (aux_j[i - 1] + 1):(aux_j[i])
+      }
+      # detrend each box with a polynomial of order m
+      if (m == 1) {
+        fit[tmp1] = lm.fit(x = cbind(1, tmp1), y = y_k[tmp1])$fitted.values
+      } else {
+        fit[tmp1] = lm(y_k[tmp1] ~ poly(tmp1, m, raw = TRUE))$fitted.values
+      }
+      if (i >= ninbox2[j]) {
+        aux_j[i] <- 0
+      }
+    }
+    # root mean square fluctuation around the local trends for this box size
+    aux_mat[j,] = c(round(box_size[j], digits = 0),
+                    sqrt((1 / N) * sum((y_k[1:W] - fit[1:W]) ^ 2)))
+  }
+  colnames(aux_mat) <- c("box", "DFA")
+  aux_list = aux_mat
+  return(aux_list)
+}
\ No newline at end of file
diff --git a/R/GGIR.R b/R/GGIR.R
index 95e658528..a7f825cf3 100644
--- a/R/GGIR.R
+++ b/R/GGIR.R
@@ -320,7 +320,7 @@ GGIR = function(mode = 1:5, datadir = c(), outputdir = c(),
       }
       g.part6(datadir = datadir, metadatadir = metadatadir, f0 = f0, f1 = f1,
               params_general = params_general, params_phyact = params_phyact,
-              params_247 = params_247,
+              params_247 = params_247, params_cleaning = params_cleaning,
               verbose = verbose)
     }
diff --git a/R/SSP.R b/R/SSP.R
new file mode 100644
index 000000000..4a914ce1e
--- /dev/null
+++ b/R/SSP.R
@@ -0,0 +1,45 @@
+# NOTE ABOUT DOCUMENTATION:
+# GGIR does not use Roxygen. The documentation below is not used.
+# All function documentation can be found in the man/*.Rd files.
+# Please edit documentation there.
+#
+#- @title Estimated self-similarity parameter
+#-
+#- @description This function estimates the self-similarity parameter (SSP), also known as scaling exponent or alpha.
+#- @usage SSP(data, scale = 2^(1/8), box_size = 4, m = 1)
+#- @param data Univariate time series (must be a vector or data frame)
+#- @param scale Specifies the ratio between successive box sizes (by default scale = 2^(1/8))
+#- @param box_size Vector of box sizes (must be used in conjunction with scale = "F")
+#- @param m An integer of the polynomial order for the detrending (by default m=1)
+#-
+#- @details The DFA fluctuation can be computed in a geometric scale or for different choices of box sizes.
+#-
+#- @return Estimated alpha is a real number between zero and two.
+#-
+#- @note It is not possible to estimate alpha for multiple time series at once.
+#-
+#- @author Ian Meneghel Danilevicz and Victor Mesquita
+#-
+#- @references C.-K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger Phys. Rev. E, 49 (1994), p. 1685
+#- Mesquita, Victor & Filho, Florencio & Rodrigues, Paulo. (2020). Detection of crossover points in detrended fluctuation analysis: An application to EEG signals of patients with epilepsy. Bioinformatics. 10.1093/bioinformatics/btaa955.
+#-
+#- @examples
+#- # Estimate the self-similarity of a well-known time series available in base R: sunspot.year.
+#- # Then the time spent by each scale choice can be compared.
+#-
+#- SSP(sunspot.year, scale = 2)
+#- SSP(sunspot.year, scale = 1.2)

SSP = function(data, scale = 2^(1/8), box_size = 4, m = 1){
+  if (inherits(x = data, "data.frame")) {
+    data = data[, 1]
+  }
+  if (length(data) <= box_size || any(is.na(data))) {
+    alpha_hat = NA
+  } else {
+    # regress log fluctuation on log box size; the slope is the estimated alpha
+    dfa_hat = DFA(data, scale = scale, box_size = box_size, m = m)
+    est_ols = lm(log(dfa_hat[,2]) ~ log(dfa_hat[,1]))
+    alpha_hat = est_ols$coefficients[[2]]
+  }
+  return(alpha_hat)
+}
\ No newline at end of file
diff --git a/R/applyCosinorAnalyses.R b/R/apply_cosinor_IS_IV_Analyses.R
similarity index 89%
rename from R/applyCosinorAnalyses.R
rename to R/apply_cosinor_IS_IV_Analyses.R
index 24dbb4a64..13dd9b764 100644
--- a/R/applyCosinorAnalyses.R
+++ b/R/apply_cosinor_IS_IV_Analyses.R
@@ -1,10 +1,10 @@
-applyCosinorAnalyses = function(ts, qcheck, midnightsi, epochsizes) {
+apply_cosinor_IS_IV_Analyses = function(ts, qcheck, midnightsi, epochsizes, threshold = NULL) {
   # qcheck - vector of length ts to indicate invalid values
   ws2 = epochsizes[2]
   ws3 = epochsizes[1]
   # Re-derive Xi but this time include entire time series
   # Here, we ignore duplicated values (when clock moves backward due to DST)
-  handleDST = !duplicated(ts)
+  handleDST = !duplicated(ts$time)
   qcheck = qcheck[handleDST]
   Xi = ts[handleDST, grep(pattern = "time", x = colnames(ts), invert = TRUE)]
   Nlong_epochs_day = (1440 * 60) / ws2 # this is 96 by default
@@ -59,11 +59,8 @@
     } else {
       epochsize = ws3
     }
-    # log transform of data in millig
-    notna = !is.na(Xi)
-    Xi[notna] = log((Xi[notna]*1000) + 1)
-
-    cosinor_coef = cosinorAnalyses(Xi = Xi, epochsize = epochsize, timeOffsetHours = timeOffsetHours)
+    cosinor_coef = cosinor_IS_IV_Analyses(Xi = Xi, epochsize = epochsize,
+                                          timeOffsetHours = timeOffsetHours, threshold = threshold)
     cosinor_coef$timeOffsetHours = timeOffsetHours
   } else {
    cosinor_coef = c()
diff --git a/R/check_params.R b/R/check_params.R
index 74ce70145..9d63fa7b6 100644
--- a/R/check_params.R
+++ b/R/check_params.R
@@ -84,7
+84,7 @@ check_params = function(params_sleep = c(), params_metrics = c(), "IVIS.activity.metric", "IVIS_acc_threshold", "qM5L5", "MX.ig.min.dur", "M5L5res", "winhr", "LUXthresholds", "LUX_cal_constant", "LUX_cal_exponent", "LUX_day_segments", "L5M5window") - boolean_params = c("cosinor", "part6CR", "part6HCA") + boolean_params = c("cosinor", "part6CR", "part6HCA", "part6DFA") character_params = c("qwindow_dateformat", "part6Window") check_class("247", params = params_247, parnames = numeric_params, parclass = "numeric") check_class("247", params = params_247, parnames = boolean_params, parclass = "boolean") @@ -104,7 +104,7 @@ check_params = function(params_sleep = c(), params_metrics = c(), if (length(params_cleaning) > 0) { numeric_params = c("includedaycrit", "ndayswindow", "data_masking_strategy", "maxdur", "hrs.del.start", "hrs.del.end", "includedaycrit.part5", "minimum_MM_length.part5", - "includenightcrit", "max_calendar_days") + "includenightcrit", "max_calendar_days", "includecrit.part6", "includenightcrit.part5") boolean_params = c("excludefirstlast.part5", "do.imp", "excludefirstlast", "excludefirst.part4", "excludelast.part4", "nonWearEdgeCorrection") character_params = c("data_cleaning_file", "TimeSegments2ZeroFile") @@ -277,12 +277,22 @@ check_params = function(params_sleep = c(), params_metrics = c(), replacement = "/", x = params_cleaning[["data_cleaning_file"]]) } if (params_cleaning[["includedaycrit.part5"]] < 0) { - warning("\nNegative value of includedaycrit.part5 is not allowed, please change.") + stop("\nNegative value of includedaycrit.part5 is not allowed, please change.") } else if (params_cleaning[["includedaycrit.part5"]] > 24) { - warning(paste0("\nIncorrect value of includedaycrit.part5, this should be", + stop(paste0("\nIncorrect value of includedaycrit.part5, this should be", " a fraction of the day between zero and one or the number ", "of hours in a day.")) } + if (params_cleaning[["includenightcrit.part5"]] < 0) { + stop("\nNegative value of includenightcrit.part5 is not allowed, please change.") + } else if (params_cleaning[["includenightcrit.part5"]] > 24) { + stop(paste0("\nIncorrect value of includenightcrit.part5, this should be", + " a fraction of the day between zero and one or the number ", + "of hours in a day.")) + } + if (any(params_cleaning[["includecrit.part6"]] < 0) | any(params_cleaning[["includecrit.part6"]] > 1)) { + stop("Values of includecrit.part6 are not in the range [0, 1]. 
Please fix.") + } } if (length(params_phyact) > 0) { if (length(params_phyact[["bout.metric"]]) > 0 | @@ -345,7 +355,7 @@ check_params = function(params_sleep = c(), params_metrics = c(), } } # params 247 & params output - if (params_247[["part6HCA"]] == TRUE) { + if (params_247[["part6HCA"]] == TRUE || params_247[["part6CR"]] == TRUE) { # Add RData because part 6 will need it params_output[["save_ms5raw_format"]] = unique(c(params_output[["save_ms5raw_format"]], "RData")) params_output[["save_ms5rawlevels"]] = TRUE diff --git a/R/cosinorAnalyses.R b/R/cosinorAnalyses.R deleted file mode 100644 index c66c54972..000000000 --- a/R/cosinorAnalyses.R +++ /dev/null @@ -1,45 +0,0 @@ -cosinorAnalyses = function(Xi, epochsize = 60, timeOffsetHours = 0) { - # Apply Cosinor function from ActRC - N = 1440 * (60 / epochsize) # Number of epochs per day - Xi = Xi[1:(N * floor(length(Xi) / N))] # ActCR expects integer number of days - coef = ActCR::ActCosinor(x = Xi, window = 1440 / N) - - # Apply Extended Cosinor function from ActRC (now temporarily turned of to apply my own version) - # ActCR:: - coefext = ActCR::ActExtendCosinor(x = Xi, window = 1440 / N, export_ts = TRUE) # need to set lower and upper argument? - # Correct time estimates by offset in start of recording - add24ifneg = function(x) { - if (x < 0) x = x + 24 - return(x) - } - coef$params$acrotime = add24ifneg(coef$params$acrotime - timeOffsetHours) - coefext$params$UpMesor = add24ifneg(coefext$params$UpMesor - timeOffsetHours) - coefext$params$DownMesor = add24ifneg(coefext$params$DownMesor - timeOffsetHours) - coefext$params$acrotime = add24ifneg(coefext$params$acrotime - timeOffsetHours) - # do same for acrophase in radians (24 hours: 2 * pi) - # take absolute value of acrophase, because it seems ActCR provides negative value in radians, - # which is inverse correlated with acrotime - coef$params$acr = abs(coef$params$acr) - ((timeOffsetHours / 24) * 2 * pi) - k = ceiling(abs(coef$params$acr) / (pi * 2)) - if (coef$params$acr < 0) coef$params$acr = coef$params$acr + (k * 2 * pi) - # Perform IVIS on the same input signal to allow for direct comparison - IVIS = g.IVIS(Xi = Xi / 1000, # divide by 1000 because function g.IVIS internally multiplies by 1000 when IVIS.activity.metric = 2 - epochsizesecondsXi = epochsize, - IVIS_windowsize_minutes = 60, - IVIS.activity.metric = 2, - IVIS_acc_threshold = log(20 + 1), - IVIS_per_daypair = TRUE) # take log, because Xi is logtransformed with offset of 1 - - coefext$params$R2 = cor(coefext$cosinor_ts$original, coefext$cosinor_ts$fittedYext)^2 - coef$params$R2 = cor(coefext$cosinor_ts$original, coefext$cosinor_ts$fittedY)^2 - - # # this should equal: https://en.wikipedia.org/wiki/Coefficient_of_determination - # yi = coefext$cosinor_ts$original - # fi = coefext$cosinor_ts$fittedY - # meanY = mean(coefext$cosinor_ts$original) - # SSres = sum((yi - fi)^2) - # SStot = sum((y - meanY)^2) - # R2 = 1 - (SSres / SStot) - - invisible(list(coef = coef, coefext = coefext, IVIS = IVIS)) -} \ No newline at end of file diff --git a/R/cosinor_IS_IV_Analyses.R b/R/cosinor_IS_IV_Analyses.R new file mode 100644 index 000000000..a4f57b30d --- /dev/null +++ b/R/cosinor_IS_IV_Analyses.R @@ -0,0 +1,62 @@ +cosinor_IS_IV_Analyses = function(Xi, epochsize = 60, timeOffsetHours = 0, threshold = NULL) { + if (length(threshold) > 1) { + threshold = threshold[1] + warning("Multiple threshold values supplied to cosinor analysis, only first value used.") + } + # Apply Cosinor function from ActRC + N = 1440 * (60 / epochsize) # 
Number of epochs per day
+  Xi = Xi[1:(N * floor(length(Xi) / N))] # ActCR expects integer number of days
+
+  # omit all days at the end with no data
+  end2 = length(Xi)
+  while (end2 >= N) {
+    if (all(is.na(Xi[(end2 - N + 1):end2]))) {
+      end2 = end2 - N
+      Xi = Xi[1:end2]
+    } else {
+      end2 = -1
+    }
+  }
+  # transform data to milli-g if data is stored in g-units
+  notna = !is.na(Xi)
+  if (max(Xi, na.rm = TRUE) < 13 && mean(Xi, na.rm = TRUE) < 1) {
+    # 13 because a typical 8g accelerometer could in theory measure 7.5 in each axis without
+    # being considered clipping, which results in a vector of 13
+    # as soon as the time series has values above 13 then it is most likely that
+    # it is either expressed in counts or in mg.
+    Xi[notna] = Xi[notna] * 1000
+  }
+  # log transform data for ActCosinor, IV IS further down will use the non-transformed signal
+  Xi_log = Xi
+  Xi_log[notna] = log(Xi[notna] + 1)
+  coef = ActCR::ActCosinor(x = Xi_log, window = 1440 / N)
+
+  # Apply Extended Cosinor function from ActCR
+  coefext = ActCR::ActExtendCosinor(x = Xi_log, window = 1440 / N, export_ts = TRUE)
+  # Correct time estimates by offset in start of recording
+  add24ifneg = function(x) {
+    if (x < 0) x = x + 24
+    return(x)
+  }
+  coef$params$acrotime = add24ifneg(coef$params$acrotime - timeOffsetHours)
+  coefext$params$UpMesor = add24ifneg(coefext$params$UpMesor - timeOffsetHours)
+  coefext$params$DownMesor = add24ifneg(coefext$params$DownMesor - timeOffsetHours)
+  coefext$params$acrotime = add24ifneg(coefext$params$acrotime - timeOffsetHours)
+  # do same for acrophase in radians (24 hours: 2 * pi)
+  # take absolute value of acrophase, because it seems ActCR provides negative value in radians,
+  # which is inverse correlated with acrotime
+  coef$params$acr = abs(coef$params$acr) - ((timeOffsetHours / 24) * 2 * pi)
+  k = ceiling(abs(coef$params$acr) / (pi * 2))
+  if (coef$params$acr < 0) coef$params$acr = coef$params$acr + (k * 2 * pi)
+  # Perform IVIS on the same input signal to allow for direct comparison;
+  # note that IV and IS use the non-transformed signal
+  IVIS = g.IVIS(Xi = Xi,
+                epochSize = epochsize,
+                threshold = threshold)
+
+  coefext$params$R2 = cor(coefext$cosinor_ts$original, coefext$cosinor_ts$fittedYext)^2
+  coef$params$R2 = cor(coefext$cosinor_ts$original, coefext$cosinor_ts$fittedY)^2
+
+  # this should equal: https://en.wikipedia.org/wiki/Coefficient_of_determination
+
+  invisible(list(coef = coef, coefext = coefext, IVIS = IVIS))
+}
\ No newline at end of file
diff --git a/R/g.IVIS.R b/R/g.IVIS.R
index ab24fa610..9527febee 100644
--- a/R/g.IVIS.R
+++ b/R/g.IVIS.R
@@ -1,108 +1,45 @@
-g.IVIS = function(Xi, epochsizesecondsXi = 5, IVIS_epochsize_seconds=c(),
-                  IVIS_windowsize_minutes = 60, IVIS.activity.metric = 1,
-                  IVIS_acc_threshold = 20,
-                  IVIS_per_daypair = FALSE) {
-  if (length(IVIS_epochsize_seconds) > 0) {
-    warning("Argument IVIS_epochsize_seconds has been depricated and has been ignored.")
+g.IVIS = function(Xi, epochSize = 60, threshold = NULL) {
+  if (!is.null(threshold)) {
+    Xi = ifelse(Xi > threshold, 0, 1)
   }
-  IVIS_epochsize_seconds = IVIS_windowsize_minutes*60
-  if (IVIS.activity.metric == 2) { # use binary scoring
-    Xi = ifelse((Xi * 1000) < IVIS_acc_threshold, 0, 1) # explored on 5 June 2018 as an attempt to better mimic original ISIV construct
-    Xi = zoo::rollsum(x = Xi, k = (600/epochsizesecondsXi)) # explored on 5 June 2018 as an first attempt to better mimic original ISIV construct
+  if (epochSize < 3600) {
+    # Always
aggregate Xi to 1 hour resolution if it is not provided at 1 hour resolution + df = data.frame(Xi = Xi, time = numeric(length(Xi))) + time = seq(0, length(Xi) * epochSize, by = epochSize) + df$time = time[1:nrow(df)] + df = aggregate(x = df, by = list(floor(df$time / 3600) * 3600), + FUN = mean, na.action = na.pass, na.rm = TRUE) + Xi = df$Xi } - if (length(Xi) > (IVIS_epochsize_seconds/epochsizesecondsXi) & length(Xi) > (IVIS_windowsize_minutes * 60)/epochsizesecondsXi) { - if (IVIS_epochsize_seconds > epochsizesecondsXi) { # downsample Xi now from minute to hour (IVIS_windowsize) - # aggregation replaced 14 April 2022 to make it able to handle missing values - tmp = data.frame(Xi = Xi, time = numeric(length(Xi))) - time = seq(0, length(Xi) * epochsizesecondsXi, by = epochsizesecondsXi) - tmp$time = time[1:nrow(tmp)] - tmp = aggregate(x = tmp, by = list(floor(tmp$time / IVIS_epochsize_seconds) * IVIS_epochsize_seconds), - FUN = mean, na.action = na.pass) - Xi = tmp$Xi + # ts hourly time series, including NA values + hour = (1:ceiling(length(Xi))) - 1 + ts = data.frame(Xi = Xi, hour = hour, stringsAsFactors = TRUE) + IS = IV = phi = NA + if (nrow(ts) > 1) { + ts$day = floor(ts$hour/24) + 1 + ts$hour = ts$hour - (floor(ts$hour / 24) * 24) # 24 hour in a day + if (nrow(ts) > 1) { + # Average day + aveDay = aggregate(. ~ hour, data = ts, mean, na.action = na.omit) + Xh = aveDay$Xi + Xm = suppressWarnings(mean(Xh,na.rm = TRUE)) # Average acceleration per day + deltaXi = diff(Xi)^2 + N = length(Xi[!is.na(Xi)]) + + # phi + model = arima(Xi[!is.na(Xi)], order = c(1, 0, 0)) + phi = model$coef[[1]] + + # IS: lower is less synchronized with the 24 hour zeitgeber + ISnum = sum((Xh - Xm)^2, na.rm = TRUE) * N + ISdenom = 24 * sum((Xi - Xm)^2, na.rm = TRUE) + IS = ISnum / ISdenom + + #IV: higher is more variability within days (fragmentation) + IVnum = sum(deltaXi, na.rm = TRUE) * N + IVdenom = (N - 1) * sum((Xi - Xm)^2, na.rm = TRUE) + IV = IVnum / IVdenom } - nhr = 24 * round(60 / IVIS_windowsize_minutes) # Number of hours in a day (modify this variable if you want to study different resolutions) - Nsecondsinday = 24 * 3600 - ni = (Nsecondsinday/nhr)/IVIS_epochsize_seconds # number of epochs in an hour - # derive average day with 1 'hour' resolution (hour => windowsize): - N = length(Xi) # - hour = rep(1:ceiling(N / ni), each = ni) - 1 - if (length(hour) > N) hour = hour[1:N] - dat = data.frame(Xi = Xi, hour = hour, stringsAsFactors = TRUE) - InterdailyStability = NA - IntradailyVariability = NA - if (nrow(dat) > 1) { - hh = aggregate(. ~ hour, data = dat, mean, na.action = na.pass) - hh$hour_perday = hh$hour - (floor(hh$hour / nhr) * nhr) # 24 hour in a day - hh$day = floor(hh$hour/nhr) + 1 - if (nrow(hh) > 1) { - if (IVIS_per_daypair == FALSE) { - hh2 = aggregate(. 
~ hour_perday, data = hh, mean, na.action = na.pass) - Xh = hh2$Xi - Xm = suppressWarnings(mean(Xh,na.rm = TRUE)) # Average acceleration per day - p = length(which(is.na(Xh) == FALSE)) - N = length(Xi[which(is.na(Xi) == FALSE)]) - dat$day = floor(dat$hour/nhr) + 1 - S = aggregate(dat$Xi, by = list(dat$day), FUN = function(x) sum(diff(c(x))^2)) - InterdailyStability = (sum((Xh - Xm)^2, na.rm = TRUE) * N) / (p * sum((Xi - Xm)^2, na.rm = TRUE)) # IS: lower is less synchronized with the 24 hour zeitgeber - IntradailyVariability = (sum(S$x, na.rm = TRUE) * N) / ((N - 1) * sum((Xm - Xi)^2, na.rm = TRUE)) #IV: higher is more variability within days (fragmentation) - } else { - hh$hour_perday = hh$hour - (floor(hh$hour / nhr) * nhr) # 24 hour in a day - hh$day = floor(hh$hour/nhr) + 1 - # Calculate IV and IS per day and then aggregate over days - # weighted by number of valid hours per day, while ignoring missing values - NR = ceiling(nrow(hh) / 24) - uniquedays = 1:NR - IVIS_daypair_summary = data.frame(IS = numeric(NR - 1), - IV = numeric(NR - 1), - fracvalid = numeric(NR - 1), - daypair = character(NR - 1)) - for (i in 1:(NR - 1)) { # Loop over all days - thisday = which(hh$day == uniquedays[i]) - nextday = which(hh$day == uniquedays[i + 1]) - if (length(thisday) >= 23 & length(nextday) >= 23) { - if (length(thisday) == 25) { - thisday = thisday[1:24] - } else if (length(thisday) == 23) { - nextday = nextday[1:23] - } - if (length(nextday) == 25) { - nextday = nextday[1:24] - } else if (length(nextday) == 23) { - thisday = thisday[1:23] - } - hh.t = hh[c(thisday, nextday), ] - testNA = c(is.na(hh.t$Xi) | is.na(hh.t$Xi)) - Xi = hh.t$Xi - hh2 = aggregate(. ~ hour_perday, data = hh.t, mean, na.action = na.pass) - Xh = hh2$Xi - Xm = suppressWarnings(mean(Xh,na.rm = TRUE)) - p = length(which(is.na(Xh) == FALSE)) - N = length(Xi[which(is.na(Xi) == FALSE)]) - S = aggregate(hh.t$Xi, by = list(hh.t$day), FUN = function(x) sum(diff(c(x))^2, na.rm = TRUE) ) - IVIS_daypair_summary$IS[i] = (sum((Xh - Xm)^2, na.rm = TRUE) * N) / (p * sum((Xi - Xm)^2, na.rm = TRUE)) # IS: lower is less synchronized with the 24 hour zeitgeber - IVIS_daypair_summary$IV[i] = (sum(S$x, na.rm = TRUE) * N) / ((N - 1) * sum((Xm - Xi)^2, na.rm = TRUE)) #IV: higher is more variability within days (fragmentation) - IVIS_daypair_summary$fracvalid[i] = length(which(testNA == FALSE)) / length(testNA) - IVIS_daypair_summary$daypair[i] = paste0(i, "-", i + 1) - } - } - IVIS_daypair_summary$fracvalid = ifelse(IVIS_daypair_summary$fracvalid < 0.66, yes = 0, no = IVIS_daypair_summary$fracvalid) - if (length(which(IVIS_daypair_summary$fracvalid == 1)) > 0) { # expectation that at least 1 day pair is complete - IVIS_daypair_summary = IVIS_daypair_summary[which(IVIS_daypair_summary$fracvalid != 0), ] - InterdailyStability = mean(x = IVIS_daypair_summary$IS, w = IVIS_daypair_summary$fracvalid) - IntradailyVariability = mean(x = IVIS_daypair_summary$IV, w = IVIS_daypair_summary$fracvalid) - } else { - InterdailyStability = NA - IntradailyVariability = NA - } - } - } else { - InterdailyStability = NA - IntradailyVariability = NA - } - } - } else { - InterdailyStability = NA - IntradailyVariability = NA } - invisible(list(InterdailyStability = InterdailyStability, IntradailyVariability = IntradailyVariability)) + invisible(list(InterdailyStability = IS, IntradailyVariability = IV, phi = phi)) } diff --git a/R/g.analyse.R b/R/g.analyse.R index b5cdf2267..efd840998 100644 --- a/R/g.analyse.R +++ b/R/g.analyse.R @@ -207,7 +207,7 @@ g.analyse = 
function(I, C, M, IMP, params_247 = c(), params_phyact = c(), quantiletype = quantiletype, ws3 = ws3, doiglevels = doiglevels, firstmidnighti = firstmidnighti, ws2 = ws2, midnightsi = midnightsi, params_247 = params_247, qcheck = qcheck, - acc.metric = acc.metric) + acc.metric = acc.metric, params_phyact = params_phyact) cosinor_coef = output_avday$cosinor_coef #-------------------------------------------------------------- diff --git a/R/g.analyse.avday.R b/R/g.analyse.avday.R index b2874c6a1..e3536c2b9 100644 --- a/R/g.analyse.avday.R +++ b/R/g.analyse.avday.R @@ -1,6 +1,6 @@ g.analyse.avday = function(doquan, averageday, M, IMP, t_TWDI, quantiletype, ws3, doiglevels, firstmidnighti, ws2, midnightsi, params_247 = c(), - qcheck = c(), acc.metric = c(), ...) { + qcheck = c(), acc.metric = c(), params_phyact = NULL, ...) { #get input variables input = list(...) if (length(input) > 0 || length(params_247) == 0) { @@ -99,35 +99,21 @@ g.analyse.avday = function(doquan, averageday, M, IMP, t_TWDI, quantiletype, } } } - # IS and IV variables - fmn = midnightsi[1] * (ws2/ws3) # select data from first midnight to last midnight because we need full calendar days to compare - lmn = (midnightsi[length(midnightsi)] * (ws2/ws3)) - 1 - # By using the metashort from the IMP we do not need to ignore segments, because data imputed - Xi = IMP$metashort[fmn:lmn, acc.metric] - # IV IS - IVISout = g.IVIS(Xi, epochsizesecondsXi = ws3, - IVIS_epochsize_seconds = params_247[["IVIS_epochsize_seconds"]], - IVIS_windowsize_minutes = params_247[["IVIS_windowsize_minutes"]], - IVIS.activity.metric = params_247[["IVIS.activity.metric"]], - IVIS_acc_threshold = params_247[["IVIS_acc_threshold"]]) - InterdailyStability = IVISout$InterdailyStability - IntradailyVariability = IVISout$IntradailyVariability - rm(Xi) - #---------------------------------- - # (Extended) Cosinor analysis + # Cosinor analysis, (Extended) Cosinor analysis, including IV, IS, and phi + # based on time series where invalid data points are set to NA if (params_247[["cosinor"]] == TRUE) { - cosinor_coef = applyCosinorAnalyses(ts = IMP$metashort[, c("timestamp", acc.metric)], - qcheck = qcheck, - midnightsi, epochsizes = c(ws3, ws2)) + cosinor_coef = apply_cosinor_IS_IV_Analyses(ts = IMP$metashort[, c("timestamp", acc.metric)], + qcheck = qcheck, + midnightsi, epochsizes = c(ws3, ws2), + threshold = params_phyact[["threshold.lig"]][1]) # only use one threshold } else { cosinor_coef = c() } - invisible(list(InterdailyStability = InterdailyStability, - IntradailyVariability = IntradailyVariability, - igfullr_names = igfullr_names, + invisible(list(igfullr_names = igfullr_names, igfullr = igfullr, QUAN = QUAN, qlevels_names = qlevels_names, ML5AD = ML5AD, - ML5AD_names = ML5AD_names, cosinor_coef = cosinor_coef)) + ML5AD_names = ML5AD_names, + cosinor_coef = cosinor_coef)) } \ No newline at end of file diff --git a/R/g.analyse.perfile.R b/R/g.analyse.perfile.R index 687a736da..594b9b63a 100644 --- a/R/g.analyse.perfile.R +++ b/R/g.analyse.perfile.R @@ -160,14 +160,7 @@ g.analyse.perfile = function(I, C, metrics_nav, if (length(iNA) > 0) filesummary[(vi:(vi + 1))[iNA]] = 0 s_names[vi:(vi + 1)] = c("N valid WEdays","N valid WKdays") vi = vi + 2 - # Add ISIV to filesummary - filesummary[vi:(vi + 2)] = c(output_avday$InterdailyStability, output_avday$IntradailyVariability, - params_247[["IVIS_windowsize_minutes"]]) - iNA = which(is.na(filesummary[vi:(vi + 3)]) == TRUE) - if (length(iNA) > 0) filesummary[(vi:(vi + 3))[iNA]] = " " - s_names[vi:(vi + 2)] = 
c("IS_interdailystability", "IV_intradailyvariability", "IVIS_windowsize_minutes") - vi = vi + 4 - # Cosinor analysis + # Cosinor analysis + IV + IS + phi if (length(cosinor_coef) > 0) { filesummary[vi] = c(cosinor_coef$timeOffsetHours) s_names[vi] = c("cosinor_timeOffsetHours") @@ -197,12 +190,13 @@ g.analyse.perfile = function(I, C, metrics_nav, "cosinorExt_DownMesor", "cosinorExt_MESOR", "cosinorExt_ndays", "cosinorExt_F_pseudo", "cosinorExt_R2") vi = vi + 11 - filesummary[vi:(vi + 1)] = c(cosinor_coef$IVIS$InterdailyStability, - cosinor_coef$IVIS$IntradailyVariability) - s_names[vi:(vi + 1)] = c("cosinorIS", "cosinorIV") - vi = vi + 2 + filesummary[vi:(vi + 2)] = c(cosinor_coef$IVIS$InterdailyStability, + cosinor_coef$IVIS$IntradailyVariability, + cosinor_coef$IVIS$phi) + s_names[vi:(vi + 2)] = c("IS", "IV", "phi") + vi = vi + 3 } else { - vi = vi + 20 + vi = vi + 21 } # Variables per metric - summarise with stratification to weekdays and weekend days diff --git a/R/g.fragmentation.R b/R/g.fragmentation.R index c1b10caaf..72d6ff4cd 100644 --- a/R/g.fragmentation.R +++ b/R/g.fragmentation.R @@ -3,19 +3,12 @@ g.fragmentation = function(frag.metrics = c("mean", "TP", "Gini", "power", LEVELS = c(), Lnames=c(), xmin=1, mode = "day") { - - # This function is loosely inspired by R package ActFrag by Junrui Di. + # This function was originally loosely inspired by R package ActFrag by Junrui Di. # In contrast to R package ActFrag this function assumes # that non-wear and missing values have already been taken care of (imputed) - # outside this function. Further, the algorithms are not all exactly the same, - # and there are some additional metrics. - # This function is called from GGIR g.part5 function and applied per waking - # hours of a day (by default, alternative option spt, see mode). - # This avoids the issue of dealing with gaps between days, - # and allows us to test for behavioural differences between days of the week. - # It is well known that human behaviour can be driven by weekly rhythm. - # Knowing fragmentation per day of the week allows us to account for this - # variation. + # outside this function or set to NA. Further, the algorithms are not all exactly the same, + # and there are some additional metrics. For further dicussion see + # https://wadpac.github.io/GGIR/articles/chapter11_DescribingDataCutPoints.html # Further, I am avoiding the term sedentary because sedentary implies # that the activity type sitting was classified, which is generally # difficult to justify. @@ -40,10 +33,31 @@ g.fragmentation = function(frag.metrics = c("mean", "TP", "Gini", "power", frag$value[2:Nsegments] %in% a) Nab = length(ab) Nba = length(ba) - totDur_a = sum(frag$length[which(frag$value %in% a)]) - ifelse(frag$value[Nsegments] %in% a,1,0) - totDur_b = sum(frag$length[which(frag$value %in% b)]) - ifelse(frag$value[Nsegments] %in% b,1,0) - TPab = (Nab + 1e-6 ) / (totDur_a + 1e-6) - TPba = (Nba + 1e-6 ) / (totDur_b + 1e-6) + + # In GGIR part 6 we set day or spt to NA, by which the series is split in parts + # separate by strings of NA values. In GGIR part 5 this would not happen + # and we only need the last item of the time series for that day. However, + # if last value is NA then ignore it. 
+    lastitems = which(is.na(frag$value[1:(Nsegments - 1)]) == FALSE &
+                        is.na(frag$value[2:Nsegments]) == TRUE)
+    if (!is.na(frag$value[Nsegments])) {
+      lastitems = unique(c(lastitems, Nsegments))
+    }
+    tmp = frag$value[-lastitems]
+    # Next use this to detect how often the last value of each string is the reference class (a or b)
+    if (length(lastitems) > 0 & length(which(is.na(tmp) == FALSE)) > 1) {
+      count_a_ending = sum(ifelse(frag$value[lastitems] %in% a, 1, 0))
+      count_b_ending = sum(ifelse(frag$value[lastitems] %in% b, 1, 0))
+    } else {
+      count_a_ending = 0
+      count_b_ending = 0
+    }
+    # Total duration from a to b
+    totDur_ab = sum(frag$length[which(frag$value %in% a)]) - count_a_ending
+    totDur_ba = sum(frag$length[which(frag$value %in% b)]) - count_b_ending
+    epsilon = 1e-6
+    TPab = (Nab + epsilon) / (totDur_ab + epsilon)
+    TPba = (Nba + epsilon) / (totDur_ba + epsilon)
     # Round to 6 digits because preceding step introduces bias at 7 decimal places
     TPab = round(TPab, digits = 6)
     TPba = round(TPba, digits = 6)
@@ -77,13 +91,11 @@ g.fragmentation = function(frag.metrics = c("mean", "TP", "Gini", "power",
                 totDur_ba = totDur_ba))
  }
-
   if ("all" %in% frag.metrics) {
     frag.metrics = c("mean", "TP", "Gini", "power",
                      "CoV", "NFragPM", "all")
   }
   output = list()
-
   Nepochs = length(LEVELS)
   if (mode == "day") {
     # convert to class names to numeric class ids for inactive, LIPA and MVPA:
@@ -96,11 +108,11 @@
   } else {
     do.frag = FALSE
   }
-
   if (Nepochs > 1 & mode == "day") { # metrics that require more than just binary
     #====================================================
     # Convert LEVELS in three classes: Inactivity (1), Light = LIPA (2), and MVPA (3)
     y = rep(0, Nepochs)
+    is.na(y[is.na(LEVELS)]) = TRUE
     y[which(LEVELS %in% class.in.ids)] = 1
     y[which(LEVELS %in% class.lig.ids)] = 2
     y[which(LEVELS %in% class.mvpa.ids)] = 3
@@ -120,29 +132,26 @@
       out = TransProb(y, a = 1, b = 3) #IN to MVPA
       output[["TP_IN2MVPA"]] = out$TPab
       output[["Nfrag_IN2MVPA"]] = out$Nab
-
+
       out = TransProb(y, a = 2, b = c(1, 3)) #LIPA <-> rest
       output[["Nfrag_LIPA"]] = out$Nab
       output[["mean_dur_LIPA"]] = out$totDur_ab / out$Nab
-
+
       out = TransProb(y, a = 3, b = c(1, 2)) #MVPA <-> rest
       output[["Nfrag_MVPA"]] = out$Nab
       output[["mean_dur_MVPA"]] = out$totDur_ab / out$Nab
     }
   }
-
   #====================================================
   # Binary fragmentation for the metrics that do not depend on multiple classes
   if (mode == "day") {
     x = rep(0,Nepochs)
+    is.na(x[is.na(LEVELS)]) = TRUE
     x[which(LEVELS %in% class.in.ids)] = 1 # inactivity becomes 1 because this is behaviour of interest
     x = as.integer(x)
-
     frag2levels = rle(x)
-    Nfrag2levels = length(frag2levels$lengths)
-
+    Nfrag2levels = length(which(is.na(frag2levels$values) == FALSE))
     out = TransProb(x, a = 1, b = 0) #IN <-> PA
     output[["Nfrag_PA"]] = out$Nba
     output[["Nfrag_IN"]] = out$Nab
@@ -178,7 +187,6 @@ g.fragmentation = function(frag.metrics = c("mean", "TP", "Gini", "power",
       # Identical to the metric named fragmentation by Chastin,
       # but renamed to Number of Fragments Per Minute to be
       # a better reflection of the calculation
-
       output[["NFragPM_PA"]] = output[["Nfrag_PA"]] / out$totDur_ba
       output[["NFragPM_IN"]] = output[["Nfrag_IN"]] / out$totDur_ab
     }
@@ -188,8 +196,8 @@
       if (Nfrag2levels >= 10) {
         DurationIN = frag2levels$length[which(frag2levels$value == 1)]
         DurationPA =
frag2levels$length[which(frag2levels$value == 0)] - SD0 = sd(DurationPA) - SD1 = sd(DurationIN) + SD0 = sd(DurationPA, na.rm = TRUE) + SD1 = sd(DurationIN, na.rm = TRUE) output[["SD_dur_PA"]] = SD0 # maybe not a fragmentation metric, but helpful to understand other metrics output[["SD_dur_IN"]] = SD1 if ("Gini" %in% frag.metrics) { @@ -219,11 +227,11 @@ g.fragmentation = function(frag.metrics = c("mean", "TP", "Gini", "power", } } } - } else if (mode == "spt") { # Binary fragmentation metrics for spt: # Active - Rest transitions during SPT: x = rep(0, Nepochs) + is.na(x[is.na(LEVELS)]) = TRUE # convert to class names to numeric class ids for inactive, LIPA and MVPA: classes.pa = c("spt_wake_LIG", "spt_wake_MOD", "spt_wake_VIG") class.pa = which(Lnames %in% classes.pa) - 1 @@ -238,6 +246,7 @@ g.fragmentation = function(frag.metrics = c("mean", "TP", "Gini", "power", output[["TP_PA2IN"]] = out$TPba # Wake - Sleep transitions during SPT: x = rep(0, Nepochs) + is.na(x[is.na(LEVELS)]) = TRUE classes.wake = c("spt_wake_IN", "spt_wake_LIG", "spt_wake_MOD", "spt_wake_VIG") class.wake = which(Lnames %in% classes.wake) - 1 wakei = which(LEVELS %in% class.wake) diff --git a/R/g.impute.R b/R/g.impute.R index 560924f84..ec4073134 100644 --- a/R/g.impute.R +++ b/R/g.impute.R @@ -354,16 +354,18 @@ g.impute = function(M, I, params_cleaning = c(), desiredtz = "", for (mi in 2:ncol(metashort)) {# generate 'average' day for each variable # The average day is used for imputation and defined relative to the starttime of the measurement # irrespective of dayborder as used in other parts of GGIR - metrimp = metr = as.numeric(as.matrix(metashort[, mi])) + metr = as.numeric(as.matrix(metashort[, mi])) is.na(metr[which(r5long != 0)]) = T #turn all values of metr to na if r5long is different to 0 (it now leaves the expanded time with expand_tail_max out of the averageday calculation) imp = matrix(NA,wpd,ceiling(length(metr)/wpd)) #matrix used for imputation of seconds ndays = ncol(imp) #number of days (rounded upwards) nvalidsec = matrix(0,wpd,1) dcomplscore = length(which(r5 == 0)) / length(r5) if (ndays > 1 ) { # only do imputation if there is more than 1 day of data #& length(which(r5 == 1)) > 1 + # all days except last one for (j in 1:(ndays - 1)) { imp[,j] = as.numeric(metr[(((j - 1)*wpd) + 1):(j*wpd)]) } + # last day lastday = metr[(((ndays - 1)*wpd) + 1):length(metr)] imp[1:length(lastday),ndays] = as.numeric(lastday) imp3 = rowMeans(imp, na.rm = TRUE) @@ -384,8 +386,9 @@ g.impute = function(M, I, params_cleaning = c(), desiredtz = "", imp[missing,j] = imp3[missing] } } + # imp is now the imputed time series dim(imp) = c(length(imp),1) - # imp = imp[-c(which(is.na(as.numeric(as.character(imp))) == T))] + # but do not use imp for expanded time toimpute = which(r5long != -1) # do not impute the expanded time with expand_tail_max_hours metashort[toimpute, mi] = as.numeric(imp[toimpute]) #to cut off the latter part of the last day used as a dummy data } else { diff --git a/R/g.part2.R b/R/g.part2.R index c60b693fb..96c687a39 100644 --- a/R/g.part2.R +++ b/R/g.part2.R @@ -302,7 +302,7 @@ g.part2 = function(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), "g.extractheadervars", "g.analyse.avday", "g.getM5L5", "g.IVIS", "g.analyse.perday", "g.getbout", "g.analyse.perfile", "g.intensitygradient", "iso8601chartime2POSIX", "extract_params", "load_params", "check_params", - "correctOlderMilestoneData", "cosinorAnalyses", "extractID") + "correctOlderMilestoneData", "cosinor_IS_IV_Analyses", "extractID") errhand = 
'stop' } i = 0 # declare i because foreach uses it, without declaring it diff --git a/R/g.part4.R b/R/g.part4.R index 24fcbbfe8..00907136a 100644 --- a/R/g.part4.R +++ b/R/g.part4.R @@ -363,7 +363,7 @@ g.part4 = function(datadir = c(), metadatadir = c(), f0 = f0, f1 = f1, acc_available = TRUE #default assumption # initialize dataframe to hold sleep period overview: spocum = data.frame(nb = numeric(0), start = numeric(0), end = numeric(0), - dur = numeric(0), def = character(0)) + overlapGuider = numeric(0), def = character(0)) spocumi = 1 # counter for sleep periods # continue now with the specific data of the night diff --git a/R/g.part5.R b/R/g.part5.R index 8e1dc561b..bd7e7a560 100644 --- a/R/g.part5.R +++ b/R/g.part5.R @@ -108,6 +108,7 @@ g.part5 = function(datadir = c(), metadatadir = c(), f0=c(), f1=c(), extractfilenames, referencefnames, folderstructure, fullfilenames, foldername, ffdone, verbose) { tail_expansion_log = NULL + filename_dir = NULL # to be loaded fnames.ms1 = dir(paste(metadatadir, "/meta/basic", sep = "")) fnames.ms2 = dir(paste(metadatadir, "/meta/ms2.out", sep = "")) fnames.ms4 = dir(paste(metadatadir, "/meta/ms4.out", sep = "")) @@ -600,11 +601,12 @@ g.part5 = function(datadir = c(), metadatadir = c(), f0=c(), f1=c(), rawlevels_fname = rawlevels_fname, DaCleanFile = DaCleanFile, includedaycrit.part5 = params_cleaning[["includedaycrit.part5"]], + includenightcrit.part5 = params_cleaning[["includenightcrit.part5"]], ID = ID, params_output = params_output, params_247 = params_247, - filename = filename, - timewindow = timewindowi) + Lnames = Lnames, timewindow = timewindowi, + filename = filename) } } } diff --git a/R/g.part5.savetimeseries.R b/R/g.part5.savetimeseries.R index dd1e5182c..33abd56a6 100644 --- a/R/g.part5.savetimeseries.R +++ b/R/g.part5.savetimeseries.R @@ -1,10 +1,12 @@ g.part5.savetimeseries = function(ts, LEVELS, desiredtz, rawlevels_fname, DaCleanFile = NULL, - includedaycrit.part5 = 2/3, ID = NULL, + includedaycrit.part5 = 2/3, + includenightcrit.part5 = 0, + ID = NULL, params_output, params_247 = NULL, - filename = "", - timewindow = NULL) { + Lnames = NULL, timewindow = NULL, + filename = "") { ms5rawlevels = data.frame(date_time = ts$time, class_id = LEVELS, # class_name = rep("",Nts), @@ -51,24 +53,23 @@ g.part5.savetimeseries = function(ts, LEVELS, desiredtz, rawlevels_fname, # Add invalid day indicator mdat$invalid_wakinghours = mdat$invalid_sleepperiod = mdat$invalid_fullwindow = 100 - wakeup = which(diff(c(mdat$SleepPeriodTime,0)) == -1) + 1 # first epoch of each day - if (length(wakeup) > 0) { - if (length(wakeup) > 1) { - for (di in 1:(length(wakeup) - 1)) { - dayindices = wakeup[di]:(wakeup[di + 1] - 1) - wake = which(mdat$SleepPeriodTime[dayindices] == 0) - sleep = which(mdat$SleepPeriodTime[dayindices] == 1) - mdat$invalid_wakinghours[dayindices] = round(mean(mdat$invalidepoch[dayindices[wake]]) * 100, digits = 2) - mdat$invalid_sleepperiod[dayindices] = round(mean(mdat$invalidepoch[dayindices[sleep]]) * 100, digits = 2) - mdat$invalid_fullwindow[dayindices] = round(mean(mdat$invalidepoch[dayindices]) * 100, digits = 2) + window_starts = which(abs(diff(c(0, mdat$window, 0))) > 0) # first epoch of each window + if (length(window_starts) > 0) { + if (length(window_starts) > 1) { + for (di in 1:(length(window_starts) - 1)) { + window_indices = window_starts[di]:pmin((window_starts[di + 1] - 1), nrow(mdat)) + wake = which(mdat$SleepPeriodTime[window_indices] == 0) + sleep = which(mdat$SleepPeriodTime[window_indices] == 1) + 
mdat$invalid_wakinghours[window_indices] = round(mean(mdat$invalidepoch[window_indices[wake]]) * 100, digits = 2)
+          mdat$invalid_sleepperiod[window_indices] = round(mean(mdat$invalidepoch[window_indices[sleep]]) * 100, digits = 2)
+          mdat$invalid_fullwindow[window_indices] = round(mean(mdat$invalidepoch[window_indices]) * 100, digits = 2)
         }
       } else {
-        dayindices = 1:nrow(mdat)
-        wake = which(mdat$SleepPeriodTime[dayindices] == 0)
-        sleep = which(mdat$SleepPeriodTime[dayindices] == 1)
-        mdat$invalid_wakinghours[dayindices] = round(mean(mdat$invalidepoch[dayindices[wake]]) * 100, digits = 2)
-        mdat$invalid_sleepperiod[dayindices] = round(mean(mdat$invalidepoch[dayindices[sleep]]) * 100, digits = 2)
-        mdat$invalid_fullwindow[dayindices] = round(mean(mdat$invalidepoch[dayindices]) * 100, digits = 2)
+        wake = which(mdat$SleepPeriodTime == 0)
+        sleep = which(mdat$SleepPeriodTime == 1)
+        mdat$invalid_wakinghours = round(mean(mdat$invalidepoch[wake]) * 100, digits = 2)
+        mdat$invalid_sleepperiod = round(mean(mdat$invalidepoch[sleep]) * 100, digits = 2)
+        mdat$invalid_fullwindow = round(mean(mdat$invalidepoch) * 100, digits = 2)
       }
       # round acceleration values to 3 digits to reduce storage space
       mdat$ACC = round(mdat$ACC, digits = 3)
@@ -91,9 +92,19 @@ g.part5.savetimeseries = function(ts, LEVELS, desiredtz, rawlevels_fname,
       } else if (includedaycrit.part5 > 1 & includedaycrit.part5 <= 25) { # if includedaycrit.part5 is used like includedaycrit as a number of hours
         includedaycrit.part5 = (includedaycrit.part5 / 24) * 100
       }
+      # Reformat includenightcrit.part5 to maximum percentage non-wear during the sleep period:
+      if (includenightcrit.part5 >= 0 & includenightcrit.part5 <= 1) { # if includenightcrit.part5 is used as a ratio
+        includenightcrit.part5 = includenightcrit.part5 * 100
+      } else if (includenightcrit.part5 > 1 & includenightcrit.part5 <= 25) { # if includenightcrit.part5 is used like includedaycrit as a number of hours
+        includenightcrit.part5 = (includenightcrit.part5 / 24) * 100
+      }
      maxpernwday = 100 - includedaycrit.part5
+      maxpernwnight = 100 - includenightcrit.part5
      # Exclude windows with 100% nonwear over the full window, or with too much
      # nonwear during waking hours or during the sleep period
-      cut = which(mdat$invalid_fullwindow == 100 | mdat$invalid_wakinghours > maxpernwday)
+      cut = which(mdat$invalid_fullwindow == 100 |
+                    mdat$invalid_wakinghours > maxpernwday |
+                    mdat$invalid_sleepperiod > maxpernwnight |
+                    mdat$window == 0)
       if (length(cut) > 0) mdat = mdat[-cut,] # remove days from which we already know that they are not going to be included (first and last day)
     }
     mdat$guider = ifelse(mdat$guider == 'sleeplog', yes = 1, # digitize guider to save storage space
@@ -118,7 +129,7 @@ g.part5.savetimeseries = function(ts, LEVELS, desiredtz, rawlevels_fname,
       mdat$timestamp = as.POSIXct(mdat$timenum, origin = "1970-01-01",tz = desiredtz)
       rawlevels_fname = gsub(pattern = ".csv", replacement = ".RData", x = rawlevels_fname)
       fname = unique(rawlevels_fname[grep("*RData$", rawlevels_fname)])
-      save(mdat, filename, file = fname)
+      save(mdat, filename, Lnames, file = fname)
     }
     #===============================
     rm(mdat)
diff --git a/R/g.part5_analyseSegment.R b/R/g.part5_analyseSegment.R
index f806da8a2..ae45fc51b 100644
--- a/R/g.part5_analyseSegment.R
+++ b/R/g.part5_analyseSegment.R
@@ -234,101 +234,6 @@ g.part5_analyseSegment = function(indexlog, timeList, levelList,
         dsummary[si, fi] = quantile(ts$ACC[sse],probs = ((WLH - 0.5)/WLH), na.rm = TRUE)
         ds_names[fi] = paste("quantile_mostactive30min_mg", sep = ""); fi = fi + 1
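+        # Note: the L5/M5 (LXMX) analyses that used to follow here were moved to
+        # part 6, where they are grouped with the other circadian rhythm analyses.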
#=============================================== - # L5 M5, L10 M10... - for (wini in params_247[["winhr"]]) { - reso = params_247[["M5L5res"]] #resolution at 5 minutes - endd = floor(WLH * 10) / 10 # rounding needed for non-integer window lengths - nwindow_f = (endd - wini) #number of windows for L5M5 analyses - ignore = FALSE - if (endd <= wini | nwindow_f < 1) ignore = TRUE # day is shorter then time window, so ignore this - nwindow_f = nwindow_f * (60/reso) - if (ignore == FALSE) { - # Calculate running window variables - ACCrunwin = matrix(0, nwindow_f, 1) - TIMErunwin = matrix("", nwindow_f, 1) - for (hri in 0:floor((((endd - wini) * (60/reso)) - 1))) { - e1 = (hri * reso * (60/ws3new)) + 1 - e2 = (hri + (wini * (60/reso))) * reso * (60/ws3new) - if (e2 > length(sse)) e2 = length(sse) - ACCrunwin[(hri + 1), 1] = mean(ts$ACC[sse[e1:e2]]) - TIMErunwin[(hri + 1), 1] = format(ts$time[sse[e1]]) - } - ACCrunwin = ACCrunwin[is.na(ACCrunwin) == F] - TIMErunwin = TIMErunwin[is.na(ACCrunwin) == F] - if (length(ACCrunwin) > 0 & length(TIMErunwin) > 0) { - # Derive day level variables - L5HOUR = TIMErunwin[which(ACCrunwin == min(ACCrunwin))[1]] - L5VALUE = min(ACCrunwin) - M5HOUR = TIMErunwin[which(ACCrunwin == max(ACCrunwin))[1]] - M5VALUE = max(ACCrunwin) - if (lightpeak_available == TRUE) { - if (length(unlist(strsplit(M5HOUR, " |T"))) == 1) M5HOUR = paste0(M5HOUR, " 00:00:00") - startM5 = which(format(ts$time) == M5HOUR) - M5_mean_peakLUX = round(mean(ts$lightpeak[startM5[1]:(startM5[1] + (wini*60*(60/ws3new)))], na.rm = TRUE), digits = 1) - M5_max_peakLUX = round(max(ts$lightpeak[startM5[1]:(startM5[1] + (wini*60*(60/ws3new)))], na.rm = TRUE), digits = 1) - } - } else { - L5HOUR = M5HOUR = "not detected" - L5VALUE = M5VALUE = "" - if (lightpeak_available == TRUE) { - M5_mean_peakLUX = M5_max_peakLUX = "" - } - } - } - # Add variables calculated above to the output matrix - if (ignore == FALSE) { - dsummary[si, fi:(fi + 3)] = c(L5HOUR, L5VALUE, M5HOUR, M5VALUE) - } - ds_names[fi:(fi + 3)] = c(paste0("L", wini, "TIME"), - paste0("L", wini, "VALUE"), - paste0("M", wini, "TIME"), - paste0("M", wini, "VALUE")) - fi = fi + 4 - if ("lightpeak" %in% colnames(ts)) { - if (ignore == FALSE) { - dsummary[si,fi] = M5_mean_peakLUX - dsummary[si,fi + 1] = M5_max_peakLUX - } - ds_names[fi] = paste("M", wini, "_mean_peakLUX", sep = "") - ds_names[fi + 1] = paste("M", wini,"_max_peakLUX", sep = "") - fi = fi + 2 - } - if (ignore == FALSE) { - # Add also numeric time - if (is.ISO8601(L5HOUR)) { # only do this for ISO8601 format - L5HOUR = format(iso8601chartime2POSIX(L5HOUR, tz = params_general[["desiredtz"]])) - M5HOUR = format(iso8601chartime2POSIX(M5HOUR, tz = params_general[["desiredtz"]])) - } - if (length(unlist(strsplit(L5HOUR," "))) == 1) L5HOUR = paste0(L5HOUR," 00:00:00") #added because on some OS timestamps are deleted for midnight - if (length(unlist(strsplit(M5HOUR," "))) == 1) M5HOUR = paste0(M5HOUR," 00:00:00") - if (L5HOUR != "not detected") { - time_num = sum(as.numeric(unlist(strsplit(unlist(strsplit(L5HOUR," "))[2], ":"))) * c(3600, 60, 1)) / 3600 - if (!is.null(dayofinterest) && length(sumSleep$daysleeper[dayofinterest]) == 1) { - daysleeper_value = sumSleep$daysleeper[dayofinterest] - } else { - daysleeper_value = 0 - } - if ((time_num < 12 & daysleeper_value == 0) | - (time_num < 18 & daysleeper_value == 1)) { - time_num = time_num + 24 - } - dsummary[si,fi] = time_num - } else { - dsummary[si,fi] = NA - } - } - ds_names[fi] = paste("L", wini, "TIME_num", sep = ""); fi = fi + 1 - if 
(ignore == FALSE) { - if (M5HOUR != "not detected") { - time_num = sum(as.numeric(unlist(strsplit(unlist(strsplit(M5HOUR," "))[2], ":"))) * c(3600, 60, 1)) / 3600 - dsummary[si, fi] = time_num - } else { - dsummary[si, fi] = NA - } - } - ds_names[fi] = paste("M", wini, "TIME_num", sep = ""); fi = fi + 1 - } - #=============================================== NANS = which(is.nan(dsummary[si,]) == TRUE) #average of no values will results in NaN if (length(NANS) > 0) dsummary[si,NANS] = "" #=============================================== @@ -424,15 +329,16 @@ g.part5_analyseSegment = function(indexlog, timeList, levelList, #=============================================== # FRAGMENTATION if (length(params_phyact[["frag.metrics"]]) > 0) { - for (fragmode in c("day", "spt")) { - frag.out = g.fragmentation(frag.metrics = params_phyact[["frag.metrics"]], - LEVELS = LEVELS[sse[ts$diur[sse] == ifelse(fragmode == "day", 0, 1)]], - Lnames = Lnames, xmin = 60/ws3new, mode = fragmode) - # fragmentation values can come with a lot of decimal places - dsummary[si, fi:(fi + (length(frag.out) - 1))] = round(as.numeric(frag.out), digits = 6) - ds_names[fi:(fi + (length(frag.out) - 1))] = paste0("FRAG_", names(frag.out), "_", fragmode) - fi = fi + length(frag.out) - } + # Only do daytime fragmentation and not spt window because in part 5 the spt window does + # not necessarily include valid data. spt window fragmentation will be performed in part 6 + fragmode = "day" + frag.out = g.fragmentation(frag.metrics = params_phyact[["frag.metrics"]], + LEVELS = LEVELS[sse[ts$diur[sse] == ifelse(fragmode == "day", 0, 1)]], + Lnames = Lnames, xmin = 60/ws3new, mode = fragmode) + # fragmentation values can come with a lot of decimal places + dsummary[si, fi:(fi + (length(frag.out) - 1))] = round(as.numeric(frag.out), digits = 6) + ds_names[fi:(fi + (length(frag.out) - 1))] = paste0("FRAG_", names(frag.out), "_", fragmode) + fi = fi + length(frag.out) } #=============================================== # LIGHT, IF AVAILABLE diff --git a/R/g.part6.R b/R/g.part6.R index 9f3fa1d69..3023a2c06 100644 --- a/R/g.part6.R +++ b/R/g.part6.R @@ -1,5 +1,6 @@ g.part6 = function(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), params_general = c(), params_phyact = c(), params_247 = c(), + params_cleaning = c(), verbose = TRUE, ...) 
{ # This function is called by function GGIR
@@ -106,26 +107,79 @@
           skip = 0
         }
         if (params_general[["overwrite"]] == TRUE) skip = 0
-
+        Lnames = cosinor_coef = mdat = NULL
         if (skip == 0) {
           # Load time series:
           if (EXT == "RData") {
             load(file = paste0(metadatadir, "/meta/ms5.outraw/", params_phyact[["part6_threshold_combi"]],
                                "/", fnames.ms5raw[i]))
+            if (is.null(Lnames)) stop("Part 5 was processed with an older version of GGIR, reprocess part 5")
             mdat$time = mdat$timestamp # duplicate column because the cosinor function expects column time
           } else {
+            # use the behavioural codes file to derive the Lnames object (names of behavioural classes)
+            legendfile = dir(path = paste0(metadatadir, "/meta/ms5.outraw"), pattern = "behavioralcodes", full.names = TRUE)
+            if (length(legendfile) > 1) {
+              df <- file.info(legendfile)
+              legendfile = rownames(df)[which.max(df$mtime)]
+            } else if (length(legendfile) == 0) {
+              stop(paste0("behavioralcodes file could not be found in ",
+                          paste0(metadatadir, "/meta/ms5.outraw")))
+            }
+            Lnames = data.table::fread(file = legendfile)$class_name
             mdat = data.table::fread(file = paste0(metadatadir, "/meta/ms5.outraw/", params_phyact[["part6_threshold_combi"]],
                                                    "/", fnames.ms5raw[i]), data.table = FALSE)
             filename = fnames.ms5raw[i]
           }
-          nfeatures = 50
+          nfeatures = 200
           summary = matrix(NA, nfeatures, 1)
           s_names = rep("", nfeatures)
           fi = 1
-          epochSize = diff(mdat$timenum[1:2])
+          deltatime = diff(mdat$timenum)
+          epochSize = min(deltatime[which(deltatime != 0)])
+          if (any(deltatime > epochSize)) {
+            imputeTimeGaps = function(mdat, epochSize) {
+              # impute timegaps
+              # gap holds, per row, the number of epochs until the next row
+              mdat$gap = 1
+              mdat$gap[1:(nrow(mdat) - 1)] = diff(mdat$timenum) / epochSize
+              gapp = which(mdat$gap != 1)
+              if (length(gapp) > 0) {
+                if (gapp[1] > 1) {
+                  newTime = mdat$timenum[1:(gapp[1] - 1)]
+                } else {
+                  newTime = NULL
+                }
+                for (g in 1:length(gapp)) {
+                  newTime = c(newTime, mdat$timenum[gapp[g]] + seq(0, by = epochSize, length.out = mdat$gap[gapp[g]]))
+                  if (g < length(gapp)) {
+                    newTime = c(newTime, mdat$timenum[(gapp[g] + 1):(gapp[g + 1] - 1)])
+                  }
+                }
+                newTime = c(newTime, mdat$timenum[(gapp[g] + 1):length(mdat$timenum)])
+              }
+              # repeat each row gap times to restore a continuous time grid;
+              # the repeated rows are duplicates and are marked as invalid below
+              mdat <- as.data.frame(lapply(mdat, rep, mdat$gap))
+              invalid = duplicated(mdat)
+              if (length(gapp) > 0) {
+                mdat$timenum = newTime[1:nrow(mdat)]
+              }
+              mdat = mdat[, which(colnames(mdat) != "gap")]
+              mdat$ACC[invalid] = NA
+              mdat$class_id[invalid] = NA
+              mdat$guider[invalid] = NA
+              mdat$invalidepoch[invalid] = 1
+              mdat$window[invalid] = 9999
+              mdat$invalid_sleepperiod[invalid] = 100
+              mdat$invalid_wakinghours[invalid] = 100
+              mdat$time = mdat$timestamp = as.POSIXct(mdat$timenum, tz = params_general[["desiredtz"]])
+              return(mdat)
+            }
+            mdat = imputeTimeGaps(mdat, epochSize)
+          }
           # Select relevant section of the time series
-          wakeuptimes = which(diff(c(1, mdat$SleepPeriodTime)) == -1)
-          onsettimes = which(diff(c(1, mdat$SleepPeriodTime)) == 1)
+          wakeuptimes = which(diff(c(1, mdat$SleepPeriodTime, 0)) == -1)
+          onsettimes = which(diff(c(0, mdat$SleepPeriodTime, 1)) == 1)
+          Nmdat = nrow(mdat)
+          if (wakeuptimes[length(wakeuptimes)] > Nmdat) wakeuptimes[length(wakeuptimes)] = Nmdat
+          if (onsettimes[length(onsettimes)] > Nmdat) onsettimes[length(onsettimes)] = Nmdat
           getIndex = function(x, ts, wakeuptimes, onsettimes, epochSize) {
             if (x == "start") {
               y = 1 # simply the start of the time series
@@ -136,14 +190,14 @@ g.part6 = function(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(),
               if (ni > 0) { # nth Wakeup after the start
                 y =
wakeuptimes[ni] } else if (ni < 0) { # nth Wakeup before the end - y = wakeuptimes[length(wakeuptimes) + ni] # + because it already is - + y = wakeuptimes[length(wakeuptimes) + ni + 1] # + because it already is - } } else if (substr(x, start = 1, stop = 1) == "O") { ni = as.numeric(substr(x, start = 2, stop = 5)) if (ni > 0) { # nth Onset after the start y = onsettimes[ni] } else if (ni < 0) { # nth Onset before the end - y = onsettimes[length(wakeuptimes) + ni] # + because it already is - + y = onsettimes[length(onsettimes) + ni + 1] # + because it already is - } } else if (substr(x, start = 1, stop = 1) == "H") { ni = as.numeric(substr(x, start = 2, stop = 5)) @@ -166,21 +220,20 @@ g.part6 = function(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), if (do.cr == TRUE) { ts = mdat[t0:t1, ] rm(mdat) - # extract nightsi again - tempp = unclass(as.POSIXlt(ts$time, tz = params_general[["desiredtz"]])) - sec = tempp$sec - min = tempp$min - hour = tempp$hour - if (params_general[["dayborder"]] == 0) { - nightsi = which(sec == 0 & min == 0 & hour == 0) - } else { - nightsi = which(sec == 0 & min == (params_general[["dayborder"]] - floor(params_general[["dayborder"]])) * 60 & hour == floor(params_general[["dayborder"]])) #shift the definition of midnight if required - } ts = ts[which(ts$window != 0), ] } else { ts = mdat[1,] } + # Set windows to missing that do not have enough valid data in either spt or day. + # Note that columns invalid_wakinghours and invalid_sleepperiod here + # are constants per window; we use this to identify the entire window as valid/invalid. + invalidWindows = which(ts$invalid_wakinghours > (1 - params_cleaning[["includecrit.part6"]][1]) * 100 | + ts$invalid_sleepperiod > (1 - params_cleaning[["includecrit.part6"]][2]) * 100) + if (length(invalidWindows) > 0) { + ts[invalidWindows,c("class_id", "ACC")] = NA + ts$invalidepoch[invalidWindows] = 1 + } # Include basic information in the summary summary[fi] = unlist(strsplit(fnames.ms5raw[i], "_"))[1] s_names[fi] = "ID" @@ -197,30 +250,64 @@ g.part6 = function(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), no = nrow(ts) / ((3600 * 24) / epochSize)) s_names[fi] = "N_days" fi = fi + 1 - summary[fi] = ifelse(test = nrow(ts) == 1, - yes = 0, - no = length(which(ts$invalidepoch == 0)) / ((3600 * 24) / epochSize)) + N_valid_days = ifelse(test = nrow(ts) == 1, + yes = 0, + no = length(which(ts$invalidepoch == 0)) / ((3600 * 24) / epochSize)) + summary[fi] = N_valid_days s_names[fi] = "N_valid_days" fi = fi + 1 - - # Cosinor analysis - if (do.cr == TRUE) { + CountWeekDays = table(weekdays(ts$time[which(ts$invalidepoch == 0)])) + CountWeekDays = CountWeekDays / (1440 * (60 / epochSize)) + CountWeekDays = CountWeekDays[which(CountWeekDays > 0.66)] + if (length(CountWeekDays) > 0) { + Nweekendays = length(which(names(CountWeekDays) %in% c("Saturday", "Sunday"))) + } else { + Nweekendays = 0 + } + summary[fi] = Nweekendays + s_names[fi] = "N_valid_weekend_days" + fi = fi + 1 + #================================================================= + # Circadian rhythm analysis + if (do.cr == TRUE & N_valid_days > 2) { + #=============================================== + # Cosinor analysis, which comes with IV and IS estimates + # Note: apply_cosinor_IS_IV_Analyses below uses column invalidepoch to turn + # imputed values to NA.
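As an aside on the window-level inclusion rule applied above: a minimal sketch, assuming the default includecrit.part6 = c(2/3, 2/3); the percentages below are hypothetical and only illustrate the arithmetic. A window is set to missing when more than (1 - criterion) * 100 percent of its epochs are invalid in either the waking hours or the sleep period.

# Hypothetical illustration of the includecrit.part6 rule (not part of the patch):
includecrit.part6 = c(2/3, 2/3)      # minimum valid fraction for day and spt
invalid_wakinghours = c(10, 40, 50)  # percentage of invalid epochs per window
invalid_sleepperiod = c(0, 20, 80)
invalid_wakinghours > (1 - includecrit.part6[1]) * 100 |
  invalid_sleepperiod > (1 - includecrit.part6[2]) * 100
# FALSE TRUE TRUE: the second and third windows would be set to missing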
colnames(ts)[which(colnames(ts) == "timenum")] = "time" acc4cos = ts[, c("time", "ACC")] - acc4cos$ACC = acc4cos$ACC / 1000 # convert to mg because that is what applyCosinorAnalyses expects - cosinor_coef = applyCosinorAnalyses(ts = acc4cos, - qcheck = ts$invalidepoch, - midnightsi = nightsi, - epochsizes = rep(epochSize, 2)) + qcheck = ts$invalidepoch + + # Add missing values to complete an integer number of full days + Na4c = nrow(acc4cos) + NepochsPerDay = 24 * (3600 / epochSize) + NepochsNeeded = ((ceiling(Na4c / NepochsPerDay) + 4) * NepochsPerDay - Na4c) + 1 + if (NepochsNeeded > 0) { + acc4cos[(Na4c + 1):(Na4c + NepochsNeeded), ] = NA + acc4cos$time[Na4c:(Na4c + NepochsNeeded)] = seq(from = acc4cos$time[Na4c], by = epochSize, length.out = NepochsNeeded + 1) + qcheck = c(qcheck, rep(1, NepochsNeeded)) + } + + threshold = as.numeric(unlist(strsplit( params_phyact[["part6_threshold_combi"]], "_"))[1]) + + # extract nightsi again + tempp = unclass(as.POSIXlt(acc4cos$time, tz = params_general[["desiredtz"]])) + sec = tempp$sec + min = tempp$min + hour = tempp$hour + if (params_general[["dayborder"]] == 0) { + nightsi = which(sec == 0 & min == 0 & hour == 0) + } else { + nightsi = which(sec == 0 & min == (params_general[["dayborder"]] - floor(params_general[["dayborder"]])) * 60 & hour == floor(params_general[["dayborder"]])) #shift the definition of midnight if required + } + cosinor_coef = apply_cosinor_IS_IV_Analyses(ts = acc4cos, + qcheck = qcheck, + midnightsi = nightsi, + epochsizes = rep(epochSize, 2), + threshold = threshold) rm(acc4cos) - } else { - cosinor_coef = NULL - } - if (length(cosinor_coef) > 0) { - summary[fi] = cosinor_coef$timeOffsetHours - s_names[fi] = "cosinor_timeOffsetHours" - fi = fi + 1 - try(expr = {summary[fi:(fi + 5)] = c(cosinor_coef$coef$params$mes, + try(expr = {summary[fi:(fi + 6)] = c(cosinor_coef$timeOffsetHours, + cosinor_coef$coef$params$mes, cosinor_coef$coef$params$amp, cosinor_coef$coef$params$acr, cosinor_coef$coef$params$acrotime, @@ -228,9 +315,9 @@ g.part6 = function(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), cosinor_coef$coef$params$R2)}, silent = TRUE) - s_names[fi:(fi + 5)] = c("cosinor_mes", "cosinor_amp", "cosinor_acrophase", + s_names[fi:(fi + 6)] = c("cosinor_timeOffsetHours", "cosinor_mes", "cosinor_amp", "cosinor_acrophase", "cosinor_acrotime", "cosinor_ndays", "cosinor_R2") - fi = fi + 6 + fi = fi + 7 try(expr = {summary[fi:(fi + 10)] = c(cosinor_coef$coefext$params$minimum, cosinor_coef$coefext$params$amp, cosinor_coef$coefext$params$alpha, @@ -248,22 +335,173 @@ g.part6 = function(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), "cosinorExt_DownMesor", "cosinorExt_MESOR", "cosinorExt_ndays", "cosinorExt_F_pseudo", "cosinorExt_R2") fi = fi + 11 - summary[fi:(fi + 1)] = c(cosinor_coef$IVIS$InterdailyStability, - cosinor_coef$IVIS$IntradailyVariability) - s_names[fi:(fi + 1)] = c("cosinorIS", "cosinorIV") - fi = fi + 2 - } else { - cosinor_coef = c() - s_names[fi:(fi + 19)] = c("cosinor_timeOffsetHours", "cosinor_mes", - "cosinor_amp", "cosinor_acrophase", - "cosinor_acrotime", "cosinor_ndays", "cosinor_R2", - "cosinorExt_minimum", "cosinorExt_amp", - "cosinorExt_alpha", "cosinorExt_beta", - "cosinorExt_acrotime", "cosinorExt_UpMesor", - "cosinorExt_DownMesor", "cosinorExt_MESOR", - "cosinorExt_ndays", "cosinorExt_F_pseudo", - "cosinorExt_R2", "cosinorIS", "cosinorIV") - fi = fi + 20 + try(expr = {summary[fi:(fi + 2)] = c(cosinor_coef$IVIS$InterdailyStability, + cosinor_coef$IVIS$IntradailyVariability, +
cosinor_coef$IVIS$phi)}, + silent = TRUE) + s_names[fi:(fi + 2)] = c("IS", "IV", "phi") + fi = fi + 3 + #=============================================== + # Transition probabilities + for (fragmode in c("day", "spt")) { + ts_temp = ts + # turn other half of data to NA + ts_temp$class_id[which(ts$SleepPeriodTime == ifelse(fragmode == "spt", yes = 0, no = 1))] = NA + frag.out = g.fragmentation(frag.metrics = params_phyact[["frag.metrics"]], + LEVELS = ts_temp$class_id, + Lnames = Lnames, xmin = 60/epochSize, mode = fragmode) + # fragmentation values can come with a lot of decimal places + summary[fi:(fi + (length(frag.out) - 1))] = round(as.numeric(frag.out), digits = 6) + s_names[fi:(fi + (length(frag.out) - 1))] = paste0("FRAG_", names(frag.out), "_", fragmode) + fi = fi + length(frag.out) + } + + #=============================================== + # LXMX: code copied from g.part5 + # To be refactored as a central generic function once + # the new MXLX function is merged + window_number = unique(ts$window) + dsummary = matrix(NA, 100, length(params_247[["winhr"]]) * 80) + ds_names = rep("", ncol(dsummary)) + lightpeak_available = "lightpeak" %in% names(ts) + si = 1 + for (wi in window_number) { + gi = 1 + sse = which(ts$window == wi) + if (is.na(ts$ACC[sse[2]])) { + next + } + ts$time_num = ts$time + ts$time = ts$timestamp + WLH = length(sse)/((60/epochSize) * 60) + if (WLH <= 1) WLH = 1.001 + for (wini in params_247[["winhr"]]) { + reso = params_247[["M5L5res"]] #resolution at 5 minutes + endd = floor((WLH * 10) / 10) # rounding needed for non-integer window lengths + nwindow_f = (endd - wini) #number of windows for L5M5 analyses + ignore = FALSE + if (endd <= wini | nwindow_f < 1) ignore = TRUE # day is shorter than the time window, so ignore it + nwindow_f = nwindow_f * (60/reso) + if (ignore == FALSE) { + # Calculate running window variables + ACCrunwin = matrix(0, nwindow_f, 1) + TIMErunwin = matrix("", nwindow_f, 1) + for (hri in 0:floor((((endd - wini) * (60/reso)) - 1))) { + e1 = (hri * reso * (60/epochSize)) + 1 + e2 = (hri + (wini * (60/reso))) * reso * (60/epochSize) + if (e2 > length(sse)) e2 = length(sse) + ACCrunwin[(hri + 1), 1] = mean(ts$ACC[sse[e1:e2]]) + TIMErunwin[(hri + 1), 1] = format(ts$time[sse[e1]]) + } + ACCrunwin = ACCrunwin[is.na(ACCrunwin) == F] + TIMErunwin = TIMErunwin[is.na(ACCrunwin) == F] + if (length(ACCrunwin) > 0 & length(TIMErunwin) > 0) { + # Derive day level variables + L5VALUE = min(ACCrunwin) + L5HOUR = TIMErunwin[which(ACCrunwin == L5VALUE)[1]] + M5VALUE = max(ACCrunwin) + M5HOUR = TIMErunwin[which(ACCrunwin == M5VALUE)[1]] + if (lightpeak_available == TRUE) { + if (length(unlist(strsplit(M5HOUR, " |T"))) == 1) M5HOUR = paste0(M5HOUR, " 00:00:00") + startM5 = which(format(ts$time) == M5HOUR) + M5_mean_peakLUX = round(mean(ts$lightpeak[startM5[1]:(startM5[1] + (wini*60*(60/epochSize)))], na.rm = TRUE), digits = 1) + M5_max_peakLUX = round(max(ts$lightpeak[startM5[1]:(startM5[1] + (wini*60*(60/epochSize)))], na.rm = TRUE), digits = 1) + } + } else { + L5HOUR = M5HOUR = "not detected" + L5VALUE = M5VALUE = "" + if (lightpeak_available == TRUE) { + M5_mean_peakLUX = M5_max_peakLUX = "" + } + } + # Add variables calculated above to the output matrix + dsummary[si, gi:(gi + 1)] = c(L5VALUE, M5VALUE) + } + ds_names[gi:(gi + 1)] = c(paste0("L", wini, "VALUE"), + paste0("M", wini, "VALUE")) + gi = gi + 2 + if ("lightpeak" %in% colnames(ts)) { + if (ignore == FALSE) { + dsummary[si,gi] = M5_mean_peakLUX + dsummary[si,gi + 1] = M5_max_peakLUX + } + ds_names[gi] =
paste("M", wini, "_mean_peakLUX", sep = "") + ds_names[gi + 1] = paste("M", wini,"_max_peakLUX", sep = "") + gi = gi + 2 + } + if (ignore == FALSE) { + # Add also numeric time + if (is.ISO8601(L5HOUR)) { # only do this for ISO8601 format + L5HOUR = format(iso8601chartime2POSIX(L5HOUR, tz = params_general[["desiredtz"]])) + M5HOUR = format(iso8601chartime2POSIX(M5HOUR, tz = params_general[["desiredtz"]])) + } + if (length(unlist(strsplit(L5HOUR," "))) == 1) L5HOUR = paste0(L5HOUR," 00:00:00") #added because on some OSes timestamps are deleted for midnight + if (length(unlist(strsplit(M5HOUR," "))) == 1) M5HOUR = paste0(M5HOUR," 00:00:00") + if (L5HOUR != "not detected") { + time_num = sum(as.numeric(unlist(strsplit(unlist(strsplit(L5HOUR," "))[2], ":"))) * c(3600, 60, 1)) / 3600 + if (time_num <= 12) { + time_num = time_num + 24 + } + dsummary[si,gi] = time_num + } else { + dsummary[si,gi] = NA + } + } + ds_names[gi] = paste("L", wini, "TIME_num", sep = ""); gi = gi + 1 + if (ignore == FALSE) { + if (M5HOUR != "not detected") { + time_num = sum(as.numeric(unlist(strsplit(unlist(strsplit(M5HOUR," "))[2], ":"))) * c(3600, 60, 1)) / 3600 + dsummary[si, gi] = time_num + } else { + dsummary[si, gi] = NA + } + } + ds_names[gi] = paste("M", wini, "TIME_num", sep = ""); gi = gi + 1 + } + si = si + 1 + if (si > nrow(dsummary)) { + dsummary[nrow(dsummary) + 1, ] = NA + ds_names = c(ds_names, "") + } + } + gi = which(ds_names == "")[1] + dsummary = dsummary[1:(si - 1), 1:(gi - 1), drop = FALSE] + ds_names = ds_names[1:(gi - 1)] + dsummary = as.data.frame(x = dsummary) + colnames(dsummary) = ds_names + # aggregate across recording + MXLXsummary = colMeans(x = dsummary, na.rm = TRUE) + # Add to summary + summary[fi:(fi + length(MXLXsummary) - 1)] = as.numeric(MXLXsummary) + s_names[fi:(fi + length(MXLXsummary) - 1)] = names(MXLXsummary) + fi = fi + length(MXLXsummary) + # Translate times to clock time and add to summary + cols2convert = grep(pattern = "TIME_num", x = names(MXLXsummary), value = FALSE) + if (length(cols2convert) > 0) { + for (cvi in cols2convert) { + s_names[fi] = gsub(pattern = "_num", replacement = "_clock", x = names(MXLXsummary)[cvi]) + hr = as.numeric(floor(MXLXsummary[cvi])) + min = as.numeric(floor((MXLXsummary[cvi] - hr) * 60)) + sec = as.numeric(floor((MXLXsummary[cvi] - hr - (min / 60)) * 3600)) + if (!is.na(hr) && !is.na(min) && !is.na(sec)) { + if (hr >= 24) hr = hr - 24 + clock_time = paste0(hr, ":", min, ":", sec) + } else { + clock_time = NA + } + summary[fi] = clock_time + fi = fi + 1 + } + } + #======================================================================= + # DFA analysis is controlled independently of do.cr because it is time consuming + if (params_247[["part6DFA"]] == TRUE) { + ssp = SSP(ts$ACC[!is.na(ts$ACC)]) + abi = ABI(ssp) + summary[fi:(fi + 1)] = c(ssp, abi) + s_names[fi:(fi + 1)] = c("SSP", "ABI") + fi = fi + 2 + } } #============================================= # Store results in milestone data @@ -273,18 +511,25 @@ g.part6 = function(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), s_names = s_names[which(s_names != "")] output_part6 = data.frame(t(summary), stringsAsFactors = FALSE) names(output_part6) = s_names - output_part6[, 4:ncol(output_part6)] = as.numeric(output_part6[, 4:ncol(output_part6)]) + numericCols = grep(pattern = "TIME_clock", x = s_names, invert = TRUE) + numericCols = numericCols[numericCols %in% 1:3 == FALSE] + output_part6[, numericCols] = as.numeric(output_part6[, numericCols]) if (length(output_part6) > 0) { + if (length(cosinor_coef)
> 0) { + cosinor_ts = cosinor_coef$coefext$cosinor_ts + } else { + cosinor_ts = c() + } GGIRversion = "GGIR not used" if (is.element('GGIR', installed.packages()[,1])) { GGIRversion = as.character(utils::packageVersion("GGIR")) if (length(GGIRversion) != 1) GGIRversion = sessionInfo()$otherPkgs$GGIR$Version } output_part6$GGIRversion = GGIRversion - save(output_part6, GGIRversion, file = paste0(metadatadir, ms6.out, "/", - gsub(pattern = "[.]csv|[.]RData", - replacement = "", - x = fnames.ms5raw[i]), ".RData")) + save(output_part6, cosinor_ts, GGIRversion, file = paste0(metadatadir, + ms6.out, "/", gsub(pattern = "[.]csv|[.]RData", + replacement = "", + x = fnames.ms5raw[i]), ".RData")) } rm(output_part6, summary) } @@ -304,6 +549,12 @@ g.part6 = function(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), Ncores2use = min(c(Ncores - 1, params_general[["maxNcores"]], (f1 - f0) + 1)) if (Ncores2use > 1) { cl <- parallel::makeCluster(Ncores2use) # not to overload your computer + parallel::clusterExport(cl = cl, + varlist = c(unclass(lsf.str(envir = asNamespace("GGIR"), all = T)), + "MONITOR", "FORMAT"), + envir = as.environment(asNamespace("GGIR")) + ) + parallel::clusterEvalQ(cl, Sys.setlocale("LC_TIME", "C")) doParallel::registerDoParallel(cl) } else { # Don't process in parallel if only one core @@ -325,7 +576,7 @@ g.part6 = function(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), errhand = 'pass' } else { # pass on functions - functions2passon = c("cosinorAnalyses", "applyCosinorAnalyses", "g.IVIS") + functions2passon = c("cosinor_IS_IV_Analyses", "apply_cosinor_IS_IV_Analyses", "g.IVIS") errhand = 'stop' } i = 0 # declare i because foreach uses it, without declaring it diff --git a/R/g.plot5.R b/R/g.plot5.R index be98df2c3..c951cb334 100644 --- a/R/g.plot5.R +++ b/R/g.plot5.R @@ -143,6 +143,12 @@ g.plot5 = function(metadatadir = c(), dofirstpage = FALSE, viewingwindow = 1, } # Account for shift in nights relative to windows if (nrow(P2daysummary_tmp) > 0) { + # nights missing in sleep summary? 
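A short aside, with hypothetical night numbers: because allnights is the unbroken sequence 1:max, the %in% ... == FALSE lookup added below is equivalent to setdiff() on the night numbers.

# Hypothetical example of the missing-night lookup (not part of the patch):
night = c(1, 2, 4, 6)                 # nights present in the sleep summary
allnights = 1:max(night)              # 1 2 3 4 5 6
which(allnights %in% night == FALSE)  # 3 5
setdiff(allnights, night)             # 3 5, the same nights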
+ allnights = 1:max(summarysleep_tmp$night) + missingNights = which(allnights %in% summarysleep_tmp$night == FALSE) + if (length(missingNights) > 0) { + n2exclude = sort(unique(c(n2exclude, missingNights))) + } if (P2daysummary_tmp$`N hours`[1] < 24 & P2daysummary_tmp$`N hours`[1] > 12 & length(n2exclude) > 0) { # First calendar day is between 12 and 24 hours # this means that the first night viewindow (=2) may include some data but diff --git a/R/g.report.part4.R b/R/g.report.part4.R index dde793449..6a9d80642 100644 --- a/R/g.report.part4.R +++ b/R/g.report.part4.R @@ -109,7 +109,7 @@ g.report.part4 = function(datadir = c(), metadatadir = c(), loglocation = c(), output = output[-cut, which(colnames(output) != "")] } WW = which(output[, "window"] == "WW") - out = as.matrix(output[WW, which(colnames(output) %in% c("ID", "nonwear_perc_spt", "night_number", "window"))]) + out = as.matrix(output[WW, which(colnames(output) %in% c("ID", "nonwear_perc_spt", "ACC_spt_mg", "night_number", "window"))]) } outputp5 = as.data.frame(do.call(rbind, lapply(fnames.ms5[f0:f1], myfun5)), stringsAsFactors = FALSE) dupl = which(duplicated(outputp5[, c("ID", "night_number")]) == TRUE) @@ -143,6 +143,7 @@ g.report.part4 = function(datadir = c(), metadatadir = c(), loglocation = c(), } nightsummary2 = nightsummary2[order(nightsummary2$ID, nightsummary2$night), ] nightsummary2$nonwear_perc_spt = as.numeric(nightsummary2$nonwear_perc_spt) + nightsummary2$ACC_spt_mg = as.numeric(nightsummary2$ACC_spt_mg) } # ============= skip = FALSE @@ -352,6 +353,11 @@ g.report.part4 = function(datadir = c(), metadatadir = c(), loglocation = c(), personSummarynames = c(personSummarynames, paste("nonwear_perc_spt_", TW, "_mn", sep = "")) cnt = cnt + 1 } + if ("ACC_spt_mg" %in% colnames(nightsummary.tmp)) { + personSummary[i, cnt + 1] = mean(nightsummary.tmp$ACC_spt_mg[this_sleepparam[Seli]], na.rm = TRUE) + personSummarynames = c(personSummarynames, paste("ACC_spt_mg_", TW, "_mn", sep = "")) + cnt = cnt + 1 + } } } nightsummary$cleaningcode = as.numeric(nightsummary$cleaningcode) diff --git a/R/g.report.part5.R b/R/g.report.part5.R index 3b7844039..db2ab667b 100644 --- a/R/g.report.part5.R +++ b/R/g.report.part5.R @@ -26,6 +26,15 @@ g.report.part5 = function(metadatadir = c(), f0 = c(), f1 = c(), loglocation = c includeday_wearPercentage = 0 includeday_absolute = params_cleaning[["includedaycrit.part5"]] * 60 } + if (params_cleaning[["includenightcrit.part5"]] >= 0 & + params_cleaning[["includenightcrit.part5"]] <= 1) { # if includenightcrit.part5 is used as a ratio + includenight_wearPercentage = params_cleaning[["includenightcrit.part5"]] * 100 + includenight_absolute = 0 + } else if (params_cleaning[["includenightcrit.part5"]] > 1 & + params_cleaning[["includenightcrit.part5"]] <= 25) { # if includenightcrit.part5 is used like params_cleaning[["includenightcrit"]] as a number of hours + includenight_wearPercentage = 0 + includenight_absolute = params_cleaning[["includenightcrit.part5"]] * 60 + } include_window = rep(TRUE, nrow(x)) if (length(params_cleaning[["data_cleaning_file"]]) > 0) { # allow for forced relying on guider based on external params_cleaning[["data_cleaning_file"]] DaCleanFile = data.table::fread(params_cleaning[["data_cleaning_file"]], data.table = FALSE) @@ -47,8 +56,12 @@ g.report.part5 = function(metadatadir = c(), f0 = c(), f1 = c(), loglocation = c x$wear_min_day = (1 - (x$nonwear_perc_day / 100)) * x$dur_day_min #valid minute during waking hours x$wear_perc_day = 100 - x$nonwear_perc_day #wear 
percentage during waking hours + x$wear_min_spt = (1 - (x$nonwear_perc_spt / 100)) * x$dur_spt_min #valid minutes during sleep period time + x$wear_perc_spt = 100 - x$nonwear_perc_spt #wear percentage during sleep period time + x$lastHour = as.numeric(x$lastHour) x$calendar_date = as.Date(x$calendar_date) + minimumValidMinutesMM = 0 # default if (length(params_cleaning[["includedaycrit"]]) == 2) { minimumValidMinutesMM = params_cleaning[["includedaycrit"]][2] * 60 @@ -63,6 +76,8 @@ g.report.part5 = function(metadatadir = c(), f0 = c(), f1 = c(), loglocation = c if (window == "WW" | window == "OO") { indices = which(x$wear_perc_day >= includeday_wearPercentage & x$wear_min_day >= includeday_absolute & + x$wear_perc_spt >= includenight_wearPercentage & + x$wear_min_spt >= includenight_absolute & x$dur_spt_min > 0 & x$dur_day_min > 0 & ((x$lastWindow == TRUE & x$lastHour >= 15 & (x$lastDate - x$calendar_date) >= -1 & window == "WW") | (x$lastWindow == TRUE & x$lastHour >= 9 & (x$lastDate - x$calendar_date) >= -1 & window == "OO") | @@ -72,6 +87,8 @@ g.report.part5 = function(metadatadir = c(), f0 = c(), f1 = c(), loglocation = c } else if (window == "MM") { indices = which(x$wear_perc_day >= includeday_wearPercentage & x$wear_min_day >= includeday_absolute & + x$wear_perc_spt >= includenight_wearPercentage & + x$wear_min_spt >= includenight_absolute & x$dur_spt_min > 0 & x$dur_day_min > 0 & ((x$lastWindow == TRUE & x$lastHour > 9 & (x$lastDate - x$calendar_date) >= -1) | x$lastWindow == FALSE) & diff --git a/R/g.weardec.R b/R/g.weardec.R index 371c061f6..e29466a6a 100644 --- a/R/g.weardec.R +++ b/R/g.weardec.R @@ -1,6 +1,5 @@ -g.weardec = function(M,wearthreshold,ws2, nonWearEdgeCorrection = TRUE) { +g.weardec = function(M,wearthreshold, ws2, nonWearEdgeCorrection = TRUE) { metalong = M$metalong - # eni = which(colnames(metalong) == "en") # Commented out March 9 2023, because it was not used nsi = which(colnames(metalong) == "nonwearscore") csi = which(colnames(metalong) == "clippingscore") NLongEpochs = nrow(metalong) @@ -8,11 +7,9 @@ g.weardec = function(M,wearthreshold,ws2, nonWearEdgeCorrection = TRUE) { # clipsig is a value between 0 and 1 for each long window (default 15 minutes) # and indicates the ratio of datapoints in the window where signal clips LC = length(which(clipsig > 0.05)) - LC2 = length(which(clipsig > 0.8)) - turnoffclip = which(clipsig > 0.8) + turnoffclip = which(clipsig > 0.3) + LC2 = length(turnoffclip) turnoffnonw = which(as.numeric(as.matrix(metalong[,nsi])) >= wearthreshold) - # # some files have zero g when battery drains # Commented out March 9 2023, because it was not used - # turnoffzerog = which(as.numeric(as.matrix(metalong[,eni])) < 0.9) # step 1: identify islands of invalid data r1 = r2 = r3 = matrix(0, nrow(metalong), 1) r1[turnoffnonw] = 1 # non-weartime diff --git a/R/load_params.R b/R/load_params.R index 7889687b8..08c1482d8 100644 --- a/R/load_params.R +++ b/R/load_params.R @@ -76,7 +76,8 @@ load_params = function(topic = c("sleep", "metrics", "rawdata", LUX_cal_constant = c(), LUX_cal_exponent = c(), LUX_day_segments = c(), L5M5window = c(0, 24), cosinor = FALSE, part6CR = FALSE, part6HCA = FALSE, - part6Window = c("start", "end")) + part6Window = c("start", "end"), + part6DFA = FALSE) } if ("phyact" %in% topic) { params_phyact = list(mvpathreshold = 100, boutcriter = 0.8, @@ -103,7 +104,9 @@ load_params = function(topic = c("sleep", "metrics", "rawdata", nonWearEdgeCorrection = TRUE, nonwear_approach = "2023", segmentWEARcrit.part5 = 0.5,
segmentDAYSPTcrit.part5 = c(0.9, 0), - study_dates_file = c(), study_dates_dateformat = "%d-%m-%Y") + study_dates_file = c(), study_dates_dateformat = "%d-%m-%Y", + includecrit.part6 = c(2/3, 2/3), + includenightcrit.part5 = 0) } if ("output" %in% topic) { params_output = list(epochvalues2csv = FALSE, save_ms5rawlevels = FALSE, diff --git a/R/tidyup_df.R b/R/tidyup_df.R index e76459382..86d49e7a0 100644 --- a/R/tidyup_df.R +++ b/R/tidyup_df.R @@ -1,12 +1,5 @@ tidyup_df = function(df = c(), digits = 3) { - - myRoundFun = function(x) { - tryCatch(round(as.numeric(as.character(x)), digits = digits), - error = function(cond) return(x), - warning = function(cond) return(x)) - } - df = df[rowSums(!is.na(df)) > 0, ] if (ncol(df) > 1) { # Round columns that can be coerced to numeric @@ -15,14 +8,19 @@ tidyup_df = function(df = c(), digits = 3) { # at 4 or 5 decimal places for (fragCols in c(TRUE, FALSE)) { if (fragCols == TRUE) { - colsID = grep(pattern = "FRAG_", x = colnames(df), invert = TRUE) + colsID = grep(pattern = "FRAG_|ABI|SSP", x = colnames(df), invert = FALSE) digitsRound = 6 } else { - colsID = grep(pattern = "FRAG_", x = colnames(df), invert = FALSE) + colsID = grep(pattern = "FRAG_|ABI|SSP", x = colnames(df), invert = TRUE) digitsRound = digits } + myRoundFun = function(x) { + tryCatch(round(as.numeric(as.character(x)), digits = digitsRound), + error = function(cond) return(x), + warning = function(cond) return(x)) + } if (length(colsID) > 0) { - df[, colsID] = lapply(df[, colsID], myRoundFun) + df[, colsID] = lapply(df[, colsID], FUN = myRoundFun) } } # list to data frame diff --git a/_pkgdown.yml b/_pkgdown.yml index 0b5269ce7..01dc27853 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -67,6 +67,8 @@ navbar: href: articles/chapter12_TimeUseAnalysis.html - text: 13. Circadian Rhythm Analysis href: articles/chapter13_CircadianRhythm.html + - text: 14. Behavioural Fragmentation + href: articles/chapter14_BehaviouralFragmentation.html annexes: text: Annexes menu: @@ -109,8 +111,7 @@ news: href: https://www.accelting.com/updates/ggir-release-2-5-0/ search: - exclude: ['news/index.html', 'articles/GGIR.html', - 'articles/cosinorAnalyses.html', 'articles/appendRecords.html', 'articles/applyCosinorAnalyses.html', + exclude: ['news/index.html', 'reference/cosinorAnalyses.html', 'reference/appendRecords.html', 'reference/applyCosinorAnalyses.html', 'reference/applyExtFunction.html', 'reference/CalcSleepRegularityIndex.html', 'reference/checkMilestoneFolders.html', 'reference/check_log.html', diff --git a/man/ABI.Rd b/man/ABI.Rd new file mode 100644 index 000000000..982b27165 --- /dev/null +++ b/man/ABI.Rd @@ -0,0 +1,33 @@ +\name{ABI} +\alias{ABI} +\title{Activity balance index (ABI)} +\usage{ + ABI(x) +} +\arguments{ + \item{x}{the estimated self-similarity parameter (SSP)} +} +\value{ + The estimated Activity balance index (ABI) is a real number between zero and one. +} +\description{ + This function estimates the Activity balance index (ABI), which is a transformation of the self-similarity parameter (SSP), also known as scaling exponent or alpha. +} +\details{ + ABI = exp(-abs(SSP-1)/exp(-2)) +} +\examples{ + # Estimate Activity balance index of a well-known time series + # available in R base: the sunspot.year. + \dontrun{ + ssp = SSP(sunspot.year) + abi = ABI(ssp) + } +} +\references{ + C.-K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger Phys. Rev. E, 49 (1994), p. 1685 + Mesquita, Victor & Filho, Florencio & Rodrigues, Paulo. (2020).
Detection of crossover points in detrended fluctuation analysis: An application to EEG signals of patients with epilepsy. Bioinformatics. 10.1093/bioinformatics/btaa955. +} +\author{ + Ian Meneghel Danilevicz +} diff --git a/man/DFA.Rd b/man/DFA.Rd new file mode 100644 index 000000000..2aa38586a --- /dev/null +++ b/man/DFA.Rd @@ -0,0 +1,48 @@ +\name{DFA} +\alias{DFA} +\title{Detrended Fluctuation Analysis} +\description{ + Detrended Fluctuation Analysis (DFA) +} +\usage{ + DFA(data, scale = 2^(1/8), box_size = 4, m = 1) +} +\arguments{ + \item{data}{ + Univariate time series (must be a vector or data frame) + } + \item{scale}{ + Specifies the ratio between successive box sizes (by default scale = 2^(1/8)) + } + \item{box_size}{ + Vector of box sizes (must be used in conjunction with scale = "F") + } + \item{m}{ + An integer of the polynomial order for the detrending (by default m=1) + } +} +\value{ + Estimated alpha is a real number between zero and two. +} +\details{ + The DFA fluctuation can be computed on a geometric scale or for different choices of box sizes. +} +\note{ + It is not possible to estimate alpha for multiple time series at once. +} +\examples{ + # Estimate self-similarity of a well-known time series available + # in R base: the sunspot.year. + \dontrun{ + dfa = DFA(sunspot.year) + } +} +\references{ + C.-K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger Phys. Rev. E, 49 (1994), p. 1685 + Mesquita, Victor & Filho, Florencio & Rodrigues, Paulo. (2020). Detection of crossover points in detrended fluctuation analysis: An application to EEG signals of patients with epilepsy. Bioinformatics. 10.1093/bioinformatics/btaa955. +} +\author{ + Ian Meneghel Danilevicz + Victor Barreto Mesquita +} \ No newline at end of file diff --git a/man/GGIR.Rd b/man/GGIR.Rd index c563edbf1..c301bfda8 100755 --- a/man/GGIR.Rd +++ b/man/GGIR.Rd @@ -903,6 +903,16 @@ GGIR(mode = 1:5, when value above 1 used as absolute number of valid hours required. Do not confuse this argument with argument \code{includedaycrit} which is only used in GGIR part 2 and applies to the entire day.} + + \item{includenightcrit.part5}{ + Numeric (default = 0). + Inclusion criterion used in part 5 for the number of valid hours during the + sleep period hours of a day (the night), + when value is smaller than or equal to 1 used as fraction of sleep period hours, + when value above 1 used as absolute number of valid hours required. + Do not confuse this argument with argument \code{includenightcrit} which is + only used in GGIR part 4 and applies to the entire 24 hour window from + noon to noon or 6pm to 6pm.} \item{segmentWEARcrit.part5}{ Numeric (default = 0.5). @@ -962,6 +972,12 @@ GGIR(mode = 1:5, in the centre of the mediumsize window, while in the 2023 method the longsizewindow is aligned with its left edge to the left edge of the mediumsize window. } + + \item{includecrit.part6}{ + Numeric (default = c(2/3, 2/3)). + Vector of two with the minimum fraction of valid data required for day + and spt time, respectively. This criterion is only used for circadian rhythm analysis. + } } } @@ -1333,24 +1349,16 @@ GGIR(mode = 1:5, number of levels.} \item{IVIS_windowsize_minutes}{ - Numeric (default = 60). - Window size of the Intradaily Variability (IV) and Interdaily - Stability (IS) metrics in minutes, needs to be able to add up to 24 hours.} + This argument has been deprecated.} \item{IVIS_epochsize_seconds}{ - Numeric (default = NULL).
- This argument is deprecated.} + This argument has been deprecated.} \item{IVIS.activity.metric}{ - Numeric (default = 2). - Metric used for activity calculation. - Value = 1, uses continuous scaled acceleration. - Value = 2, tries to collapse acceleration into a binary score of rest - versus active to try to simulate the original approach.} + This argument has been deprecated.} \item{IVIS_acc_threshold}{ - Numeric (default = 20). - Acceleration threshold to distinguish inactive from active.} + This argument has been deprecated.} \item{qM5L5}{ Numeric (default = NULL). @@ -1386,10 +1394,14 @@ GGIR(mode = 1:5, over which L5M5 needs to be calculated. Now this is done with argument qwindow.} \item{cosinor}{ - Boolean (default = FALSE). Whether to apply the cosinor analysis from the ActCR package.} + Boolean (default = FALSE). Whether to apply the cosinor analysis from the ActCR package in part 2. + In part 6 cosinor analysis is applied by default and cannot be turned off.} \item{part6CR}{ - Boolean (default = FALSE) to indicate whether circadian rhythm analysis should be run by part 6. + Boolean (default = FALSE) to indicate whether circadian rhythm analysis should + be run by part 6; this includes cosinor analysis, extended cosinor analysis, + IS, IV, and phi. Optionally this can be expanded with detrended fluctuation analysis, + which is controlled by parameter `part6DFA`. } \item{part6HCA}{ Boolean (default = FALSE) to indicate whether Household Co Analysis should @@ -1406,6 +1418,10 @@ GGIR(mode = 1:5, ignores the first and last 5 hours, and c("O2", "W10") goes from the second onset till the 10th wakeup time. } + \item{part6DFA}{ + Boolean (default = FALSE) to indicate whether to perform Detrended Fluctuation + Analysis. Turned off by default because it can be time consuming. + } } } diff --git a/man/SSP.Rd b/man/SSP.Rd new file mode 100644 index 000000000..15446962d --- /dev/null +++ b/man/SSP.Rd @@ -0,0 +1,47 @@ +\name{SSP} +\alias{SSP} +\title{Estimated self-similarity parameter} +\description{ + This function estimates the self-similarity parameter (SSP), also known as scaling exponent or alpha. +} +\usage{ + SSP(data, scale = 2^(1/8), box_size = 4, m = 1) +} +\arguments{ + \item{data}{ + Univariate time series (must be a vector or data frame) + } + \item{scale}{ + Specifies the ratio between successive box sizes (by default scale = 2^(1/8)) + } + \item{box_size}{ + Vector of box sizes (must be used in conjunction with scale = "F") + } + \item{m}{ + An integer of the polynomial order for the detrending (by default m=1) + } +} +\value{ + Estimated alpha is a real number between zero and two. +} +\details{ + The DFA fluctuation can be computed on a geometric scale or for different choices of box sizes. +} +\note{ + It is not possible to estimate alpha for multiple time series at once. +} +\examples{ + # Estimate self-similarity of a well-known time series available in R base: the sunspot.year. + \dontrun{ + ssp = SSP(sunspot.year) + } +} +\references{ + C.-K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger Phys. Rev. E, 49 (1994), p. 1685 + Mesquita, Victor & Filho, Florencio & Rodrigues, Paulo. (2020). Detection of crossover points in detrended fluctuation analysis: An application to EEG signals of patients with epilepsy. Bioinformatics. 10.1093/bioinformatics/btaa955.
+} +\author{ + Ian Meneghel Danilevicz + Victor Barreto Mesquita +} \ No newline at end of file diff --git a/man/applyCosinorAnalyses.Rd b/man/apply_cosinor_IS_IV_Analyses.Rd similarity index 62% rename from man/applyCosinorAnalyses.Rd rename to man/apply_cosinor_IS_IV_Analyses.Rd index 90109af20..d74a4e80c 100644 --- a/man/applyCosinorAnalyses.Rd +++ b/man/apply_cosinor_IS_IV_Analyses.Rd @@ -1,14 +1,14 @@ -\name{applyCosinorAnalyses} -\alias{applyCosinorAnalyses} +\name{apply_cosinor_IS_IV_Analyses} +\alias{apply_cosinor_IS_IV_Analyses} \title{ Apply Cosinor Analyses to time series } \description{ - Wrapper function around \link{cosinorAnalyses} that first prepares the time series + Wrapper function around \link{cosinor_IS_IV_Analyses} that first prepares the time series before applying the cosinor analysis } \usage{ - applyCosinorAnalyses(ts, qcheck, midnightsi, epochsizes) + apply_cosinor_IS_IV_Analyses(ts, qcheck, midnightsi, epochsizes, threshold = NULL) } \arguments{ \item{ts}{ @@ -24,6 +24,9 @@ \item{epochsizes}{ Epoch size for ts and qcheck respectively } + \item{threshold}{ + See \link{cosinor_IS_IV_Analyses} + } } \author{ Vincent T van Hees diff --git a/man/cosinorAnalyses.Rd b/man/cosinorAnalyses.Rd deleted file mode 100644 index 9fd9ae4e1..000000000 --- a/man/cosinorAnalyses.Rd +++ /dev/null @@ -1,26 +0,0 @@ -\name{cosinorAnalyses} -\alias{cosinorAnalyses} -\title{ - Apply cosinor anlaysis and extended cosinor analysis -} -\description{ - Applies cosinor anlaysis from the ActCR package to the time series -} -\usage{ - cosinorAnalyses(Xi, epochsize = 60, timeOffsetHours = 0) -} -\arguments{ - \item{Xi}{ - Vector with time series of movement indicators - } - \item{epochsize}{ - Numeric epochsize in seconds - } - \item{timeOffsetHours}{ - Numeric time in hours relative to next midnight - } -} -\keyword{internal} -\author{ - Vincent T van Hees -} \ No newline at end of file diff --git a/man/cosinor_IS_IV_Analyses.Rd b/man/cosinor_IS_IV_Analyses.Rd new file mode 100644 index 000000000..fc74f5a27 --- /dev/null +++ b/man/cosinor_IS_IV_Analyses.Rd @@ -0,0 +1,32 @@ +\name{cosinor_IS_IV_Analyses} +\alias{cosinor_IS_IV_Analyses} +\title{ + Apply cosinor analysis and extended cosinor analysis +} +\description{ + Applies cosinor analysis from the ActCR package to the time series, as well + as IV, IS and phi estimates. +} +\usage{ + cosinor_IS_IV_Analyses(Xi, epochsize = 60, timeOffsetHours = 0, threshold = NULL) +} +\arguments{ + \item{Xi}{ + Vector with time series of movement indicators. If the maximum < 8 and mean < 1 + then the input is assumed to be in g-units and is multiplied by 1000. + } + \item{epochsize}{ + Numeric epochsize in seconds + } + \item{timeOffsetHours}{ + Numeric time in hours relative to next midnight + } + \item{threshold}{ + Numeric value to use as threshold to distinguish inactivity from active behaviour + for the IV and IS analysis. GGIR uses parameter threshold.lig to set this threshold. + } +} +\keyword{internal} +\author{ + Vincent T van Hees +} \ No newline at end of file diff --git a/man/g.IVIS.Rd b/man/g.IVIS.Rd index 5a1ade7d7..c7598a362 100644 --- a/man/g.IVIS.Rd +++ b/man/g.IVIS.Rd @@ -8,37 +8,18 @@ van Someren. } \usage{ - g.IVIS(Xi, epochsizesecondsXi = 5, IVIS_epochsize_seconds = c(), - IVIS_windowsize_minutes = 60, IVIS.activity.metric = 1, - IVIS_acc_threshold = 20, IVIS_per_daypair = FALSE) + g.IVIS(Xi, epochSize = 60, threshold = NULL) } \arguments{ \item{Xi}{ Vector with acceleration values, e.g. ENMO metric.
} - \item{epochsizesecondsXi}{ + \item{epochSize}{ Epoch size of the values in Xi expressed in seconds. } - \item{IVIS_epochsize_seconds}{ - This argument has been depricated. - } - \item{IVIS_windowsize_minutes}{ - Window size of the Intradaily Variability (IV) and Interdaily - Stability (IS) metrics in minutes, needs to be able to add up to 24 hours. - } - \item{IVIS.activity.metric}{ - Metric used for activity calculation. - Value = 1, uses continuous scaled acceleration. - Value = 2, tries to collapse acceleration into a binary score of rest - versus active to try to simulate the original approach. - } - \item{IVIS_acc_threshold}{ + \item{threshold}{ Acceleration threshold to distinguish inactive from active } - \item{IVIS_per_daypair}{ - Boolean to indicate whether IVIS should be calculated per day pair and - then aggregated across day pairs weighted by day completeness (default FALSE). - } } \value{ \item{InterdailyStability}{} \item{IntradailyVariability}{} @@ -50,6 +31,7 @@ } \keyword{internal} \author{ + Ian Meneghel Danilevicz Vincent T van Hees } \references{ diff --git a/man/g.analyse.avday.Rd b/man/g.analyse.avday.Rd index 6804577d5..f640fd8e6 100644 --- a/man/g.analyse.avday.Rd +++ b/man/g.analyse.avday.Rd @@ -10,7 +10,8 @@ \usage{ g.analyse.avday(doquan, averageday, M, IMP, t_TWDI, quantiletype, ws3, doiglevels, firstmidnighti, ws2, midnightsi, - params_247 = c(), qcheck = c(), acc.metric = c(), ...) + params_247 = c(), qcheck = c(), acc.metric = c(), + params_phyact = NULL, ...) } \arguments{ \item{doquan}{Boolean whether quantile analysis should be done} @@ -25,27 +26,28 @@ \item{ws2}{see \link{g.weardec}} \item{midnightsi}{see \link{g.detecmidnight}} \item{params_247}{ - See \link{g.part2} + See \link{GGIR} } \item{qcheck}{ Vector with indicators of when data is valid (value=0) or invalid (value=1). } \item{acc.metric}{ - Character, see \link{g.part1}. Here, it is used to decided which acceleration metric + Character, see \link{GGIR}. Here, it is used to decide which acceleration metric to use for IVIS and cosinor analyses. } + \item{params_phyact}{ + See \link{GGIR} + } \item{...}{ Any argument used in the previous version of g.analyse.avday, which will now be used to overrule the arguments specified with the parameter objects. } } \value{ - \item{\code{InterdailyStability}}{} - \item{\code{IntradailyVariability}}{} - \item{\code{igfullr_names}}{} - \item{\code{igfullr}}{} - \item{\code{QUAN}}{} - \item{\code{qlevels_names}}{} + \item{\code{igfullr_names}}{Intensity gradient variable names} + \item{\code{igfullr}}{Intensity gradient values} + \item{\code{QUAN}}{Quantiles} + \item{\code{qlevels_names}}{Quantile level names} \item{\code{ML5AD}}{} \item{\code{ML5AD_names}}{} } diff --git a/man/g.part5.savetimeseries.Rd b/man/g.part5.savetimeseries.Rd index 61dfa8737..7809a9240 100644 --- a/man/g.part5.savetimeseries.Rd +++ b/man/g.part5.savetimeseries.Rd @@ -9,12 +9,14 @@ } \usage{ g.part5.savetimeseries(ts, LEVELS, desiredtz, rawlevels_fname, - DaCleanFile = NULL, - includedaycrit.part5 = 2/3, - ID = NULL, params_output, - params_247 = NULL, - filename = "", - timewindow = NULL) + DaCleanFile = NULL, + includedaycrit.part5 = 2/3, + includenightcrit.part5 = 0, + ID = NULL, + params_output, + params_247 = NULL, + Lnames = NULL, timewindow = NULL, + filename = "") } \arguments{ \item{ts}{ @@ -36,10 +38,15 @@ series files stored. } \item{includedaycrit.part5}{ - See \link{g.report.part5}. Only used in this function if + See \link{GGIR}.
Only used in this function if + save_ms5rawlevels is TRUE, and it only affects the time series files stored. } + \item{includenightcrit.part5}{ + See \link{GGIR}. Only used in this function if + save_ms5rawlevels is TRUE, and it only affects the time + series files stored. + } \item{ID}{ If data_cleaning_file is used then this argument specifies which participant ID the data correspond with. @@ -50,13 +57,17 @@ \item{params_247}{ See \link{GGIR} } - \item{filename}{ - Character (default = "") indicating the name of the accelerometer data file - that was used as input. This name will be stored inside the time series output file. + \item{Lnames}{ + Level names as passed on from \link{identify_levels}; these are the names + corresponding to the IDs of the behavioural classes as stored in column class_id. } \item{timewindow}{ See \link{GGIR} } + \item{filename}{ + Character (default = "") indicating the name of the accelerometer data file + that was used as input. This name will be stored inside the time series output file. + } } \value{ Function does not provide output, it only prepares data for saving diff --git a/man/g.part6.Rd b/man/g.part6.Rd index d9fff98a4..75e70bcf9 100644 --- a/man/g.part6.Rd +++ b/man/g.part6.Rd @@ -10,7 +10,7 @@ \usage{ g.part6(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), params_general = c(), params_phyact = c(), params_247 = c(), - verbose = TRUE, ...) + params_cleaning = c(), verbose = TRUE, ...) } \arguments{ \item{datadir}{ @@ -40,6 +40,9 @@ g.part6(datadir = c(), metadatadir = c(), f0 = c(), f1 = c(), \item{params_247}{ See details in \link{GGIR}. } + \item{params_cleaning}{ + See details in \link{GGIR}. + } \item{verbose}{ See details in \link{GGIR}. } diff --git a/tests/testthat/test_chainof5parts.R b/tests/testthat/test_chainof5parts.R index d39f6d3ba..d63f206d7 100644 --- a/tests/testthat/test_chainof5parts.R +++ b/tests/testthat/test_chainof5parts.R @@ -217,8 +217,8 @@ test_that("chainof5parts", { expect_true(dir.exists(dirname)) expect_true(file.exists(rn[1])) - expect_that(nrow(output),equals(5)) # changed because OO window is exported - expect_that(ncol(output),equals(198)) + expect_that(nrow(output),equals(5)) + expect_that(ncol(output),equals(184)) expect_that(round(as.numeric(output$wakeup[2]), digits = 4), equals(36)) expect_that(as.numeric(output$dur_day_spt_min[4]), equals(1150)) # WW window duration expect_that(as.numeric(output$dur_day_spt_min[5]), equals(1680)) # OO window duration @@ -226,7 +226,7 @@ test_that("chainof5parts", { rn2 = dir(dirname_raw,full.names = TRUE, recursive = T) expect_true(file.exists(rn2[1])) TSFILE = read.csv(rn2[1]) - expect_that(nrow(TSFILE),equals(1150)) + expect_that(nrow(TSFILE),equals(2820)) expect_equal(ncol(TSFILE), 12) expect_equal(length(unique(TSFILE$class_id)), 10) #GGIR diff --git a/tests/testthat/test_cosinor.R b/tests/testthat/test_cosinor.R index 11537498d..cc914c3a7 100644 --- a/tests/testthat/test_cosinor.R +++ b/tests/testthat/test_cosinor.R @@ -1,5 +1,5 @@ library(GGIR) -context("cosinorAnalyses") +context("cosinor_IS_IV_Analyses") test_that("cosinorAnalyses provides expected output", { ActCosDummy = function(epochSizeSeconds, missingdata = FALSE, timeOffsetHours = 0) { @@ -13,9 +13,11 @@ test_that("cosinorAnalyses provides expected output", { if (missingdata == TRUE) { is.na(counts[N:floor(N * 2)]) = TRUE } + # Plot code below commented out, but useful for debugging to
# x11() # plot(log((counts * 1000) + 1), type = "l") - mod = cosinorAnalyses(Xi = log((counts * 1000) + 1), epochsize = epochSizeSeconds, timeOffsetHours = timeOffsetHours) + mod = cosinor_IS_IV_Analyses(Xi = counts * 1000, epochsize = epochSizeSeconds, timeOffsetHours = timeOffsetHours) mod$coefext = mod$coefext[which(names(mod$coefext) != "cosinor_ts")] return(mod) } @@ -51,8 +53,8 @@ test_that("cosinorAnalyses provides expected output", { expect_equal(coef60$coefext$params$R2, 0.8976208, tolerance = 0.01) # IV IS - expect_equal(coef60$IVIS$InterdailyStability, 0.9945789, tolerance = 0.01) - expect_equal(coef60$IVIS$IntradailyVariability, 1.14, tolerance = 0.01) + expect_equal(coef60$IVIS$InterdailyStability, 1, tolerance = 0.01) + expect_equal(coef60$IVIS$IntradailyVariability, 0.06774028, tolerance = 0.01) fields_to_compare = c("minimum", "amp", "alpha", "beta", "acrotime", "DownMesor", "MESOR") expect_equal(coef60$coefext[fields_to_compare], coef300$coefext[fields_to_compare], tolerance = 0.1) diff --git a/tests/testthat/test_dfa.R b/tests/testthat/test_dfa.R new file mode 100644 index 000000000..f84617805 --- /dev/null +++ b/tests/testthat/test_dfa.R @@ -0,0 +1,27 @@ +library(GGIR) +context("DFA, SSP and ABI") +test_that("DFA produces expected abi and ssp from sunspot.year dataset", { + skip_on_cran() + ssp = SSP(sunspot.year) + abi = ABI(ssp) + expect_equal(abi, 0.145756, tolerance = 0.0001) + expect_equal(ssp, 0.7393684, tolerance = 0.0001) +}) + +test_that("DFA skips calculation if there are NA values", { + skip_on_cran() + sy_tmp = sunspot.year + sy_tmp[5] = NA + ssp = SSP(sy_tmp) + abi = ABI(ssp) + expect_equal(abi, NA) + expect_equal(ssp, NA) +}) + +test_that("DFA can handle too short time series", { + skip_on_cran() + ssp = SSP(sunspot.year[1:2]) + abi = ABI(ssp) + expect_equal(abi, NA) + expect_equal(ssp, NA) +}) \ No newline at end of file diff --git a/tests/testthat/test_fragmentation.R b/tests/testthat/test_fragmentation.R index 3339cefa7..102f399cc 100644 --- a/tests/testthat/test_fragmentation.R +++ b/tests/testthat/test_fragmentation.R @@ -4,8 +4,11 @@ test_that("fragmentation calculates the expected fragmentation metric values", { skip_on_cran() Lnames = c("spt_sleep", "spt_wake_IN", "spt_wake_LIG", "spt_wake_MOD", "spt_wake_VIG", - "day_IN_unbt", "day_LIG_unbt", "day_MOD_unbt", "day_VIG_unbt", "day_MVPA_bts_10", "day_IN_bts_30", + "day_IN_unbt", "day_LIG_unbt", "day_MOD_unbt", "day_VIG_unbt", + "day_MVPA_bts_10", "day_IN_bts_30", "day_IN_bts_10_30", "day_LIG_bts_10") + + # Fragmentation of daytime behaviour x = c(6, 5, 6, 7, 6, 6, 7, 6, 6, 5, 6, 6, 6, 5, 7, 6, 6, 5, 5, 5, 6, 7, 6, 6, 6, 6, 7, 6, 5, 5, 5, 5, 5, 6, 6, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 6, 5, 6, 5, 6, 5, rep(12, 11), 5, 6, 6, 6, 5, 6, rep(9, 14), 6, @@ -21,7 +24,6 @@ test_that("fragmentation calculates the expected fragmentation metric values", { 6, 6, 6, rep(11, 22), 6, 6, rep(10, 100), 6, rep(11, 23), 6, 5, 5, 5, 5, 5, 7, rep(10, 133), 6, rep(10, 119), 6, 6, 5, 6, 6, 6, 6, 6, 7, 7, 7, 6, 5, 6, 6, 6, 5, 5, 5) - out = g.fragmentation(frag.metrics = "all", LEVELS = x, @@ -31,7 +33,6 @@ test_that("fragmentation calculates the expected fragmentation metric values", { expect_equal(out$mean_dur_IN, 15.10417, tolerance = 0.0001) expect_equal(out$Gini_dur_PA, 0.506352, tolerance = 0.0001) expect_equal(out$Gini_dur_IN, 0.768544, tolerance = 0.0001) - expect_equal(out$alpha_dur_PA, 2.055102, tolerance = 0.0001) expect_equal(out$x0.5_dur_PA, 1.928897, tolerance = 0.0001) expect_equal(out$W0.5_dur_IN, 
0.9629121, tolerance = 0.0001) @@ -47,22 +48,60 @@ test_that("fragmentation calculates the expected fragmentation metric values", { expect_equal(out$Nfrag_IN2MVPA, 4) expect_equal(out$Nfrag_IN, 48) - x = c(rep(1, 15), rep(0, 10), rep(1, 15), rep(2, 15), - rep(0, 15), rep(1, 10), rep(2, 10), - rep(0, 25), rep(1, 10)) + # Fragmentation of nighttime (SPT) behaviour + # classes: wake 1, sleep 0 + x = c(rep(1, 15), rep(0, 10), rep(1, 30), + rep(0, 15), rep(1, 20), rep(0, 25), rep(1, 10)) out = g.fragmentation(frag.metrics = "all", LEVELS = x, Lnames = Lnames, mode = "spt") - - expect_equal(out$Nfrag_PA, 2) - expect_equal(out$Nfrag_IN, 2) - expect_equal(out$TP_PA2IN, 0.08, tolerance = 0.0001) - expect_equal(out$TP_IN2PA, 0.020202, tolerance = 0.0001) expect_equal(out$Nfrag_wake, 3) expect_equal(out$Nfrag_sleep, 3) - expect_equal(out$TP_wake2sleep, 0.040541, tolerance = 0.0001) - expect_equal(out$TP_sleep2wake, 0.06, tolerance = 0.0001) + expect_equal(out$TP_wake2sleep, 0.040541, tolerance = 0.0001) # 3+epsilon / 65+epsilon + expect_equal(out$TP_sleep2wake, 0.06, tolerance = 0.0001) # 3+epsilon / 50+epsilon + + # SPT: Test ability to handle NA blocks as needed for g.part6 + # classes: wake 1, sleep 0 + x = c(rep(1, 5), rep(0, 5), rep(1, 5), rep(0, 5), rep(NA, 15), + rep(0, 5), rep(1, 5), rep(0, 5), rep(1, 5), rep(NA, 15), + rep(1, 5), rep(0, 5), rep(1, 5), rep(0, 5)) + + out = g.fragmentation(frag.metrics = "all", + LEVELS = x, + Lnames = Lnames, mode = "spt") + expect_equal(out$Nfrag_sleep, 4) + expect_equal(out$Nfrag_wake, 5) + expect_equal(out$TP_sleep2wake, 0.142857, tolerance = 0.0001) + expect_equal(out$TP_wake2sleep, 0.172414, tolerance = 0.0001) + + # day: Test ability to handle NA blocks as needed for g.part6 + # 9 is MVPA, 10 is IN + x = c(rep(10, 5), rep(9, 5), rep(10, 5), rep(9, 5), rep(NA, 15), + rep(9, 5), rep(10, 5), rep(9, 5), rep(10, 5), rep(NA, 15), + rep(10, 5), rep(9, 5), rep(10, 5), rep(9, 5)) + + out = g.fragmentation(frag.metrics = "all", + LEVELS = x, + Lnames = Lnames, mode = "day") + expect_equal(out$Nfrag_PA2IN, 4) # 4+epsilon/20+epsilon + expect_equal(out$Nfrag_IN2PA, 5) # 5+epsilon/25+epsilon + expect_equal(out$TP_PA2IN, 0.142857, tolerance = 0.0001) + expect_equal(out$TP_IN2PA, 0.172414, tolerance = 0.0001) + + + # same but now with more variation in segment length + x = c(rep(10, 4), rep(9, 6), rep(10, 9), rep(9, 2), rep(NA, 12), + rep(9, 8), rep(10, 5), rep(9, 3), rep(10, 20), rep(NA, 19), + rep(10, 3), rep(9, 2), rep(10, 7), rep(9, 7)) + + out = g.fragmentation(frag.metrics = "all", + LEVELS = x, + Lnames = Lnames, mode = "day") + expect_equal(out$Nfrag_PA2IN, 4) # 4+epsilon/20+epsilon + expect_equal(out$Nfrag_IN2PA, 5) # 5+epsilon/25+epsilon + expect_equal(out$TP_PA2IN, 0.153846, tolerance = 0.0001) + expect_equal(out$TP_IN2PA, 0.106383, tolerance = 0.0001) }) \ No newline at end of file diff --git a/tests/testthat/test_g.IVIS.R b/tests/testthat/test_g.IVIS.R index 85e80ab91..1e98bd1a8 100644 --- a/tests/testthat/test_g.IVIS.R +++ b/tests/testthat/test_g.IVIS.R @@ -2,30 +2,32 @@ library(GGIR) context("g.IVIS") test_that("g.IVIS returns expected output value without missing data", { set.seed(1234) - xx = rnorm(n = 1440) + xx = rnorm(n = 1440) * 1000 Xi1 = rep(xx, 7) Xi2 = rep(0:7, each = 1440) - T1 = g.IVIS(Xi1, epochsizesecondsXi = 60, IVIS_windowsize_minutes = 60, IVIS.activity.metric = 1, IVIS_per_daypair = FALSE) - T2 = g.IVIS(Xi2, epochsizesecondsXi = 60, IVIS_windowsize_minutes = 60, IVIS.activity.metric = 1, IVIS_per_daypair = FALSE) + T1 = g.IVIS(Xi1, 
epochSize = 60) + T2 = g.IVIS(Xi2, epochSize = 60) expect_equal(T1$InterdailyStability, 1, tolerance = 0.01) # every day is the same, so we expect IS = 1 expect_equal(T2$InterdailyStability, 0, tolerance = 0.01) # every day is entirely different, so IS = 0 - expect_equal(T1$IntradailyVariability, 1.97, tolerance = 0.05) #within the day there is variation, so IV>0 + expect_equal(T1$IntradailyVariability, 2.244335, tolerance = 0.05) #within the day there is variation, so IV>0 expect_equal(T2$IntradailyVariability, 0, tolerance = 0.01) #within the day there is no variation, so IV=0 - - + expect_equal(T1$phi, -0.1453851, tolerance = 0.001) + expect_equal(T2$phi, 0.9982517, tolerance = 0.001) }) test_that("g.IVIS returns expected output value with missing data", { set.seed(1234) - xx = rnorm(n = 1440) + xx = abs(rnorm(n = 1440)) Xi1 = rep(xx, 7) Xi2 = rep(0:7, each = 1440) is.na(Xi1[1441:2880]) = TRUE is.na(Xi2[1441:2880]) = TRUE - T3 = g.IVIS(Xi1, epochsizesecondsXi = 60, IVIS_windowsize_minutes = 60, IVIS.activity.metric = 1, IVIS_per_daypair = TRUE) - T4 = g.IVIS(Xi2, epochsizesecondsXi = 60, IVIS_windowsize_minutes = 60, IVIS.activity.metric = 1, IVIS_per_daypair = TRUE) + T3 = g.IVIS(Xi1, epochSize = 60) + T4 = g.IVIS(Xi2, epochSize = 60) expect_equal(T3$InterdailyStability, 1, tolerance = 0.01) # every day is the same, so we expect IS = 1 expect_equal(T4$InterdailyStability, 0, tolerance = 0.01) # every day is entirely different, so IS = 0 - expect_equal(T3$IntradailyVariability, 1.97, tolerance = 0.05) #within the day there is variation, so IV>0 + expect_equal(T3$IntradailyVariability, 2.51238, tolerance = 0.05) #within the day there is variation, so IV>0 expect_equal(T4$IntradailyVariability, 0, tolerance = 0.01) #within the day there is no variation, so IV=0 + expect_equal(T3$phi, -0.2577584, tolerance = 0.001) + expect_equal(T4$phi, 0.9972666, tolerance = 0.001) }) \ No newline at end of file diff --git a/tests/testthat/test_load_check_params.R b/tests/testthat/test_load_check_params.R index ea8e2df3f..9d149c0f7 100644 --- a/tests/testthat/test_load_check_params.R +++ b/tests/testthat/test_load_check_params.R @@ -14,8 +14,8 @@ test_that("load_params can load parameters", { expect_equal(length(params$params_sleep), 22) expect_equal(length(params$params_metrics), 41) expect_equal(length(params$params_rawdata), 39) - expect_equal(length(params$params_247), 22) - expect_equal(length(params$params_cleaning), 24) + expect_equal(length(params$params_247), 23) + expect_equal(length(params$params_cleaning), 26) expect_equal(length(params$params_phyact), 14) expect_equal(length(params$params_output), 22) expect_equal(length(params$params_general), 17) diff --git a/tests/testthat/test_part6.R b/tests/testthat/test_part6.R index 7a181d782..e5d428efc 100644 --- a/tests/testthat/test_part6.R +++ b/tests/testthat/test_part6.R @@ -16,22 +16,8 @@ test_that("Part 6 with household co-analysis", { save(M, file = paste0(dn2, "/meta_800-900-002_left wrist.bin.RData")) save(M, file = paste0(dn2, "/meta_800-900-003_left wrist.bin.RData")) - # Use time shifts to simulate three household members - data(data.ts) - mdat = data.ts - mdat$timenum = mdat$timenum - (5 * 60) - filename = "800-900-001_left wrist.bin" - save(mdat, filename, file = paste0(dn, "/800-900-001_left wrist.RData")) - mdat$timenum = mdat$timenum + (7 * 60) - filename = "800-900-002_left wrist.bin" - save(mdat, filename, file = paste0(dn, "/800-900-002_left wrist.RData")) - mdat$timenum = mdat$timenum + (14 * 60) - filename = "800-900-003_left 
wrist.bin" - save(mdat, filename, file = paste0(dn, "/800-900-003_left wrist.RData")) - - # Run household co-analysis - # Update parameters to align with dataset + # Update parameters to align with dataset params_general = load_params(topic = "general")$params_general params_general[["desiredtz"]] = "America/Curacao" params_phyact = load_params(topic = "phyact")$params_phyact @@ -42,6 +28,31 @@ test_that("Part 6 with household co-analysis", { params_phyact[["threshold.vig"]] = thresholds[3] params_247 = load_params(topic = "247")$params_247 params_247[["part6HCA"]] = TRUE + + # Use time shifts to simulate three household members + data(data.ts) + mdat = data.ts + + Lnames = c("spt_sleep", "spt_wake_IN", "spt_wake_LIG", "spt_wake_MOD", + "spt_wake_VIG", "day_IN_unbt", "day_LIG_unbt", "day_MOD_unbt", + "day_VIG_unbt", "day_MVPA_bts_10", "day_IN_bts_30", + "day_IN_bts_10_30", "day_LIG_bts_10") + + mdat$timenum = mdat$timenum - (5 * 60) + mdat$timestamp = as.POSIXct(mdat$timenum, tz = params_general[["desiredtz"]]) + filename = "800-900-001_left wrist.bin" + save(mdat, filename, Lnames, file = paste0(dn, "/800-900-001_left wrist.RData")) + mdat$timenum = mdat$timenum + (7 * 60) + mdat$timestamp = as.POSIXct(mdat$timenum, tz = params_general[["desiredtz"]]) + filename = "800-900-002_left wrist.bin" + save(mdat, filename, Lnames, file = paste0(dn, "/800-900-002_left wrist.RData")) + mdat$timenum = mdat$timenum + (14 * 60) + mdat$timestamp = as.POSIXct(mdat$timenum, tz = params_general[["desiredtz"]]) + filename = "800-900-003_left wrist.bin" + save(mdat, filename, Lnames, file = paste0(dn, "/800-900-003_left wrist.RData")) + mdat_file3 = mdat + + # Run household co-analysis g.part6(metadatadir = metadatadir, params_general = params_general, params_phyact = params_phyact, @@ -70,6 +81,15 @@ test_that("Part 6 with household co-analysis", { expect_equal(PC$wakeup_time1[c(1, 8, 16)], c("01:35:00", "02:58:00", "03:39:00")) expect_equal(sum(PC$day_Kappa_active), 5.091) + #=================================================================== + # For circadian rhythm analysis we set one window to invalid + mdat_file3$invalid_wakinghours[which(mdat_file3$window == 4)] = 100 # set one window to invalid + mdat_file3$invalid_sleepperiod[which(mdat_file3$window == 4)] = 100 # set one window to invalid + mdat_file3$invalid_fullwindow[which(mdat_file3$window == 4)] = 100 # set one window to invalid + mdat_file3$ACC[which(mdat$window == 3)] = NA # set one window to invalid + mdat = mdat_file3 + filename = "800-900-003_left wrist.bin" + save(mdat, filename, Lnames, file = paste0(dn, "/800-900-003_left wrist.RData")) # Run Circadian rhythm analysis with default window params_247[["part6HCA"]] = FALSE @@ -78,23 +98,24 @@ test_that("Part 6 with household co-analysis", { params_247[["cosinor"]] = TRUE params_247[["part6CR"]] = TRUE params_247[["part6Window"]] = c("start", "end") + params_cleaning = load_params(topic = "cleaning")$params_cleaning g.part6(metadatadir = metadatadir, params_general = params_general, params_phyact = params_phyact, params_247 = params_247, + params_cleaning = params_cleaning, + f0 = 1, f1 = 3, verbose = FALSE) - path_to_ms6 = paste0(metadatadir, "/meta/ms6.out/800-900-001_left wrist.RData") - + path_to_ms6 = paste0(metadatadir, "/meta/ms6.out/800-900-003_left wrist.RData") expect_true(file.exists(path_to_ms6)) load(path_to_ms6) - expect_equal(ncol(output_part6), 26) - expect_equal(output_part6$starttime, "2022-06-02 03:00:00") - expect_equal(output_part6$cosinor_mes, 2.451769, tolerance =
0.00001) - expect_equal(output_part6$cosinorExt_minimum, 1.288636, tolerance = 0.00001) - expect_equal(output_part6$cosinorExt_MESOR, 2.164644, tolerance = 0.00001) - expect_equal(sum(output_part6[5:25]), 327.5437, tolerance = 0.0001) - + expect_equal(ncol(output_part6), 48) + expect_equal(output_part6$starttime, "2022-06-02 03:16:00") + expect_equal(output_part6$cosinor_mes, 2.471082, tolerance = 0.00001) + expect_equal(output_part6$cosinorExt_minimum, 1.302779, tolerance = 0.00001) + expect_equal(output_part6$cosinorExt_MESOR, 2.171, tolerance = 0.00001) + expect_equal(sum(output_part6[5:27]), 620.6144, tolerance = 0.0001) # Run Circadian rhythm analysis with non-default window params_247[["part6Window"]] = c("W2", "W-1") # second wake till last wake @@ -102,19 +123,51 @@ test_that("Part 6 with household co-analysis", { params_general = params_general, params_phyact = params_phyact, params_247 = params_247, + params_cleaning = params_cleaning, + f0 = 1, f1 = 3, verbose = FALSE) - path_to_ms6 = paste0(metadatadir, "/meta/ms6.out/800-900-001_left wrist.RData") + path_to_ms6 = paste0(metadatadir, "/meta/ms6.out/800-900-003_left wrist.RData") expect_true(file.exists(path_to_ms6)) load(path_to_ms6) - expect_equal(ncol(output_part6), 26) - expect_equal(output_part6$starttime, "2022-06-03 01:41:00") - expect_equal(output_part6$cosinor_mes, 2.448485, tolerance = 0.00001) - expect_equal(output_part6$cosinorExt_minimum, 0, tolerance = 0.00001) - expect_equal(output_part6$cosinorExt_MESOR, 1.6977, tolerance = 0.00001) - expect_equal(sum(output_part6[5:25]), 154.1642, tolerance = 0.0001) + expect_equal(ncol(output_part6), 48) + expect_equal(output_part6$starttime, "2022-06-03 01:57:00") + expect_equal(output_part6$cosinor_mes, 2.431978, tolerance = 0.00001) + expect_equal(output_part6$cosinorExt_minimum, 1.286542, tolerance = 0.00001) + expect_equal(output_part6$cosinorExt_MESOR, 2.140128, tolerance = 0.00001) + expect_equal(sum(output_part6[5:27]), 680.9232, tolerance = 0.0001) + + output_part6_with_invalid = output_part6 + rm(output_part6) + # Assess impact of storing part 5 output without invalid + mdat = mdat_file3[-which(mdat$window == 4),] # delete the invalid window + filename = "800-900-003_left wrist.bin" + save(mdat, filename, Lnames, file = paste0(dn, "/800-900-003_left wrist.RData")) + + g.part6(metadatadir = metadatadir, + params_general = params_general, + params_phyact = params_phyact, + params_247 = params_247, + params_cleaning = params_cleaning, + f0 = 1, f1 = 3, + verbose = FALSE) + path_to_ms6 = paste0(metadatadir, "/meta/ms6.out/800-900-003_left wrist.RData") + + expect_true(file.exists(path_to_ms6)) + + load(path_to_ms6) + output_part6_without_invalid = output_part6 + + # Compare output_part6_with_invalid and output_part6_without_invalid + expect_equal(ncol(output_part6_with_invalid), ncol(output_part6_without_invalid)) + expect_equal(output_part6_with_invalid$starttime, output_part6_without_invalid$starttime) + expect_equal(output_part6_with_invalid$cosinor_mes, output_part6_without_invalid$cosinor_mes, tolerance = 0.00001) + expect_equal(output_part6_with_invalid$cosinorExt_minimum, output_part6_without_invalid$cosinorExt_minimum, tolerance = 0.00001) + expect_equal(output_part6_with_invalid$cosinorExt_MESOR, output_part6_without_invalid$cosinorExt_MESOR, tolerance = 0.00001) + expect_equal(sum(output_part6_with_invalid[5:27]), sum(output_part6_without_invalid[5:27]), tolerance = 0.0001) + # Remove test files if (file.exists(metadatadir)) unlink(metadatadir, recursive = TRUE) 
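  # Descriptive note on the block above: the pairwise expect_equal calls verify
  # that the part 6 circadian rhythm output is identical whether or not part 5
  # stored the invalid window, i.e. that results do not depend on the setting
  # of save_ms5raw_without_invalid.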
}
diff --git a/vignettes/GGIR.Rmd index ca3809e9b..99152bc0a 100644
--- a/vignettes/GGIR.Rmd
+++ b/vignettes/GGIR.Rmd
@@ -140,7 +140,7 @@ additional packages:
- If you want to derive Neishabouricounts (with `do.neishabouricounts = TRUE`), install the actilifecounts package with `install.packages("actilifecounts")`
- If you want to derive circadian rhythm indicators using the [Cosinor analysis and Extended Cosinor analysis]
- (with `cosinor = TRUE`), install the ActCR package with `install.packages("ActCR")`
+ (with `cosinor = TRUE` for part 2; in part 6 it is always performed), install the ActCR package with `install.packages("ActCR")`
## Prepare folder structure
@@ -444,7 +444,6 @@ This section has been migrated to this [section](https://wadpac.github.io/GGIR/a
This section has been migrated to this [section](https://wadpac.github.io/GGIR/articles/GGIRoutput.html) in the GGIR github-pages, which is now the main documentation resource for GGIR.
-
### Day level summary
This section has been migrated to this [section](https://wadpac.github.io/GGIR/articles/GGIRoutput.html) in the GGIR github-pages, which is now the main documentation resource for GGIR.
@@ -481,6 +480,28 @@ This section has been migrated to this [section](https://wadpac.github.io/GGIR/a
This section has been migrated to this [section](https://wadpac.github.io/GGIR/articles/GGIRoutput.html) in the GGIR github-pages, which is now the main documentation resource for GGIR.
+
+## Output part 6 {.tabset}
+
+When `part6CR = TRUE` and the vector specified for parameter `do.report` includes the number 6, a csv report will be stored with the following variables:
+
+
+| Variable name | Description |
|-----------------------------|----------------------------------------------|
+| ID | Participant ID extracted from file |
+| starttime | Start timestamp for the data as used for the analysis |
+| filename | File name |
+| N_days | Number of days in the data as used for the analysis |
+| N_valid_days | Number of valid days in the data as used for the analysis, defined as the cumulative number of epochs expressed in days |
+| `cosinor_` | Cosinor analysis estimates such as mes, amp, acrophase, and acrotime, as documented in the [ActCR](https://CRAN.R-project.org/package=ActCR) package. |
+| `cosinorExt_` | Extended Cosinor analysis estimates such as minimum, amp, alpha, beta, acrotime, UpMesor, DownMesor, MESOR, and F_pseudo, as documented in the [ActCR](https://CRAN.R-project.org/package=ActCR) package.|
+| `IV` | Intradaily Variability (IV) after Eus J. W. Van Someren (Chronobiology International. 1999. Volume 16, issue 4). Implementation described in Ian Meneghel Danilevicz et al. https://doi.org/10.21203/rs.3.rs-3543711/v1 |
+| `IS` | Interdaily Stability (IS) after Eus J. W. Van Someren (Chronobiology International. 1999. Volume 16, issue 4). Implementation described in Ian Meneghel Danilevicz et al. https://doi.org/10.21203/rs.3.rs-3543711/v1 |
+| phi | Indicator of auto-correlation in the acceleration time series. Implementation described in Ian Meneghel Danilevicz et al. https://doi.org/10.21203/rs.3.rs-3543711/v1 |
+| SSP | Only created when `part6DFA = TRUE`. Self-Similarity parameter as part of the Detrended Fluctuation Analysis. Implementation described in Ian Meneghel Danilevicz et al. https://doi.org/10.21203/rs.3.rs-3543711/v1. |
+| ABI | Only created when `part6DFA = TRUE`. Activity Balance Index. Implementation described in Ian Meneghel Danilevicz et al.
https://doi.org/10.21203/rs.3.rs-3543711/v1.|
+
+
# Motivation and clarification
In this chapter we will try to collect motivations and clarification
@@ -676,7 +697,6 @@ I wanted a short name and not to spend too much time finding it. The abbreviatio
This section has been migrated to this [section](https://wadpac.github.io/GGIR/articles/chapter13_CircadianRhythm.html) in the GGIR github-pages, which is now the main documentation resource for GGIR.
-
## ActiGraph's idle sleep mode
The idle sleep mode is [explained](https://actigraphcorp.my.site.com/support/s/article/Idle-Sleep-Mode-Explained)
diff --git a/vignettes/GGIRParameters.Rmd index 390c899f0..2ab911f38 100644
--- a/vignettes/GGIRParameters.Rmd
+++ b/vignettes/GGIRParameters.Rmd
@@ -206,6 +206,7 @@ find a description and default value for all the arguments.
| save_ms5rawlevels | 5 | params_output |
| part5_agg2_60seconds | 5 | params_general |
| includedaycrit.part5 | 5 | params_cleaning |
+| includenightcrit.part5 | 5 | params_cleaning |
| frag.metrics | 5 | params_phyact |
| LUXthresholds | 5 | params_247 |
| LUX_cal_constant | 5 | params_247 |
@@ -215,6 +216,12 @@ find a description and default value for all the arguments.
| save_ms5raw_format | 5 | params_output |
| save_ms5raw_without_invalid | 5 | params_output |
| do.sibreport | 5 | params_output |
+| includecrit.part6 | 6 | params_cleaning |
+| part6_threshold_combi | 6 | params_phyact |
+| part6CR | 6 | params_247 |
+| part6HCA | 6 | params_247 |
+| part6Window | 6 | params_247 |
+| part6DFA | 6 | params_247 |
| require_complete_lastnight_part5 | 5 | params_output |
| visualreport_without_invalid| visualreport | params_output |
| dofirstpage | visualreport | params_output |
diff --git a/vignettes/GGIRoutput.Rmd index a9cdc8b8b..27ac5afe9 100644
--- a/vignettes/GGIRoutput.Rmd
+++ b/vignettes/GGIRoutput.Rmd
@@ -13,23 +13,18 @@ vignette: >
%\VignetteEngine{knitr::rmarkdown}
---
-
```{r, echo=FALSE, out.width = "60%", out.extra='style="border: 0; padding:20px"', fig.alt="GGIR logo"}
knitr::include_graphics("GGIR-MASTERLOGO-RGB.png")
```
-See also complementary vignettes on:
-[General introduction to GGIR](https://cran.r-project.org/package=GGIR/vignettes/GGIR.html), [Day segment analyses](https://CRAN.R-project.org/package=GGIR/vignettes/TutorialDaySegmentAnalyses.html), [GGIR parameters](https://CRAN.R-project.org/package=GGIR/vignettes/GGIRParameters.html), [Embedding external functions (pdf)](https://CRAN.R-project.org/package=GGIR/vignettes/ExternalFunction.html), and [Reading ad-hoc csv file formats](https://CRAN.R-project.org/package=GGIR/vignettes/readmyacccsv.html).
-
+See also complementary vignettes on: [General introduction to GGIR](https://cran.r-project.org/package=GGIR/vignettes/GGIR.html), [Day segment analyses](https://CRAN.R-project.org/package=GGIR/vignettes/TutorialDaySegmentAnalyses.html), [GGIR parameters](https://CRAN.R-project.org/package=GGIR/vignettes/GGIRParameters.html), [Embedding external functions (pdf)](https://CRAN.R-project.org/package=GGIR/vignettes/ExternalFunction.html), and [Reading ad-hoc csv file formats](https://CRAN.R-project.org/package=GGIR/vignettes/readmyacccsv.html).
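The new part 6 parameters documented above are all passed through the regular `GGIR()` entry point. As a quick orientation, a minimal sketch of a run that would produce the part 6 csv report described earlier; the folder paths are hypothetical and all other settings are left at their defaults, so treat this as an illustration rather than a prescribed configuration:

```{r, eval = FALSE}
library(GGIR)
GGIR(mode = 1:6,                       # run GGIR parts 1 through 6
     datadir = "/data/myaccfiles",     # hypothetical folder with raw accelerometer files
     outputdir = "/data/myresults",    # hypothetical folder for the GGIR output
     part6CR = TRUE,                   # enable circadian rhythm analysis in part 6
     part6Window = c("start", "end"),  # analyse from recording start to recording end
     part6DFA = TRUE,                  # also derive SSP and ABI via DFA
     do.report = c(2, 4, 5, 6))        # include the part 6 csv report
```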
GGIR generates the following types of output: -- csv-spreadsheets with all the variables you need for physical activity, sleep and circadian -rhythm research -- Pdfs with on each page a low resolution plot of the data per file and quality indicators -- R objects with milestone data -- Pdfs with a visual summary of the physical activity and sleep patterns as identified (see example below) - +- csv-spreadsheets with all the variables you need for physical activity, sleep and circadian rhythm research +- Pdfs with on each page a low resolution plot of the data per file and quality indicators +- R objects with milestone data +- Pdfs with a visual summary of the physical activity and sleep patterns as identified (see example below) # GGIR Part 1 @@ -41,126 +36,141 @@ Part 2 generates the following output: - part2_summary.csv: Person level summary (see below) - part2_daysummary.csv: Day level summary (see below) -- QC/data_quality_report.csv: Overview of calibration results and - whether or not a file was corrupt or too short to be processed, -- QC/plots to check data quality 1.pdf: A pdf with visualisation of - the acceleration time series in 15 minute resolution and with - invalid data segments highlighted in colours (yellow: non-wear based - on standard deviation threshold, brown: non-wear after extra - filtering step (introduced in 2013), and purple: clipping) +- QC/data_quality_report.csv: Overview of calibration results and whether or not a file was corrupt or too short to be processed, +- QC/plots to check data quality 1.pdf: A pdf with visualisation of the acceleration time series in 15 minute resolution and with invalid data segments highlighted in colours (yellow: non-wear based on standard deviation threshold, brown: non-wear after extra filtering step (introduced in 2013), and purple: clipping) ## Person level summary (csv) -| (Part of) variable name | Description | -|-------------------------|-----------------------------------------------| -| ID | Participant id | -| device_sn | Device serial number | -| bodylocation | Body location extracted from file header | -| filename | Name of the data file| -| start_time | Timestamp when recording started | -| startday | Day of the week on which recording started | -| samplefreq | Sample frequency (Hz)| -| device | Accelerometer brand, e.g. GENEACtiv | -| clipping_score | The Clipping score: Fraction of 15 minute windows per file for which the acceleration in one of the three axis was close to the maximum for at least 80% of the time. This should be 0. | -| meas_dur_dys | Measurement duration (days) | -| complete_24hcycle | Completeness score: Fraction of 15 minute windows per 24 hours for which no valid data is available at any day of the measurement. | -| meas_dur_def_proto_day | measurement duration according to protocol (days): Measurement duration (days) minus the hours that are ignored at the beginning and end of the measurement motivated by protocol design | -| wear_dur_def_proto_day | wear duration duration according to protocol (days): So, if the protocol was seven days of measurement, then wearing the accelerometer for 8 days and recording data for 8 days will still make that the wear duration is 7 days | -| calib_err | Calibration error (static estimate) Estimated based on all 'non-movement' periods in the measurement after applying the autocalibration. 
| -| calib_status | Calibration status: Summary statement about the status of the calibration error minimisation | -| ENMO_fullRecordingMean | ENMO is the main summary measure of acceleration. The value presented is the average ENMO over all the available data normalised per 24-hour cycles (diurnal balanced), with invalid data imputed by the average at similar time points on different days of the week. In addition to ENMO it is possible to extract other acceleration metrics (i.e. BFEN, HFEN, HFENplus). We emphasize that it is calculated over the full recording because the alternative is that a variable is only calculated overmeasurement days with sufficient valid hours of data. | -| ENMO | (only available if set to true in part1.R) ENMO is the main summary measure of acceleration. The value presented is the average ENMO over all the available data normalised per 24 hour cycles, with invalid data imputed by the average at similar timepoints on different days of the week. In addition to ENMO it is possible to extract other acceleration metrics in part1.R (i.e. BFEN, HFEN, HFENplus) See also [van Hees PLoSONE April 2013](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0061691) for a detailed description and comparison of these techniques. | -| pX_A\_mg\_0-24h_fullRecording | This variable represents the Xth percentile in the distribution of short epoch metric value A of the average day. The average day may not be ideal for describing the distribution. Therefore, the code also extracts the following variable. | -| AD_pX_A\_mg_0-24h | This variable represents the Xth percentile in the distribution of short epoch metric value A per day averaged across all days. | -| L5_A\_mg_0-24 | Average of metric A during the least active five\* hours in the day that is the lowest rolling average value of metric A. (\* window size is modifiable by argument `winhr`) | -| M5_A\_mg_0-24 | Average of metric A during the most active five\* hours in the day that is the lowest rolling average value of metric A. (\* window size is modifiable by argument `winhr`) | -| L5hr_A\_mg_0-24 | Starting time in hours and fractions of hours of L5_A\_mg_0-24, where hours below 12 are incremented with 24 to create a continuous scale throughout the night (e.g. 36 = 6am) in line with numeric timeing of sleep variables in GGIR part 4 output. | -| M5hr_A\_mg_0-24 | Starting time in hours and fractions of hours of M5_A\_mg_0-24 | -| ig_gradient_ENMO_0 -24hr_fullRecording | Intensity gradient calculated over the full recording. | -| 1to6am_ENMO_mg | Average metric value ENMO between 1am and 6am | -| N valid WEdays | Number of valid weekend days| -| N valid WKdays | Number of valid week days | -| IS\_interdailystability | inter daily stability. The movement count that is derived for this was an attempt to follow the original approach by Eus J. W. Van Someren (Chronobiology International. 1999. Volume 16, issue 4). | -| IV_in tradailyvariability | intra daily variability. In contrast to the original paper, we ignore the epoch transitions between the end of a day and the beginning of the next day for the numerator of the equation, this to make it a true measure of intradaily variability. Same note as for IS: The movement count that is derived for this was an attempt to follow the original approach. 
|
-| IVIS\_windowsize_minutes | Sizes of the windows based on which IV and IS are calculated (note that this is modifiable) |
-| IVIS_epochsize_seconds | Argument has been deprecated |
+| (Part of) variable name | Description |
+|--------------------------|----------------------------------------------|
+| ID | Participant id |
+| device_sn | Device serial number |
+| bodylocation | Body location extracted from file header |
+| filename | Name of the data file |
+| start_time | Timestamp when recording started |
+| startday | Day of the week on which recording started |
+| samplefreq | Sample frequency (Hz) |
+| device | Accelerometer brand, e.g. GENEACtiv |
+| clipping_score | The Clipping score: Fraction of 15 minute windows per file for which the acceleration in one of the three axes was close to the maximum for at least 80% of the time. This should be 0. |
+| meas_dur_dys | Measurement duration (days) |
+| complete_24hcycle | Completeness score: Fraction of 15 minute windows per 24 hours for which no valid data is available at any day of the measurement. |
+| meas_dur_def_proto_day | measurement duration according to protocol (days): Measurement duration (days) minus the hours that are ignored at the beginning and end of the measurement motivated by protocol design |
+| wear_dur_def_proto_day | wear duration according to protocol (days): So, if the protocol was seven days of measurement, then wearing the accelerometer for 8 days and recording data for 8 days will still make that the wear duration is 7 days |
+| calib_err | Calibration error (static estimate) Estimated based on all 'non-movement' periods in the measurement after applying the autocalibration. |
+| calib_status | Calibration status: Summary statement about the status of the calibration error minimisation |
+| ENMO_fullRecordingMean | ENMO is the main summary measure of acceleration. The value presented is the average ENMO over all the available data normalised per 24-hour cycles (diurnal balanced), with invalid data imputed by the average at similar time points on different days of the week. In addition to ENMO it is possible to extract other acceleration metrics (i.e. BFEN, HFEN, HFENplus). We emphasize that it is calculated over the full recording because the alternative is that a variable is only calculated over measurement days with sufficient valid hours of data. |
+| ENMO | (only available if set to true in part1.R) ENMO is the main summary measure of acceleration. The value presented is the average ENMO over all the available data normalised per 24 hour cycles, with invalid data imputed by the average at similar timepoints on different days of the week. In addition to ENMO it is possible to extract other acceleration metrics in part1.R (i.e. BFEN, HFEN, HFENplus) See also [van Hees PLoSONE April 2013](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0061691) for a detailed description and comparison of these techniques. |
+| pX_A_mg_0-24h_fullRecording | This variable represents the Xth percentile in the distribution of short epoch metric value A of the average day. The average day may not be ideal for describing the distribution. Therefore, the code also extracts the following variable. |
+| AD_pX_A_mg_0-24h | This variable represents the Xth percentile in the distribution of short epoch metric value A per day averaged across all days. |
+| L5_A_mg_0-24 | Average of metric A during the least active five\* hours in the day that is the lowest rolling average value of metric A. (\* window size is modifiable by argument `winhr`) |
+| M5_A_mg_0-24 | Average of metric A during the most active five\* hours in the day that is the highest rolling average value of metric A. (\* window size is modifiable by argument `winhr`) |
+| L5hr_A_mg_0-24 | Starting time in hours and fractions of hours of L5_A_mg_0-24, where hours below 12 are incremented with 24 to create a continuous scale throughout the night (e.g. 36 = 6am) in line with numeric timing of sleep variables in GGIR part 4 output. |
+| M5hr_A_mg_0-24 | Starting time in hours and fractions of hours of M5_A_mg_0-24 |
+| ig_gradient_ENMO_0-24hr_fullRecording | Intensity gradient calculated over the full recording. |
+| 1to6am_ENMO_mg | Average metric value ENMO between 1am and 6am |
+| N valid WEdays | Number of valid weekend days |
+| N valid WKdays | Number of valid week days |
+| IS | inter daily stability. The movement count that is derived for this was an attempt to follow the original approach by Eus J. W. Van Someren (Chronobiology International. 1999. Volume 16, issue 4). |
+| IV | intra daily variability. In contrast to the original paper, we ignore the epoch transitions between the end of a day and the beginning of the next day for the numerator of the equation, this to make it a true measure of intradaily variability. Same note as for IS: The movement count that is derived for this was an attempt to follow the original approach. |
+| IVIS_windowsize_minutes | Sizes of the windows based on which IV and IS are calculated (note that this is modifiable) |
+| IVIS_epochsize_seconds | Argument has been deprecated |
| `AD_` | All days (plain average of all available days, no weighting). The variable was calculated per day and then averaged over all the available days |
-| `WE_` | Weekend days (plain average of all available days, no weighting). The variable was calculated per day and then averaged over weekend days only |
-| `WD_` | Week days (plain average of all available days, no weighting). The variable was calculated per day and then averaged over week days only |
-| `WWE_`| Weekend days (weighted average) The variable was calculated per day and then averaged over weekend days. Double weekend days are averaged. This is only relevant for experiments that last for more than seven days. |
-| `WWD_`| Week days (weighted average) The variable was calculated per day and then averaged over week days. Double week days were averaged. This is only relevant for experiments that last for more than seven days) |
-| WWD\_MVPA_E5S_T100_ENMO | Time spent in moderate-to-vigorous based on 5 second epoch size and an ENMO metric threshold of 100 |
-| `WWE_MVPA_E5S_B1M80%_T100_ENMO`| Time spent in moderate-to-vigorous based on a bout criteria of at least 1 minute where 80% or more of the 5 second epochs are expected to meet the threshold criteria of of 100 mg based on acceleration metric ENMO (threshold is specified with parameter `mvpathreshold`) |
-| `WE_[100, 150)_mg_0-24h_ENMO` | Time spent between (and including) 100 mg and 150 (excluding 150 itself) between 0 and 24 hours (the full day) using metric ENMO data exclusion data_masking_strategy (value=1, ignore specific hours; value=2, ignore all data before the first midnight and after the last midnight) |
-| `_MVPA_E5S_B1M80_T100` | MVPA calculated based on 5 second epoch setting bout duration 1 Minute and inclusion criterion of more than 80 percent. |
-| `_ENMO_mg` | ENMO or other metric was first calculated per day and then average according to AD, WD, WWE, WWD |
-| data exclusion data_masking_strategy | A log of the decision made when calling g.impute: value=1 mean ignore specific hours; value=2 mean ignore all data before the first midnight and after the last midnight |
-| n hours ignored at start of meas (if data_masking_strategy=1, 3 or 5) | number of hours ignored at the start of the measurement (if data_masking_strategy = 1) or at the start of the `ndayswindow` (if data_masking_strategy = 3 or 5) A log of decision made in part2.R |
-| n hours ignored at end of meas (if data_masking_strategy=1, 3 or 5) | number of hours ignored at the end of the measurement (if data_masking_strategy = 1) or at the end of the `ndayswindow` (if data_masking_strategy = 3 or 5). A log of decision made in part2.R |
-| n hours ignored at end of meas (if data_masking_strategy = 1, 3 or 5) | number of days of measurement after which all data is ignored (if data_masking_strategy = 1, 3 or 5) A log of decision made in part2.R |
-| epoch size to which acceleration was averaged (seconds) | A log of decision made in part1.R |
-| pdffilenumb | Indicator of in which pdf-file the plot was stored |
-| pdfpagecount | Indicator of in which pdf-page the plot was stored|
-| `cosinor_` | Cosinor analysis estimates such as mes, amp, acrophase, and acrotime, as documented in the [ActCR](https://CRAN.R-project.org/package=ActCR) package. |
-| `cosinorExt_` | Extended Cosinor analysis estimates such as minimum, amp, alpha, beta, acrotime, UpMesor, DownMesor, MESOR, and F_pseudo, as documented in the [ActCR](https://CRAN.R-project.org/package=ActCR) package. |
-| `cosinorIV` | Cosinor analysis compatible estimate of the Intradaily Variability (IV) |
-| `cosinorIS` | Cosinor analysis compatible estimate of Interdaily Stability (IS) |
+| `WE_` | Weekend days (plain average of all available days, no weighting). The variable was calculated per day and then averaged over weekend days only |
+| `WD_` | Week days (plain average of all available days, no weighting). The variable was calculated per day and then averaged over week days only |
+| `WWE_` | Weekend days (weighted average) The variable was calculated per day and then averaged over weekend days. Double weekend days are averaged. This is only relevant for experiments that last for more than seven days. |
+| `WWD_` | Week days (weighted average) The variable was calculated per day and then averaged over week days. Double week days were averaged. This is only relevant for experiments that last for more than seven days. |
+| WWD_MVPA_E5S_T100_ENMO | Time spent in moderate-to-vigorous based on 5 second epoch size and an ENMO metric threshold of 100 |
+| `WWE_MVPA_E5S_B1M80%_T100_ENMO` | Time spent in moderate-to-vigorous based on a bout criteria of at least 1 minute where 80% or more of the 5 second epochs are expected to meet the threshold criteria of 100 mg based on acceleration metric ENMO (threshold is specified with parameter `mvpathreshold`) |
+| `WE_[100, 150)_mg_0-24h_ENMO` | Time spent between (and including) 100 mg and 150 (excluding 150 itself) between 0 and 24 hours (the full day) using metric ENMO data exclusion data_masking_strategy (value=1, ignore specific hours; value=2, ignore all data before the first midnight and after the last midnight) |
+| `_MVPA_E5S_B1M80_T100` | MVPA calculated based on 5 second epoch setting bout duration 1 Minute and inclusion criterion of more than 80 percent. |
+| `_ENMO_mg` | ENMO or other metric was first calculated per day and then averaged according to AD, WD, WWE, WWD |
+| data exclusion data_masking_strategy | A log of the decision made when calling g.impute: value=1 means ignore specific hours; value=2 means ignore all data before the first midnight and after the last midnight |
+| n hours ignored at start of meas (if data_masking_strategy=1, 3 or 5) | number of hours ignored at the start of the measurement (if data_masking_strategy = 1) or at the start of the `ndayswindow` (if data_masking_strategy = 3 or 5) A log of decision made in part2.R |
+| n hours ignored at end of meas (if data_masking_strategy=1, 3 or 5) | number of hours ignored at the end of the measurement (if data_masking_strategy = 1) or at the end of the `ndayswindow` (if data_masking_strategy = 3 or 5). A log of decision made in part2.R |
+| n hours ignored at end of meas (if data_masking_strategy = 1, 3 or 5) | number of days of measurement after which all data is ignored (if data_masking_strategy = 1, 3 or 5) A log of decision made in part2.R |
+| epoch size to which acceleration was averaged (seconds) | A log of decision made in part1.R |
+| pdffilenumb | Indicator of in which pdf-file the plot was stored |
+| pdfpagecount | Indicator of in which pdf-page the plot was stored |
+| `cosinor_` | Cosinor analysis estimates such as mes, amp, acrophase, and acrotime, as documented in the [ActCR](https://CRAN.R-project.org/package=ActCR) package. |
+| `cosinorExt_` | Extended Cosinor analysis estimates such as minimum, amp, alpha, beta, acrotime, UpMesor, DownMesor, MESOR, and F_pseudo, as documented in the [ActCR](https://CRAN.R-project.org/package=ActCR) package. |
+| `cosinorIV` | Cosinor analysis compatible estimate of the Intradaily Variability (IV) |
+| `cosinorIS` | Cosinor analysis compatible estimate of Interdaily Stability (IS) |
+| cosinor_timeOffsetHours | Offset between the start of the time series as used and midnight |
+| cosinor_mes | The mean value of the cosinor function fitted to the log transformed acceleration signal. Higher values indicate more activity. |
+| cosinor_amp | The amplitude of the cosinor function fitted to the log transformed acceleration signal, corresponding to the peak of the function minus the mesor. Higher values indicate larger amplitude in the rhythm |
+| cosinor_acrotime | Time at which the cosinor function fitted to the log transformed acceleration signal reaches its maximum |
+| cosinor_acrophase | Acrotime expressed in radians |
+| cosinor_ndays | Number of days used for the cosinor analysis. |
+| cosinor_R2 | Measure of goodness of fit of the cosinor function to the log transformed acceleration signal. Higher values indicate better goodness of fit. |
+| cosinorExt_minimum | (copied from ActCR documentation) Minimum value of the function |
+| cosinorExt_amp | (copied from ActCR documentation) Amplitude, a measure of half the extent of predictable variation within a cycle. This represents the highest activity one can achieve. |
+| cosinorExt_alpha | (copied from ActCR documentation) It determines whether the peaks of the curve are wider than the troughs: when alpha is small, the troughs are narrow and the peaks are wide; when alpha is large, the troughs are wide and the peaks are narrow. |
+| cosinorExt_beta | (copied from ActCR documentation) It determines whether the transformed function rises and falls more steeply than the cosine curve: large values of beta produce curves that are nearly square waves. |
+| cosinorExt_acrotime | (copied from ActCR documentation) acrophase is the time of day of the peak in the unit of the time (hours) |
+| cosinorExt_UpMesor | (copied from ActCR documentation) Time of day of switch from low to high activity. Represents the timing of the rest-activity rhythm. Lower (earlier) values indicate increase in activity earlier in the day and suggest a more advanced circadian phase. |
+| cosinorExt_DownMesor | (copied from ActCR documentation) Time of day of switch from high to low activity. Represents the timing of the rest-activity rhythm. Lower (earlier) values indicate decline in activity earlier in the day, suggesting a more advanced circadian phase. |
+| cosinorExt_MESOR | (copied from ActCR documentation) A measure analogous to the MESOR of the cosine model (or half the deflection of the curve) can be obtained from mes=min+amp/2. However, it goes through the middle of the peak, and is therefore not equal to the MESOR of the cosine model, which is the mean of the data |
+| cosinorExt_ndays | (copied from ActCR documentation) Number of days modeled. |
+| cosinorExt_F_pseudo | (copied from ActCR documentation) Measure the improvement of the fit obtained by the non-linear estimation of the transformed cosine model |
+| cosinorExt_R2 | Measure of goodness of fit of the extended cosinor function to the log transformed acceleration signal. Higher values indicate better goodness of fit. |
+| phi | Indicator of auto-correlation in acceleration time series over multiple days, see [chapter 13](https://wadpac.github.io/GGIR/articles/chapter13_CircadianRhythm.html) for details. |
+| SSP | Method for describing time series, see [chapter 13](https://wadpac.github.io/GGIR/articles/chapter13_CircadianRhythm.html) for details. |
+| ABI | ABI measures how the activity over the observed period is balanced, see [chapter 13](https://wadpac.github.io/GGIR/articles/chapter13_CircadianRhythm.html) for details. |
## Day level summary (csv)
-This is a non-exhaustive list, because most concepts have been explained
-in summary.csv
+This is a non-exhaustive list, because most concepts have been explained in summary.csv.
-| (Part of) variable name | Description |
+| (Part of) variable name | Description |
|--------------------------|----------------------------------------------|
-| ID| Participant id |
-| filename | Name of the data file |
-| calender_date | Timestamp and date on which measurement started |
-| bodylocation | Location of the accelerometer as extracted from file header |
-| N valid hours | Number of hours with valid data in the day |
-| N hours | Number of hours of measurement in a day, which typically is 24, unless it is a day on which the clock changes (DST) resulting in 23 or 25 hours. The value can be less than 23 if the measurement started or ended this day |
-| weekday | Name of weekday |
-| measurement | Day of measurement Day number relative to start of the measurement |
-| L5hr_ENMO_mg_0-24h | Hour on which L5 starts for these 24 hours (defined with metric ENMO) |
-| L5_ENMO_mg_0-24h | Average acceleration for L5 (defined with metric ENMO) |
-| `[A,B)_mg_0-24h_ENMO` | Time spent in minutes between (and including) acceleration value A in mg and (excluding) acceleration value B in mg based on metric ENMO |
-| ig\_gradient_ENMO_0-24hr | Gradient from intensity gradient analysis proposed by [Rowlands et al. 2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) based on metric ENMO for the time segment 0 to 24 hours |
-| ig\_intercept_ENMO_0-24hr | Intercept from intensity gradient analysis proposed by [Rowlands et al. 2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) based on metric ENMO for the time segment 0 to 24 hours |
-| ig\_rsquared_ENMO_0-24hr | r squared from intensity gradient analysis proposed by [Rowlands et al. 2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) based on metric ENMO for the time segment 0 to 24 hours |
+| ID | Participant id |
+| filename | Name of the data file |
+| calender_date | Timestamp and date on which measurement started |
+| bodylocation | Location of the accelerometer as extracted from file header |
+| N valid hours | Number of hours with valid data in the day |
+| N hours | Number of hours of measurement in a day, which typically is 24, unless it is a day on which the clock changes (DST) resulting in 23 or 25 hours. The value can be less than 23 if the measurement started or ended this day |
+| weekday | Name of weekday |
+| measurement | Day number relative to the start of the measurement |
+| L5hr_ENMO_mg_0-24h | Hour on which L5 starts for these 24 hours (defined with metric ENMO) |
+| L5_ENMO_mg_0-24h | Average acceleration for L5 (defined with metric ENMO) |
+| `[A,B)_mg_0-24h_ENMO` | Time spent in minutes between (and including) acceleration value A in mg and (excluding) acceleration value B in mg based on metric ENMO |
+| ig_gradient_ENMO_0-24hr | Gradient from intensity gradient analysis proposed by [Rowlands et al. 2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) based on metric ENMO for the time segment 0 to 24 hours |
+| ig_intercept_ENMO_0-24hr | Intercept from intensity gradient analysis proposed by [Rowlands et al. 2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) based on metric ENMO for the time segment 0 to 24 hours |
+| ig_rsquared_ENMO_0-24hr | r squared from intensity gradient analysis proposed by [Rowlands et al. 2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) based on metric ENMO for the time segment 0 to 24 hours |
## Data_quality_report (csv)
The data_quality_report.csv is stored in subfolder results/QC.
| (Part of) variable name | Description |
|--------------------------|----------------------------------------------|
-| filename | file name |
-| file.corrupt | Is file corrupt? TRUE or FALSE (mainly tested for GENEActiv bin files) |
-| file.too.short | File too short for processing? ([definition](#Minimum_recording_duration)) TRUE or FALSE |
-| use.temperature | Temperature used for auto-calibration? TRUE or FALSE |
-| scale.x | Auto-calibration scaling coefficient for x-axis (same for y and z axis, not shown here) |
-| offset.x | Auto-calibration offset coefficient for x-axis (same for y and z axis, not shown here) |
+| filename | file name |
+| file.corrupt | Is file corrupt? TRUE or FALSE (mainly tested for GENEActiv bin files) |
+| file.too.short | File too short for processing?
([definition](#Minimum_recording_duration)) TRUE or FALSE | +| use.temperature | Temperature used for auto-calibration? TRUE or FALSE | +| scale.x | Auto-calibration scaling coefficient for x-axis (same for y and z axis, not shown here) | +| offset.x | Auto-calibration offset coefficient for x-axis (same for y and z axis, not shown here) | | temperature.offset.x | Auto-calibration temperature offset coefficient for x-axis (same for y and z axis, not shown here) | -| cal.error.start | Calibration error prior to auto-calibration | -| cal.error.end | Calibration error after auto-calibration | -| n.10sec.windows | Number of 10 second epochs used as sphere data in auto-calibration | +| cal.error.start | Calibration error prior to auto-calibration | +| cal.error.end | Calibration error after auto-calibration | +| n.10sec.windows | Number of 10 second epochs used as sphere data in auto-calibration | | n.hours.considered | Number of hours of data considered for auto-calibration | -| QCmessage | Character QC message at the end of the auto-calibration | -| mean.temp | Mean temperature in sphere data | -| device.serial.number | Device serial number | -| NFilePagesSkipped | (Only for Axivity .cwa format) Number of raw data blocks skipped | +| QCmessage | Character QC message at the end of the auto-calibration | +| mean.temp | Mean temperature in sphere data | +| device.serial.number | Device serial number | +| NFilePagesSkipped | (Only for Axivity .cwa format) Number of raw data blocks skipped | | filehealth_totimp_min | (Only for Axivity .cwa, ActiGraph gt3x, and ad-hoc csv format) Total number of minutes of raw data imputed | -| filehealth_checksumfail_min | (Only for Axivity .cwa format) Total number of minutes of raw data where the checksum failed | -| filehealth_niblockid_min | (Only for Axivity .cwa format) Total number of minutes of raw data with non-incremental block ids | +| filehealth_checksumfail_min | (Only for Axivity .cwa format) Total number of minutes of raw data where the checksum failed | +| filehealth_niblockid_min | (Only for Axivity .cwa format) Total number of minutes of raw data with non-incremental block ids | | filehealth_fbias0510_min | (Only for Axivity .cwa format) Total number of minutes with a sampling frequency bias between 5 and 10% | | filehealth_fbias1020_min | (Only for Axivity .cwa format) Total number of minutes with a sampling frequency bias between 10 and 20% | -| filehealth_fbias2030_min | (Only for Axivity .cwa format) Total number of minutes with a sampling frequency bias between 20 and 30% | +| filehealth_fbias2030_min | (Only for Axivity .cwa format) Total number of minutes with a sampling frequency bias between 20 and 30% | | filehealth_fbias30_min | (Only for Axivity .cwa format) Total number of minutes with a sampling frequency bias higher than 30% | | filehealth_totimp_N | (Only for Axivity .cwa, ActiGraph gt3x, and ad-hoc csv format) Total number of data blocks that were imputed | -| filehealth_checksumfail_N | (Only for Axivity .cwa format) Total number of blocks where the checksum failed | -| filehealth_niblockid_N | (Only for Axivity .cwa format) Total number of raw data blocks with non-incremental block ids | +| filehealth_checksumfail_N | (Only for Axivity .cwa format) Total number of blocks where the checksum failed | +| filehealth_niblockid_N | (Only for Axivity .cwa format) Total number of raw data blocks with non-incremental block ids | | filehealth_fbias0510_N | (Only for Axivity .cwa format) Total number of raw data blocks with a sampling 
frequency bias between 5 and 10% |
+| filehealth_fbias1020_N | (Only for Axivity .cwa format) Total number of raw data blocks with a sampling frequency bias between 10 and 20% |
| filehealth_fbias2030_N | (Only for Axivity .cwa format) Total number of raw data blocks with a sampling frequency bias between 20 and 30% |
| filehealth_fbias30_N | (Only for Axivity .cwa format) Total number of raw data blocks with a sampling frequency bias higher than 30% |
@@ -177,125 +187,96 @@ Part 4 generates the following output:
- part4_nightsummary_sleep_cleaned.csv
- QC/part4_nightsummary_sleep_full.csv
-The latter with '_full' in the name is intended to aid
-clarifying why some nights (if any) are excluded from the cleaned summary report.
-Although, nights where the accelerometer was not worn at all are excluded from this. So,
-if you have a 30 day recording where the accelerometer was not worn from day 7 onward then
-you will not find the last 22 nights in either csv-report.
+The latter with '\_full' in the name is intended to help clarify why some nights (if any) are excluded from the cleaned summary report. Note that nights where the accelerometer was not worn at all are excluded even from the full report. So, if you have a 30 day recording where the accelerometer was not worn from day 7 onward then you will not find the last 22 nights in either csv-report.
The csv files contain the variables as shown below.
-| (Part of) variable name | Description |
|-------------------------|-----------------------------------------------|
-| ID| Participant ID extracted from file |
-| night | Number of the night in the recording |
-| sleeponset | Detected onset of sleep expressed as hours since the midnight of the previous night. |
-| wakeup | Detected waking time (after sleep period) expressed as hours since the midnight of the previous night. |
-| SptDuration | Difference between onset and waking time. |
-| sleepparam | Definition of sustained inactivity by accelerometer. |
-| guider | guider used as discussed in paragraph [Sleep analysis](#Sleep_analysis). |
-| guider_onset | Start of Sleep Period Time window derived from the guider.|
-| guider_wake | End of Sleep Period Time window derived guider. |
-| guider_SptDuration | Time SPT duration derived from guider_wake and guider_onset. |
-| error_onset | Difference between sleeponset and guider_onset|
-| error_wake | Difference between wakeup and guider_wake |
-| fraction_night_invalid | Fraction of the night (noon-noon or 6pm-6pm) for which the data was invalid, e.g. monitor not worn or no accelerometer measurement started/ended within the night. |
-| SleepDurationInSpt | Total sleep duration, which equals the accumulated nocturnal sustained inactivity bouts within the Sleep Period Time. |
-| duration_sib_wakinghours | Accumulated sustained inactivity bouts during the day. These are the periods we would label during the night as sleep, but during the day they form a subclass of inactivity, which may represent day time sleep or wakefulness while being motionless for a sustained period of time number_sib_sleepperiod} Number of nocturnal sleep periods, with nocturnal referring to the Sleep Period Time window. |
-| duration_sib_wakinghours_atleast15min | Same as duration_sib_wakinghours, but limited to SIBs that last at least 15 minutes. |
-| num ber_sib_wakinghours | Number of sustained inactivity bouts during the day, with day referring to the time outside the Sleep Period Time window.|
-| sleeponset_ts | sleeponset formatted as a timestamp |
-| wakeup_ts | wakeup formatted as a timestamp |
-| guider_onset_ts| guider_onset formatted as a timestamp |
-| guider_wake_ts| guider_wake formatted as a timestamp |
-| page | pdf page on which the visualisation can be found |
-| daysleeper | If 0 then the person is a nightsleeper (sleep period did not overlap with noon) if value=1 then the person is a daysleeper (sleep period did overlap with noon) |
-| weekday | Day of the week on which the night started |
-| calendardate | Calendar date on which the night started in day/month/year format. |
-| filename | Name of the accelerometer file |
-| cleaningcode | see paragraph [Cleaningcode](https://wadpac.github.io/GGIR/articles/chapter10_SleepAnalysis.html#cleaningcode) |
-| sleeplog_used | Whether a sleep log was used (TRUE/FALSE) |
-| acc_available | Whether accelerometer data was available (TRUE/FALSE). |
-| WASO | Wake After Sleep Onset: SptDuration - SleepDurationInSpt |
-| SptDuration | Sleep Period Time window duration: wakeup - sleeponset |
-| error_onset | Difference between sleeponset and guider_onset (this variable is only available in the full report as stored in the QC folder) |
-| error_wake | Difference between wakeup and guider_wake (this variable is only available in the full report as stored in the QC folder)|
-| SleepRegularityIndex | The Sleep Regularity Index as proposed by [Phillips et al. 2017](https://www.nature.com/articles/s41598-017-03171-4), but calculated per day-pair to enable user to study patterns across days |
-| SriFractionValid | Fraction of the 24 hour period that was valid in both current as well as in matching timestamps for the next calendar day. See GGIR function manual for details |
-| nonwear_perc_spt | Non-wear percentage during the spt hours of this day. This is a copy of the nonwear_perc_spt calculated in `part 5`, only included in part 4 reports if part 5 has been run with timewindow = WW |
-
+| (Part of) variable name | Description |
+|--------------------------|----------------------------------------------|
+| ID | Participant ID extracted from file |
+| night | Number of the night in the recording |
+| sleeponset | Detected onset of sleep expressed as hours since the midnight of the previous night. |
+| wakeup | Detected waking time (after sleep period) expressed as hours since the midnight of the previous night. |
+| SptDuration | Difference between onset and waking time. |
+| sleepparam | Definition of sustained inactivity by accelerometer. |
+| guider | guider used as discussed in paragraph [Sleep analysis](#Sleep_analysis). |
+| guider_onset | Start of Sleep Period Time window derived from the guider. |
+| guider_wake | End of Sleep Period Time window derived from the guider. |
+| guider_SptDuration | SPT duration derived from guider_wake and guider_onset. |
+| error_onset | Difference between sleeponset and guider_onset |
+| error_wake | Difference between wakeup and guider_wake |
+| fraction_night_invalid | Fraction of the night (noon-noon or 6pm-6pm) for which the data was invalid, e.g. monitor not worn or no accelerometer measurement started/ended within the night. |
+| SleepDurationInSpt | Total sleep duration, which equals the accumulated nocturnal sustained inactivity bouts within the Sleep Period Time. |
+| duration_sib_wakinghours | Accumulated sustained inactivity bouts during the day. These are the periods we would label during the night as sleep, but during the day they form a subclass of inactivity, which may represent day time sleep or wakefulness while being motionless for a sustained period of time. |
+| number_sib_sleepperiod | Number of nocturnal sleep periods, with nocturnal referring to the Sleep Period Time window. |
+| duration_sib_wakinghours_atleast15min | Same as duration_sib_wakinghours, but limited to SIBs that last at least 15 minutes. |
+| number_sib_wakinghours | Number of sustained inactivity bouts during the day, with day referring to the time outside the Sleep Period Time window. |
+| sleeponset_ts | sleeponset formatted as a timestamp |
+| wakeup_ts | wakeup formatted as a timestamp |
+| guider_onset_ts | guider_onset formatted as a timestamp |
+| guider_wake_ts | guider_wake formatted as a timestamp |
+| page | pdf page on which the visualisation can be found |
+| daysleeper | If 0 then the person is a nightsleeper (sleep period did not overlap with noon) if value=1 then the person is a daysleeper (sleep period did overlap with noon) |
+| weekday | Day of the week on which the night started |
+| calendardate | Calendar date on which the night started in day/month/year format. |
+| filename | Name of the accelerometer file |
+| cleaningcode | see paragraph [Cleaningcode](https://wadpac.github.io/GGIR/articles/chapter10_SleepAnalysis.html#cleaningcode) |
+| sleeplog_used | Whether a sleep log was used (TRUE/FALSE) |
+| acc_available | Whether accelerometer data was available (TRUE/FALSE). |
+| WASO | Wake After Sleep Onset: SptDuration - SleepDurationInSpt |
+| SptDuration | Sleep Period Time window duration: wakeup - sleeponset |
+| error_onset | Difference between sleeponset and guider_onset (this variable is only available in the full report as stored in the QC folder) |
+| error_wake | Difference between wakeup and guider_wake (this variable is only available in the full report as stored in the QC folder) |
+| SleepRegularityIndex | The Sleep Regularity Index as proposed by [Phillips et al. 2017](https://www.nature.com/articles/s41598-017-03171-4), but calculated per day-pair to enable user to study patterns across days |
+| SriFractionValid | Fraction of the 24 hour period that was valid in both current as well as in matching timestamps for the next calendar day. See GGIR function manual for details |
+| nonwear_perc_spt | Non-wear percentage during the spt hours of this day. This is a copy of the nonwear_perc_spt calculated in `part 5`, only included in part 4 reports if part 5 has been run with timewindow = WW |
**Non-default variables in part 4 csv report**
-These additional are only stored if you used a sleeplog that captures
-time in bed, or when using guider HorAngle for hip-worn accelerometer
-data. If either of these applies set argument `sleepwindowType` to
-"TimeInBed".
+These additional variables are only stored if you used a sleeplog that captures time in bed, or when using guider HorAngle for hip-worn accelerometer data. If either of these applies, set argument `sleepwindowType` to "TimeInBed".
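For example, a minimal sketch of a call that makes part 4 produce these in-bed variables; the folder paths and the sleep log file are hypothetical and all other settings are left at their defaults:

```{r, eval = FALSE}
library(GGIR)
GGIR(datadir = "/data/myaccfiles",        # hypothetical folder with raw accelerometer files
     outputdir = "/data/myresults",       # hypothetical folder for the GGIR output
     loglocation = "/data/sleeplog.csv",  # hypothetical sleep log that captures time in bed
     sleepwindowType = "TimeInBed",       # tell GGIR that the log records time in bed
     sleepefficiency.metric = 1)          # 1 (default): SleepDurationInSpt / guider_inbedDuration
```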
-| (Part of) variable name | Description |
|-------------------------|-----------------------------------------------|
-| guider_guider_inbedStart | Time of getting in bed |
-| guider_guider_inbedEnd | Time of getting out of bed|
-| guider_inbedDuration | Time in Bed: guider_inbedEnd - guider_inbedStart |
-| sleepefficiency | Sleep efficiency, calculated by one of two metrics as controlled by argument `sleepefficiency.metric`: SleepDurationInSpt / guider_inbedDuration (default) or SleepDurationInSpt / (SptDuration + latency) |
-| sleeplatency | Sleep latency, calculated as: sleeponset - guider_inbedStart |
+| (Part of) variable name | Description |
+|--------------------------|----------------------------------------------|
+| guider_inbedStart | Time of getting in bed |
+| guider_inbedEnd | Time of getting out of bed |
+| guider_inbedDuration | Time in Bed: guider_inbedEnd - guider_inbedStart |
+| sleepefficiency | Sleep efficiency, calculated by one of two metrics as controlled by argument `sleepefficiency.metric`: SleepDurationInSpt / guider_inbedDuration (default) or SleepDurationInSpt / (SptDuration + latency) |
+| sleeplatency | Sleep latency, calculated as: sleeponset - guider_inbedStart |
## Person level summaries (csv)
- part4_summary_sleep_cleaned.csv
- QC/part4_summary_sleep_full.csv
-In the person level report the variables are derived from the variables
-in the night level summary. Minor extensions to the variable names
-explain how variables are aggregated across the days. Please find below
-extra clarification on a few of the variable names for which the meaning
-may not be obvious:
-
-| (Part of) variable name | Description |
|-------------------------|-----------------------------------------------|
-| `_mn`| mean across days|
-| `_sd`| standard deviation across days |
-| `_AD`| All days |
-| `_WE`| Weekend days |
-| `_WD`| Week days |
-| sleeplog_used | Whether a sleeplog was available (TRUE) or not (FALSE)|
-| sleep_efficiency_after_onset | Accelerometer derived sleep efficiency within the sleep period time calculated as the ratio between acc_SleepDurationInSpt and guider_SptDuration (denominator) or acc_SleepDurationInSpt and acc_SptDuration + latency (denominator), as defined with sleepefficiency.metric. Only available at person level, because at night level the user can calculate this from existing variables. |
-| n_nights_acc | Number of nights of accelerometer data |
-| n_nights_sleeplog | Number of nights of sleeplog data. |
-| n\_WE_nights_complete | Number of weekend nights complete which means both accelerometer and estimate from guider. |
-| n\_WD_nights_complete | Number of weekday nights complete which means both accelerometer and estimate from guider. |
-| n\_WEnights_daysleeper | Number of weekend nights on which the person slept until after noon. |
-| n\_WDnights_daysleeper | Number of weekday nights on which the person slept until after noon. |
-| duration_sib_wakinghour | Total duration of sustained inactivity bouts during the waking hours. |
-| number_sib_wakinghours | Number of sustained inactivity bouts during the waking hours. |
-| average\_dur_sib_wakinghours | Average duration of the sustained inactivity bouts during the day (outside the sleep period duration). Calculated as duration_sib_wakinghour divided by number_sib_wakinghours per day, after which the mean and standard deviation are calculated across days. |
+In the person level report the variables are derived from the variables in the night level summary. Minor extensions to the variable names explain how variables are aggregated across the days. Please find below extra clarification on a few of the variable names for which the meaning may not be obvious:
+
+| (Part of) variable name | Description |
+|--------------------------|----------------------------------------------|
+| `_mn` | mean across days |
+| `_sd` | standard deviation across days |
+| `_AD` | All days |
+| `_WE` | Weekend days |
+| `_WD` | Week days |
+| sleeplog_used | Whether a sleeplog was available (TRUE) or not (FALSE) |
+| sleep_efficiency_after_onset | Accelerometer derived sleep efficiency within the sleep period time calculated as the ratio between acc_SleepDurationInSpt and guider_SptDuration (denominator) or acc_SleepDurationInSpt and acc_SptDuration + latency (denominator), as defined with sleepefficiency.metric. Only available at person level, because at night level the user can calculate this from existing variables. |
+| n_nights_acc | Number of nights of accelerometer data |
+| n_nights_sleeplog | Number of nights of sleeplog data. |
+| n_WE_nights_complete | Number of complete weekend nights, meaning both an accelerometer estimate and a guider estimate are available. |
+| n_WD_nights_complete | Number of complete weekday nights, meaning both an accelerometer estimate and a guider estimate are available. |
+| n_WEnights_daysleeper | Number of weekend nights on which the person slept until after noon. |
+| n_WDnights_daysleeper | Number of weekday nights on which the person slept until after noon. |
+| duration_sib_wakinghour | Total duration of sustained inactivity bouts during the waking hours. |
+| number_sib_wakinghours | Number of sustained inactivity bouts during the waking hours. |
+| average_dur_sib_wakinghours | Average duration of the sustained inactivity bouts during the day (outside the sleep period duration). Calculated as duration_sib_wakinghour divided by number_sib_wakinghours per day, after which the mean and standard deviation are calculated across days. |
## Visualisation (pdf)
-Visualisation to support data quality checks: - visualisation_sleep.pdf
-(optional)
-
-When input argument `do.visual` is set to TRUE GGIR can show the
-following visual comparison between the time window of being asleep (or
-in bed) according to the sleeplog and the detected sustained inactivity
-bouts according to the accelerometer data. This visualisation is stored
-in the results folder as `visualisation_sleep.pdf`.
-
-Explanation of the image: Each line represents one night. Colours are
-used to distinguish definitions of sustained inactivity bouts (2
-definitions in this case) and to indicate existence or absence of
-overlap with the sleeplog. When argument `outliers.only` is set to FALSE
-it will visualise all available nights in the dataset. If
-`outliers.only` is set to TRUE it will visualise only nights with a
-difference in onset or waking time between sleeplog and sustained
-inactivity bouts larger than the value of argument `criterror`.
-
-This visualisation with outliers.only set to TRUE and critererror set to
-4 was very powerful to identify entry errors in sleeplog data in van
-Hees et al PLoSONE 2015. We had over 25 thousand nights of data, and
-this visualisation allowed us to quickly zoom in on the most problematic
-nights to investigate possible mistakes in GGIR or mistakes in data
-entry.
+Visualisation to support data quality checks: - visualisation_sleep.pdf (optional) + +When input argument `do.visual` is set to TRUE GGIR can show the following visual comparison between the time window of being asleep (or in bed) according to the sleeplog and the detected sustained inactivity bouts according to the accelerometer data. This visualisation is stored in the results folder as `visualisation_sleep.pdf`. + +Explanation of the image: Each line represents one night. Colours are used to distinguish definitions of sustained inactivity bouts (2 definitions in this case) and to indicate existence or absence of overlap with the sleeplog. When argument `outliers.only` is set to FALSE it will visualise all available nights in the dataset. If `outliers.only` is set to TRUE it will visualise only nights with a difference in onset or waking time between sleeplog and sustained inactivity bouts larger than the value of argument `criterror`. + +This visualisation with outliers.only set to TRUE and criterror set to 4 was very powerful for identifying entry errors in sleeplog data in van Hees et al PLoSONE 2015. We had over 25 thousand nights of data, and this visualisation allowed us to quickly zoom in on the most problematic nights to investigate possible mistakes in GGIR or mistakes in data entry. ```{r, out.width = "700px",echo=FALSE, fig.alt="Example visualisation GGIR part 4"} knitr::include_graphics("example_dovisual.jpg") @@ -303,83 +284,78 @@ knitr::include_graphics("example_dovisual.jpg") # GGIR Part 5 -The output of part 5 is dependent on the parameter configuration, it -will generate as many output files as there are unique combination of -the three thresholds provided. +The output of part 5 depends on the parameter configuration: it will generate as many output files as there are unique combinations of the three thresholds provided. -For example, the following files will be generated if the threshold -configuration was 30 for light activity, 100 for moderate and 400 for -vigorous activity: +For example, the following files will be generated if the threshold configuration was 30 for light activity, 100 for moderate and 400 for vigorous activity: -- part5_daysummary_MM_L30M100V400_T5A5.csv +- part5_daysummary_MM_L30M100V400_T5A5.csv - part5_daysummary_WW_L30M100V400_T5A5.csv - part5_personsummary_MM_L30M100V400_T5A5.csv - part5_personsummary_WW_L30M100V400_T5A5.csv - file summary reports/Report_nameofdatafile.pdf - ## Day level summary (csv) -| (Term in) variable name | Description | +| (Term in) variable name | Description | |--------------------------|----------------------------------------------| -| sleeponset | onset of sleep expressed in hours since the midnight in the night preceding the night of interest, e.g. 26 is 2am. | -| wakeup | waking up time express in the same way as sleeponset. | -| sleeponset_ts | onset of sleep expressed as a timestamp hours:minutes:seconds | -| daysleeper | if 0 then the person woke up before noon, if 1 then the person woke up after noon | -| cleaningcode| See paragraph [Cleaningcode](https://wadpac.github.io/GGIR/articles/chapter10_SleepAnalysis.html#cleaningcode). | -| dur_day_spt_min | Total length of daytime waking hours and spt combined (typically 24 hours for MM report).
| -| `dur_` | duration of a behavioral class that will be specified int he rest of the variable name | -| `ACC_` | (average) acceleration according to default metric specific by acc.metric| -| `_spt_wake_`| Wakefulness within the Sleep period time window.| -| `_spt_sleep_` | Sleep within the Sleep period time window. | -| `_IN_` | Inactivity. Note that we use the term inactivity instead of sedentary behaviour for the lowest intensity level of behaviour. The reason for this is that GGIR does not attempt to classifying the activity type sitting at the moment, by which we feel that using the term sedentary behaviour would fail to communicate that.| -| `_LIG_` | Light activity | -| `_MOD_` | Moderate activity | -| `_VIG_` | Vigorous activity | -| `_MVPA_` | Moderate or Vigorous activity | -| `_unbt_` | Unbouted | -| `_bts_` | Bouts (also known as sojourns), which are segments that for which the acceleration is within a specified range for a specified fraction of the time.| -| `_bts_1_10_`| Bouts lasting at least 1 minute and less than 10 minutes (1 and 9.99 minutes are included, but 10 minutes is not). | -| Nblock | number of blocks of a certain behavioral class, not these are not bouts but a count of the number of times the behavioral class occurs without interruptions. | -| WW | in filename refers to analyses based on the timewindow from waking to waking up | -| MM | in filename refers to analyses done on windows between midnight and midnight | -| calendar_date | calendar date on which the window started in day/month/year format. So, for WW window this could mean that you have two windows starting on the same date. | -| weekday | weekday on which the window started. So, for WW window this could mean that you have two windows starting on the weekday. | -| `_total_IN` | total time spent in inactivity (no distinction between bouted or unbouted behavior, this is a simple count of the number of epochs that meet the threshold criteria. | -| `_total_LIG`| total time spent in light activity.| -| nonwear_perc_day | Non-wear percentage during the waking hours of this day. | -| nonwear_perc_spt | Non-wear percentage during the spt hours of this day. | -| nonwear_perc_day_spt | Non-wear percentage during the whole day, including waking and spt. | -| dur_day_min | Duration of waking hours within this day window| -| dur_spt_min | Duration of Sleep Period Time within this day window. | -| dur_day_spt_min | Duration this day window, including both waking hours and SPT. | -| sleep_efficiency_after_onset | sleep_efficiency_after_onset in part 5 is not the same as in part 4, but calculated as the percentage of sleep within the sleep period time window. The conventional approach is the approach used in part 4. | -| L5TIME | Timing of least active 5hrs, expressed as timestamp in the day | -| M5TIME | Timing of most active 5hrs | -| L5TIME_num, M5TIME_num | Timing of least/most active 5hrs, expressed as hours in the day. Note that L5/M5 timing variables are difficult to average across days because 23:00 and 1:00 would average to noon and not to midnight. So, caution is needed when interpreting person averages. | -| L5VALUE | Acceleration value for least active 5hrs | -| M5VALUE | Acceleration value for most active 5hrs | -| `ig_` | All variables related to intensity gradient analysis | -| `_gradient` | Gradient from intensity gradient analysis proposed by [Rowlands et al. 
2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) for the waking hours window (`_day_`) and for the full window (`_day_spt_`) | -| `_intercept` | Intercept from intensity gradient analysis proposed by [Rowlands et al. 2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) for the waking hours window (`_day_`) and for the full window (`_day_spt_`) | -| `_rsquared` | r squared from intensity gradient analysis proposed by [Rowlands et al. 2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) for the waking hours window (`_day_`) and for the full window (`_day_spt_`) | -| `FRAG_` | All variables related to behavioural fragmentation analysis | -| `TP_` | Transition probability| -| PA2IN | Physical activity fragments followed by inactivity fragments| -| IN2PA | Physical inactivity fragments followed by activity fragments| -| Nfrag | Number of fragments | -| IN2LIPA | Inactivity fragments followed by LIPA | -| IN2MVPA | Inactivity fragments followed by MVPA | -| `mean_dur` | mean duration of a fragment category | -| `Gini_dur` | Gini index| -| `CoV_dur` | Coefficient of Variation | -| alpha | Power law exponent | -| `x0.5` | Derived from power law exponent alpha, see [Chastin et al. 201 0](https://doi.org/10.1016/j.gaitpost.2009.09.002) | -| `W0.5` | Derived from power law exponent alpha, see [Chastin et al. 201 0](https://doi.org/10.1016/j.gaitpost.2009.09.002) | -| nap_count | Total number of naps, only calculated when argument do.sibreport = TRUE, currently optimised for 3.5-year olds. See function documentation for function `g.part5.classifyNaps` in the [GGIR function documentation (pdf)](https:/CRAN.R-project.org/package=GGIR/GGIR.pdf). | -| nap_totalduration | Total nap duration, only calculated when argument do.sibreport = TRUE, currently optimised for 3.5-year old. See function documentation for function `g.part5.classifyNaps` in the [GGIR function documentation (pdf)](https:/CRAN.R-project.org/package=GGIR/GGIR.pdf). | -| sibreport_n_items | Only created if `do.sibreport = TRUE`. Number of items in the sibreport | -| sibreport_n_items_day | Only created if `do.sibreport = TRUE`. Number of items in the sibreport for this specific day | +| sleeponset | onset of sleep expressed in hours since the midnight in the night preceding the night of interest, e.g. 26 is 2am. | +| wakeup | waking up time expressed in the same way as sleeponset. | +| sleeponset_ts | onset of sleep expressed as a timestamp hours:minutes:seconds | +| daysleeper | if 0 then the person woke up before noon, if 1 then the person woke up after noon | +| cleaningcode | See paragraph [Cleaningcode](https://wadpac.github.io/GGIR/articles/chapter10_SleepAnalysis.html#cleaningcode). | +| dur_day_spt_min | Total length of daytime waking hours and spt combined (typically 24 hours for MM report). | +| `dur_` | duration of a behavioral class that will be specified in the rest of the variable name | +| `ACC_` | (average) acceleration according to default metric specified by acc.metric | +| `_spt_wake_` | Wakefulness within the Sleep period time window. | +| `_spt_sleep_` | Sleep within the Sleep period time window. | +| `_IN_` | Inactivity. Note that we use the term inactivity instead of sedentary behaviour for the lowest intensity level of behaviour.
The reason for this is that GGIR currently does not attempt to classify the activity type sitting, and we feel that the term sedentary behaviour would therefore fail to communicate this. | +| `_LIG_` | Light activity | +| `_MOD_` | Moderate activity | +| `_VIG_` | Vigorous activity | +| `_MVPA_` | Moderate or Vigorous activity | +| `_unbt_` | Unbouted | +| `_bts_` | Bouts (also known as sojourns), which are segments for which the acceleration is within a specified range for a specified fraction of the time. | +| `_bts_1_10_` | Bouts lasting at least 1 minute and less than 10 minutes (1 and 9.99 minutes are included, but 10 minutes is not). | +| Nblock | number of blocks of a certain behavioral class, note that these are not bouts but a count of the number of times the behavioral class occurs without interruptions. | +| WW | in filename refers to analyses based on the timewindow from waking to waking up | +| MM | in filename refers to analyses done on windows between midnight and midnight | +| calendar_date | calendar date on which the window started in day/month/year format. So, for WW window this could mean that you have two windows starting on the same date. | +| weekday | weekday on which the window started. So, for WW window this could mean that you have two windows starting on the same weekday. | +| `_total_IN` | total time spent in inactivity (no distinction between bouted or unbouted behavior, this is a simple count of the number of epochs that meet the threshold criteria). | +| `_total_LIG` | total time spent in light activity. | +| nonwear_perc_day | Non-wear percentage during the waking hours of this day. | +| nonwear_perc_spt | Non-wear percentage during the spt hours of this day. | +| nonwear_perc_day_spt | Non-wear percentage during the whole day, including waking and spt. | +| dur_day_min | Duration of waking hours within this day window | +| dur_spt_min | Duration of Sleep Period Time within this day window. | +| dur_day_spt_min | Duration of this day window, including both waking hours and SPT. | +| sleep_efficiency_after_onset | sleep_efficiency_after_onset in part 5 is not the same as in part 4, but calculated as the percentage of sleep within the sleep period time window. The conventional approach is the approach used in part 4. | +| L5TIME | Timing of least active 5hrs, expressed as timestamp in the day | +| M5TIME | Timing of most active 5hrs | +| L5TIME_num, M5TIME_num | Timing of least/most active 5hrs, expressed as hours in the day. Note that L5/M5 timing variables are difficult to average across days because 23:00 and 1:00 would average to noon and not to midnight. So, caution is needed when interpreting person averages. | +| L5VALUE | Acceleration value for least active 5hrs | +| M5VALUE | Acceleration value for most active 5hrs | +| `ig_` | All variables related to intensity gradient analysis | +| `_gradient` | Gradient from intensity gradient analysis proposed by [Rowlands et al. 2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) for the waking hours window (`_day_`) and for the full window (`_day_spt_`) | +| `_intercept` | Intercept from intensity gradient analysis proposed by [Rowlands et al. 2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) for the waking hours window (`_day_`) and for the full window (`_day_spt_`) | +| `_rsquared` | r squared from intensity gradient analysis proposed by [Rowlands et al.
2018](https://journals.lww.com/acsm-msse/Fulltext/2018/06000/Beyond_Cut_%20Points__Accelerometer_Metrics_that.25.aspx) for the waking hours window (`_day_`) and for the full window (`_day_spt_`) | +| `FRAG_` | All variables related to behavioural fragmentation analysis | +| `TP_` | Transition probability | +| PA2IN | Physical activity fragments followed by inactivity fragments | +| IN2PA | Physical inactivity fragments followed by activity fragments | +| Nfrag | Number of fragments | +| IN2LIPA | Inactivity fragments followed by LIPA | +| IN2MVPA | Inactivity fragments followed by MVPA | +| `mean_dur` | mean duration of a fragment category | +| `Gini_dur` | Gini index | +| `CoV_dur` | Coefficient of Variation | +| alpha | Power law exponent | +| `x0.5` | Derived from power law exponent alpha, see [Chastin et al. 2010](https://doi.org/10.1016/j.gaitpost.2009.09.002) | +| `W0.5` | Derived from power law exponent alpha, see [Chastin et al. 2010](https://doi.org/10.1016/j.gaitpost.2009.09.002) | +| nap_count | Total number of naps, only calculated when argument do.sibreport = TRUE, currently optimised for 3.5-year olds. See function documentation for function `g.part5.classifyNaps` in the [GGIR function documentation (pdf)](https://CRAN.R-project.org/package=GGIR/GGIR.pdf). | +| nap_totalduration | Total nap duration, only calculated when argument do.sibreport = TRUE, currently optimised for 3.5-year olds. See function documentation for function `g.part5.classifyNaps` in the [GGIR function documentation (pdf)](https://CRAN.R-project.org/package=GGIR/GGIR.pdf). | +| sibreport_n_items | Only created if `do.sibreport = TRUE`. Number of items in the sibreport | +| sibreport_n_items_day | Only created if `do.sibreport = TRUE`. Number of items in the sibreport for this specific day | | nbouts_day_X | Only created if `do.sibreport = TRUE`. Number of bouts in a day of X where X can be sib (sustained inactivity bout), srnap (self-reported nap) or srnonw (self-reported nonwear) | | noverl_X | Only created if `do.sibreport = TRUE`. Number of overlapping bouts in a day of X where X can be sib_srnap, sib_srnonw, srnap_sib, or srnonw_sib | | frag_mean_dur_X_day | Only created if `do.sibreport = TRUE`. Mean duration of X per day, where X can be sib, srnap or srnonw | @@ -388,40 +364,25 @@ vigorous activity: | tdur_X_overl_Y | Only created if `do.sibreport = TRUE`. Total duration in minutes of the overlap between X and Y, which are combinations of sib, srnap or srnonw | | perc_X_overl_Y | Only created if `do.sibreport = TRUE`. Percentage of overlap between X and Y, which are combinations of sib, srnap or srnonw | - **Special note if you are working on compositional data analysis:** -The duration of all `dur_` variables that have `_total_` in their name -should add up to the total length of the waking hours in a day. -Similarly, the duration of all other `dur_` variables excluding the -variables `_total_` in their name and excluding the variable with -`dur_day_min`, `dur_spt_min`, and `dur_day_spt_min` should also add up -to the length of the full day. +The duration of all `dur_` variables that have `_total_` in their name should add up to the total length of the waking hours in a day. Similarly, the duration of all other `dur_` variables excluding the variables with `_total_` in their name and excluding the variables `dur_day_min`, `dur_spt_min`, and `dur_day_spt_min` should also add up to the length of the full day.
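+
+To verify this composition in your own output, a minimal sketch in R could look as follows. This is an illustration rather than part of GGIR: the file name matches the example above, and the `dur_day_total_` pattern and `dur_day_min` column follow the variable dictionary in this chapter, so do check them against your own report.
+
+```{r, eval = FALSE}
+# Hedged sketch: check that the total-duration variables add up to the
+# duration of the waking hours of the day.
+d = read.csv("part5_daysummary_MM_L30M100V400_T5A5.csv")
+total_cols = grep("^dur_day_total_", names(d), value = TRUE)
+composition = rowSums(d[, total_cols], na.rm = TRUE)
+# Differences should be close to zero, small rounding differences aside:
+summary(composition - d$dur_day_min)
+```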
**Motivation for default boutcriter.in = 0.9:** -The idea is that if you allow for bouts of 30 minutes it would not make -sense to allow for breaks of 20 percent (6 minutes!) this is why I used -a more stringent criteria for the highest category. Please note that you -can change these criteria via arguments `boutcriter.mvpa`, -`boutcriter.in`, and `boutcriter.lig`. +The idea is that if you allow for bouts of 30 minutes it would not make sense to allow for breaks of 20 percent (6 minutes!). This is why a more stringent criterion is used for the highest bout duration category. Please note that you can change these criteria via arguments `boutcriter.mvpa`, `boutcriter.in`, and `boutcriter.lig`. ## Person level summary (csv) -Most variables in the person level summary are derived from the day -level summary, but extended with `_pla` to indicate that the variable -was calculated as the plain average across all valid days. Variables -extended with `_wei` represent the weighted average of across all days -where weekend days always weighted 2/5 relative to the contribution of -week days. +Most variables in the person level summary are derived from the day level summary, but extended with `_pla` to indicate that the variable was calculated as the plain average across all valid days. Variables extended with `_wei` represent the weighted average across all days, where weekend days are always weighted 2/5 relative to the contribution of week days. -| Variable name | Description | +| Variable name | Description | |--------------------------|----------------------------------------------| -| Nvaliddays | Total number of valid days. | -| Nvaliddays_WD | Number of valid week days. | -| Nvaliddays_WE | Number of valid weekend days, where the days that start on Saturday or Sunday are considered weekend. | -| NcleaningcodeX | Number of days that had cleaning code X for the corresponding sleep analysis in part 4. In case of MM analysis this refers to the night at the end of the day. | -| Nvaliddays_AL10F_WD | Number of valid week days with at least 10 fragments (5 inactivity or 5 inactive) | -| Nvaliddays_AL10F_WE | Number of valid weekend days with at least 10 fragments (5 inactivity or 5 inactive)| -| `_wei` | weighted average of weekend and week days, using a 2/5 ratio, see above.| -| `_pla` | plain average of all days, see above \ No newline at end of file +| Nvaliddays | Total number of valid days. | +| Nvaliddays_WD | Number of valid week days. | +| Nvaliddays_WE | Number of valid weekend days, where the days that start on Saturday or Sunday are considered weekend. | +| NcleaningcodeX | Number of days that had cleaning code X for the corresponding sleep analysis in part 4. In case of MM analysis this refers to the night at the end of the day. | +| Nvaliddays_AL10F_WD | Number of valid week days with at least 10 fragments (e.g. 5 inactive and 5 active) | +| Nvaliddays_AL10F_WE | Number of valid weekend days with at least 10 fragments (e.g. 5 inactive and 5 active) | +| `_wei` | weighted average of weekend and week days, using a 2/5 ratio, see above. | +| `_pla` | plain average of all days, see above | diff --git a/vignettes/chapter0_Contributing.Rmd b/vignettes/chapter0_Contributing.Rmd index 651d47580..65fa3bd16 100644 --- a/vignettes/chapter0_Contributing.Rmd +++ b/vignettes/chapter0_Contributing.Rmd @@ -32,4 +32,5 @@ You might not have the coding skills to contribute to the code base of GGIR, and - **Apply for funding** to support the development and maintenance of GGIR.
GGIR is free software by which we entirely depend on users applying for funding to sponsor our efforts. Funding could be used to support the development of new functionalities, to support improvement of the existing GGIR software code, or to support development of better open-access training materials such as instruction videos. - **Report your issues and questions** in the [GGIR google group](https://groups.google.com/g/RpackageGGIR). - **Proofread the GGIR documentation** and inform us if you miss something or if you found it difficult to follow. +- **Take independent initiatives to complement our efforts**. For example, Prof. Stuart Fairclough created a [series of GGIR video tutorials](https://www.youtube.com/watch?v=AbHgJYTyuiA), Wei Guo and colleagues created the R package [mMARCH.AC](https://doi.org/10.32614/CRAN.package.mMARCH.AC) to post-process GGIR output as described in the supporting [journal article](https://doi.org/10.1123/jmpb.2022-0018), and maybe there are other initiatives we are not aware of. Please communicate about your initiatives via the [GGIR google group](https://groups.google.com/g/rpackageggir/) and/or with GGIR maintainer [Vincent van Hees](https://www.accelting.com/) to avoid duplicated work and to support each other where needed. diff --git a/vignettes/chapter11_DescribingDataCutPoints.Rmd b/vignettes/chapter11_DescribingDataCutPoints.Rmd index f636ef524..5e906d630 100644 --- a/vignettes/chapter11_DescribingDataCutPoints.Rmd +++ b/vignettes/chapter11_DescribingDataCutPoints.Rmd @@ -14,183 +14,125 @@ knitr::opts_chunk$set( ) ``` -Descriptive variables such as average acceleration as discussed in the previous chapter are powerful indicators of physical activity. Extensive evidence exists on the association between magnitude of acceleration as captured by an accelerometer and physical activity related energy expenditure as measured by indirect calorimetry. The value of MX metrics and intensity gradient as discussed in chapter 7 has been shown in several studies (for example: [Rowlands, 2019a](https://pubmed.ncbi.nlm.nih.gov/31318713/); [Rowlands 2019b](https://pubmed.ncbi.nlm.nih.gov/31808014/)). However, historically the physical activity research community has been keen on having a single or a small number of measures of physical activity that can be expressed in the unit minutes per day. +## Why behavioural classes? -# MVPA, LIPA, and Inactivity +Descriptive variables such as average acceleration per day or recording as discussed in the previous chapter are powerful indicators of physical activity. Extensive evidence exists on their association with physical activity related energy expenditure as measured by indirect calorimetry. Similarly, the scientific value of MX metrics and intensity gradient as discussed in chapter 7 has been shown in several studies (for example: [Rowlands, 2019a](https://pubmed.ncbi.nlm.nih.gov/31318713/); [Rowlands 2019b](https://pubmed.ncbi.nlm.nih.gov/31808014/)). However, the physical activity research community has historically shown a strong interest in measures of physical activity that can be expressed as time per behavioural class per day. In this chapter we will discuss how GGIR facilitates this. ## Construct definition -A popular approach to describe time series data in physical activity research is to distinguish so called intensity levels. Here, the community distinguishes sedentary behaviour (SB), light physical activity (LIPA), moderate, and vigorous physical activity.
The latter two categories are often combined into moderate or vigorous physical activity (MVPA). +A popular approach to define behavioural classes in physical activity research is to distinguish so called intensity levels. Here, it is common to distinguish sedentary behaviour (SB), light physical activity (LIPA), moderate, and vigorous physical activity. The latter two categories are often combined into moderate or vigorous physical activity (MVPA). Inside GGIR we refer to sedentary behaviour as inactivity to emphasize that our methods quantify mainly a lack of activity rather than being in a sitting or reclining posture. -The exact definition of these levels is poor which has caused methodological discrepancies for decades: +However, intensity levels as behavioural classes lack a feasible operational construct definition, which has caused methodological discrepancies for decades. An elaborate reflection on this can be found in [this blog post](https://www.accelting.com/updates/why-does-ggir-facilitate-cut-points/). -- Intensity levels are defined based on metabolic equality of task (MET), which is based on oxygen consumption measured with indirect calorimetry divided by body weight. MET as a construct fail to normalise for body weight by which comparisons of individuals or activity types that have a different relation with body weight are bound to drive differences. +This situation has forced us to be pragmatic and use an operational construct definition of intensity levels that is feasible for accelerometer data. -- The MET thresholds themselves (e.g. 3 MET) has a different meaning per person as MET value is not only driven by what people do in terms of movement but also their fitness and anthropometry. +## Classification based on cut-points -- Lack of consensus on how to handle the temporal nature of behaviour. For example, is a few seconds enough to define a behaviour or do we require minimum time duration? +It is common to classify intensity levels from accelerometer data by evaluating whether the acceleration is below, above, or between certain acceleration level(s). The acceleration magnitude(s) to use as threshold(s) are also known as cut-points. -- Criterion methods are hard to standardise across studies. For example, the studies differ in specific equipment, study protocols, and preparation of the data from the criterion method. +The use of thresholds (cut-points) is intended to be a crude indicator of time spent in intensity levels and sufficient to rank individuals on their amount of time spent in these behaviours. The cut-point approach has indisputably been the most powerful method so far to drive physical activity research. -## Classification with accelerometer data +See [published cut-points and how to use them](https://wadpac.github.io/GGIR/articles/CutPoints.html) as guidance for choosing cut-points for your dataset. -From an accelerometer data processing perspective time spent in intensity levels is often simplified to time spent below, above, or between certain acceleration level(s) chosen to make a crude separation of the intensity levels. The acceleration magnitude(s) to use as threshold(s), is often referred to as cut-point in the literature. -Time spent below, above, or between the cut-point(s) is intended to be a crude indicator of time spent in behaviours and sufficient to rank individuals on their amount of time spent in these behaviours. The simple threshold approach has indisputably been the most powerful method so far to drive physical activity research.
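+
+As a minimal illustration of how cut-points are passed to GGIR, consider the sketch below. The cut-point values shown are arbitrary example values and not recommendations; see the overview linked above for guidance.
+
+```{r, eval = FALSE}
+library(GGIR)
+# Sketch of a GGIR call with example cut-points in m*g*;
+# all other parameters are left at their default values.
+GGIR(datadir = "C:/mystudy/mydata",
+     outputdir = "C:/myresults",
+     mode = 1:5,
+     mvpathreshold = 100, # MVPA cut-point used in part 2
+     threshold.lig = 30,  # light cut-point used in part 5
+     threshold.mod = 100, # moderate cut-point used in part 5
+     threshold.vig = 400) # vigorous cut-point used in part 5
+```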
+As discussed in more detail further down, the acceleration (intensity) level classification is done in GGIR parts 2 and 5. -In GGIR part 2, only MVPA is estimate. The threshold(s) for MVPA as used in GGIR part 2 are set with parameter `mvpathreshold`. You can specify a single value or a vector of multiple values, time spent in MVPA will then be derived with each of them. However, the threshold is not the only parameters to influence time spent in MVPA, as we will discussed in the following paragraphs. - -## Role of epoch length +## Epoch length Although accelerometers collect data at much higher sampling frequency, we only work with aggregated values (e.g. 1 or 5 second epochs) for the following reasons: -1. Accelerometers are often used to describe patterns in metabolic energy expenditure. Metabolic energy expenditure is typically defined per breath or per minute (indirect calorimetry), per day (room calorimeter), or per multiple days (doubly labelled water method). In order to evaluate our methods against these reference standards, we need to work with a similar time resolution. +1. Accelerometers are often used to describe patterns in metabolic energy expenditure. Metabolic energy expenditure is typically defined per breath or per minute (indirect calorimetry), per day (room calorimeter), or per multiple days (doubly labelled water method). In order to evaluate our methods against these reference standards, we need to work with a similar time resolution. -2. Collapsing the data to epoch summary measures helps to standardise our output across data collected with different sampling frequencies between studies. +2. Collapsing the data to epoch summary measures helps to standardise our output across data collected with different sampling frequencies between studies. -3. There is little evidence that the raw data is an accurate representation of body acceleration. All scientific evidence on the validity of accelerometer data has so far been based on epoch aggregates. +3. There is little evidence that the raw data is an accurate representation of body acceleration. All scientific evidence on the validity of accelerometer data has so far been based on epoch aggregates. Short epoch lengths, such as 1 or 5 seconds, are more sensitive to sporadic behaviours and are often combined with bout detection to identify MVPA only as a sustained behaviour. Longer epochs, such as 30 or 60 seconds, do not have this problem and are therefore easier to use without bout detection. -The epoch length in GGIR is by default 5 seconds, and can be set as the first value of the vector specified by parameter `windowsizes`. Although we discuss epoch length here in the context of MVPA, please note that epoch length influences many of the outcomes by GGIR. +The epoch length in GGIR is by default 5 seconds, and can be set as the first value of the vector specified by parameter `windowsizes`. Although we discuss epoch length here in the context of MVPA, please note that epoch length influences many of the outcomes of GGIR, such as sleep analysis. GGIR part 5 offers the option to aggregate the time series to a 1 minute epoch length in order to do physical activity research at 1 minute resolution while leaving the sleep detection that relies on a shorter epoch length untouched (see parameter `part5_agg2_60seconds=TRUE`). ## Bout detection -A bout is a time segment that meets specific temporal criteria and has frequently been used in the physical activity research.
GGIR facilitates processing data both with and without accounting for bouts. The motivation to look for bouts can be one of the following: +Behavioural bouts have been used frequently and are defined as sustained time spent in a behavioural class while adhering to specific temporal criteria. GGIR facilitates processing data both with and without accounting for bouts. The motivation to look for bouts can be one of the following: + +- With the idea that only behaviour with a certain minimum duration contributes to certain physiological benefits. -With the idea that only behaviour with a certain minimum duration contributes to certain physiological benefits. +- To make the classification of behaviour consistent with self-report data, which is only sensitive to behaviours of a certain minimum duration. -To make the classification of behaviour consistent with self-report data, only sensitive to duration of specific duration. +- To aid studying the fragmentation of behaviour. -To aid studying the fragmentation of behaviour. +- To account for the sporadic nature of behaviour when working with short epochs. -To define a bout we need to answer series of question: +To define a bout we need to answer a series of questions: -1. What should the cut-point be? -2. What should the epoch length be? -3. What should minimum duration of bout be? -4. Should we allow for gaps in a bout as in breaks in the behaviour of interest? -5. If yes to 4, should this be a percentage of the bout duration, an absolute minimum in seconds, or a combination of both? -6. If yes to 4, are bout gaps counted towards the time spent in bouts? -7. Do the first and last epoch need to meet the threshold criteria? -8. In what order are the bouts extracted? For example, if a short MVPA bout is part of a longer Inactivity bout which of the two prevails? -9. How many bout categories should there be? +1. What should the cut-point be? +2. What should the epoch length be? +3. What should the minimum duration of a bout be? +4. Should we allow for gaps in a bout as in breaks in the behaviour of interest? +5. If yes to 4, should this be a percentage of the bout duration, an absolute minimum in seconds, or a combination of both? +6. If yes to 4, are bout gaps counted towards the time spent in bouts? +7. Do the first and last epoch need to meet the threshold criteria? +8. In what order are the bouts extracted? For example, if a short MVPA bout is part of a longer Inactivity bout which of the two prevails? +9. How many bout categories should there be? GGIR facilitates the following freedom in bout detection: User decides on: -- Acceleration thresholds for light, moderate, and vigorous intensity +- Acceleration thresholds for light, moderate, and vigorous intensity with `mvpathreshold` for part 2, and `threshold.lig`, `threshold.mod`, and `threshold.vig` for part 5. -- Fraction of time for which cut-point criteria need to be met (light, inactive, MVPA) +- Fraction of time for which cut-point criteria need to be met (light, inactive, MVPA) with `boutcriter` for part 2 and `boutcriter.in`, `boutcriter.lig`, and `boutcriter.mvpa` for part 5. -- Bout duration range. For example, a simple scenario could be to consider all bouts of a minimum length of 10 minutes, while it is also possible to subdivide them in bouts lasting [1, 5) [5, 10) and [10, ∞) minutes. +- Bout duration range. In part 2 with `mvpadur` and in part 5 with `boutdur.lig`, `boutdur.mod`, and `boutdur.mvpa`. This functionality is slightly different between part 2 and part 5 as discussed further down.
-- Epoch length. +- Epoch length with `windowsizes` and `part5_agg2_60seconds`. User does NOT decide on: -- Maximum bout gap of 1 minute, if the fraction of time for which the cut-point criteria need to be met is less than 100% - -First and last epoch need to meet cut-point criteria. - -Number of intensity levels, which are always: inactive, light and MVPA. - -Order in which bouts are calculated (1 MVPA; 2 inactive; 3 Light) - -## Differences in physical activity estimates in part 2 and 5 - -The parameters needed for MVPA estimates in GGIR part 2 are different from the parameters used for estimating MVPA, LIPA and Inactivity in part 5. - -GGIR part 2 always provides six distinct approaches to MVPA calculation that are controlled with parameters `mvpathreshold`, `boutcriter`, `mvpadur`, and the first element of vector `windowsdur`. Here, MVPA provides time spent in MVPA based on: - -- 5 second, 1 minute or 5 minute epochs and no bouts - -- 5 second epochs and 3 different minimum bout duration as specified with parameter `mvpadur`. - -The bout durations are each used for separate estimates and not used complimentary to each other as is the case in part 5. For example, specifying `boutdur.mod = c(5, 10)` in part 5 will result in an estimate of time spent in bouts lasting from 5 till 10 minutes and in bouts lasting 10 minutes and longer. - -# Controlling the time window of analysis: - -As discussed in chapter 7, it is possible to tell both GGIR part 2 and part 5 to extract variables per segment of the day. We do this with parameter `qwindow` for which you can find a detailed discussion in Annex Day segment analysis. - -# Behavioural fragmentation +- Maximum bout gap of 1 minute, if the fraction of time for which the cut-point criteria need to be met is less than 100% -In addition to classifying the time spent in each behavioural class it is also informative to study the fragmentation of behaviours. +- First and last epoch need to meet cut-point criteria. -## Classes of behaviour used to study fragmentation +- Number of intensity levels, which are always: inactive, light and MVPA. -In GGIR, a fragment for daytime is a defined as a sequence of epochs that belong to one of the four categories: +- Order in which bouts are calculated (1 MVPA; 2 inactive; 3 Light) -1. Inactivity -2. Light Physical Activity (LIPA) -3. Moderate or Vigorous Physical Acitivty (MVPA) -4. Physical activity (can be either LIPA or MVPA) -Each of these categories represents the combination of bouted and unbouted time in the respective categories. Inactivity and physical activity add up to a full day (outside SPT), as well as inactivity, LIPA and MVPA. The fragmentation metrics are applied in function `g.fragmentation`. -A fragment of SPT is defined as a sequence of epochs that belong to one of the four categories: -1. Estimated sleep -2. Estimated wakefulness -3. Inactivity -4. Physical activity (can be either LIPA or MVPA) -Note that from the metrics below only fragmentation metrics `TP` and `NFrag` are calculated for the SPT fragments.
- -With parameter `frag.metrics = "all"` we can tell GGIR part 5 to derive behavioural fragmentation metrics for daytime and (separately) for spt. You may want to consider combining this with parameter `part5_agg2_60seconds=TRUE` as that will aggregate the time series to 1 minute resolution as is common in behavioural fragmentation literature. - -## Fragmentation metrics - -- Coefficient of Variance (`CoV`) is calculated according to [Blikman et al. 2014](https://doi.org/10.1016/j.apmr.2014.08.023), which entails dividing the standard deviation by the mean lognormal transformed fragment length (minutes). - -- Transition probability (`TP`) from Inactivity (IN) to Physical activity (IN2PA), from Physical activity to inactivity (PA2IN), and from IN to LIPA or MVPA are all calculated according to Danilevicz et al. 2023 10.21203/rs.3.rs-3543711/v1. - -- Gini index is calculated with function `Gini` from the `ineq` R package, and with `ineq` argument `corr` set to TRUE. - -- Power law exponent metrics: Alpha, x0.5, and W0.5 are calculated according to [Chastin et al. 2010](https://doi.org/10.1016/j.gaitpost.2009.09.002). - -- Number of fragment per minutes (`NFragPM`) is calculated identical to metric `fragmentation index` in [Chastin et al. 2012](https://academic.oup.com/ageing/article/41/1/111/46538), but it is renamed here to be a more specific reflection of the calculation. The term `fragmentation index` appears too generic given that all fragmentation metrics inform us about fragmentation. Please note that this is close to the metrics for transition probability, because total number divided by total sum in duration equals 1 divided by average duration. Although the exact math is slightly different. - -### Conditions for calculation and value when condition is not met - -- Metrics `Gini` and `CoV` are only calculated if there are at least 10 fragments (e.g. 5 inactive and 5 active). If this condition is not met the metric value will be set to missing. - -- Metrics related to power law exponent alpha are also only calculated when there are at least 10 fragments, but with the additional condition that the standard deviation in fragment duration is not zero. If these conditions are not met the metric value will be set to missing. - -- Other metrics related to binary fragmentation (`mean_dur_PA` and `mean_dur_IN`), are calculated when there are at least 2 fragments (1 inactive, 1 active). If this condition is not met the value will is set to zero. - -- Metrics related to `TP` are calculated if: There is at least 1 inactivity fragment AND (1 LIPA OR 1 MVPA fragment). If this condition is not met the `TP` metric value is set to zero. +The parameters needed for MVPA estimates in GGIR part 2 are different from the parameters used for estimating MVPA, LIPA and Inactivity in part 5. -To keep an overview of which recording days met the criteria for non-zero standard deviation and at least ten fragments, GGIR part 5 stores variable `Nvaliddays_AL10F` at person level (i.e., number of valid days with at least 10 fragments), and `SD_dur` (i.e., standard deviation of fragment durations) at day level as well as aggregated per person. +### Physical activity cut-point parameter GGIR part 2 -- GGIR derives fragmentation metrics per waking hours of the day in part 5 and per recording in part 6. Calculation per day allows us to explore and possibly account for behavioural differences between days of the week. 
However, for rare behaviours a day level estimate could be considered less robust than the recording level estimates as generated in part 6. +In GGIR part 2, only MVPA is estimated since sleep has not been classified at that point. The threshold(s) for MVPA as used in GGIR part 2 are set with parameter `mvpathreshold`. You can specify a single value or a vector of multiple values; time spent in MVPA will then be derived with each of them. -### Differences with R package ActFrag: +GGIR part 2 always provides six distinct approaches to MVPA calculation that are controlled with parameters `mvpathreshold`, `boutcriter`, `mvpadur`, and the first element of vector `windowsizes`. Here, MVPA provides time spent in MVPA based on: -The fragmentation functionality was initially inspired on the great work done by Dr. Junrui Di and colleagues in R package ActFrag, as described in [Junrui Di et al. 2017](https://www.biorxiv.org/content/10.1101/182337v1). However, we made a couple of a different decisions that may affect comparability: +- 5 second, 1 minute or 5 minute epochs and no bouts -- Transition probability is according to Danilevicz et al. 2023 10.21203/rs.3.rs-3543711/v1 +- 5 second epochs and 3 different minimum bout durations as specified with parameter `mvpadur`. -- Power law alpha exponent metrics were calculated according to [Chastin et al. - 2010](https://doi.org/10.1016/j.gaitpost.2009.09.002) using the theoretical minimum fragment duration instead of the observed minimum fragment duration. +### Physical activity cut-point parameter GGIR part 5 -## Key arguments +In part 2 the bout durations are each used for separate estimates, while in part 5 they are used complementary to each other. For example, specifying `boutdur.mod = c(5, 10)` in part 5 will result in an estimate of time spent in bouts lasting from 5 till 10 minutes and in bouts lasting 10 minutes and longer. A simple scenario could be to consider all bouts of a minimum length of 10 minutes, while it is also possible to subdivide them in bouts lasting [1, 5), [5, 10) and [10, ∞) minutes. -The parameters related to cut-points and bout detection are all concentrated in the "Physical activity parameters" as discussed in https://cran.r-project.org/ package=GGIR/vignettes/GGIRParameters.html. +All parameters related to cut-points and bout detection are listed under ["Physical activity parameters"](https://wadpac.github.io/GGIR/articles/GGIRParameters.html#physical-activity-parameters). ## Related output In GGIR part 2 csv reports you will find: -- Time spent in MVPA +- Time spent in MVPA In GGIR part 5 csv reports you will find: -- Time spent in MVPA -- Time spent in LIPA -- Time spent in inactivity (abbreviated as IN) +- Time spent in MVPA +- Time spent in LIPA +- Time spent in inactivity (abbreviated as IN) -In chapter 7 we discussed the structure of the part 2 output and in chapter 11 will provide a more detailed discuss of all the part 5 output. Further, for a detailed discussion of specific all output variables in all parts see https://cran.r-project.org/web/packages/GGIR/vignettes/GGIR.html#4_Inspecting_the_results +In [chapter 7](https://wadpac.github.io/GGIR/articles/chapter7_DescribingDataWithoutKnowingSleep.html) we discussed the structure of the part 2 output. The next chapter ([chapter 12](https://wadpac.github.io/GGIR/articles/chapter12_TimeUseAnalysis.html)) will provide a more detailed discussion of all the part 5 output.
For an overview of output variables see the [GGIR output annex](https://wadpac.github.io/GGIR/articles/GGIRoutput.html). diff --git a/vignettes/chapter13_CircadianRhythm.Rmd b/vignettes/chapter13_CircadianRhythm.Rmd index 44f751892..25031146b 100644 --- a/vignettes/chapter13_CircadianRhythm.Rmd +++ b/vignettes/chapter13_CircadianRhythm.Rmd @@ -16,52 +16,69 @@ knitr::opts_chunk$set( ## MXLX -Detection of the continuous least (LX) and most (MX) active X hours in a day, where X is defined by argument `winhr`. For both GGIR calculates the average acceleration, the start time, and if argument `iglevels` is specified also the intensity gradient. If argument `winhr` is a vector then descriptive values for LX and MX are derived per value in `winhr`. +MXLX looks for the continuous least (LX) and most (MX) active X hour window in a day, where X is defined by parameter `winhr`. For both LX and MX, GGIR calculates the average acceleration, the start time, and if argument `iglevels` is specified also the intensity gradient. If parameter `winhr` is a vector then MX and LX are derived for each value in the vector. -Within GGIR part 2 MXLX is calculated per calendar day and, if argument `qwindow` is specified, per segment of the day. Within GGIR part 5 MXLX is calculated per window, and if used in combination with the GENEActiv or Axivity accelerometer brand LUX estimates per LX and MX are included. +Within GGIR part 2 MXLX is calculated per calendar day and, if argument `qwindow` is specified, per segment of the day. Within GGIR part 5 MXLX is calculated per [window](https://wadpac.github.io/GGIR/articles/chapter12_TimeUseAnalysis.html#defining-the-time-windows). If used in combination with the GENEActiv or Axivity accelerometer brands, LUX estimates per LX and MX are also included in GGIR part 5 csv reports. The MX metric described here should not be confused by the MX metrics as proposed by [Rowlands et al.](https://doi.org/10.1186/s40798-019-0225-9) which looks at accumulated most active time which may not always be continuous in time. The MX metrics by Rowlands et al. are discussed [here](https://wadpac.github.io/GGIR/articles/chapter7_DescribingDataWithoutKnowingSleep.html#sets-of-quantiles-mx-metrics-by-rowlands-et-al-). -## Cosinor analysis and Extended Cosinor analysis +## (Extended) Cosinor analysis -**Disclaimer: This functionality is currently (2024) being revised. Once the enhancements have been incorporated in a GGIR release this section will also be updated.** +The Cosinor analysis quantifies the circadian 24 hour cycle. Cosinor analysis refers to fitting a cosine function to a log transformed time series, while the extended cosinor analysis refers to fitting a non-linear transformation of the traditional cosinor curve, after Marler et al. Statist. Med. 2006 (doi: 10.1002/sim.2466). -The (Extended) Cosinor analysis quantifies the circadian 24 hour cycle. Cosinor analysis refers to fiting a cosine function to a time series, while the extended cosinor analysis refers to fitting a cosine function to the transformed time series afrer Marler et al. Statist. Med. 2006 (doi: 10.1002/sim.2466). +Cosinor analyses are not run by default; to tell GGIR to perform these analyses, specify parameter `cosinor = TRUE`. The implementation is as follows: -To do this GGIR uses R package -[ActCR](https://CRAN.R-project.org/package=ActCR) as a dependency. Specify argument `cosinor = TRUE` to perform these analysis. +1. The acceleration metric as specified with parameter `acc.metric` is used. +2.
Acceleration metric values are averaged per minute and expressed in m*g* if the input is in *g*, and then log transformed as `log(acceleration + 1)`. +3. Invalid data points such as caused by non-wear are set to missing (`NA`) as we do not want the imputation used elsewhere in GGIR to influence the Cosinor analysis. We do this because imputation techniques generally come with some assumptions about circadian rhythm. +4. In part 2 GGIR uses all valid data in the recording while in part 6 we only use the valid data in the interval as defined with parameter `part6Window`, e.g. first wake-up time till last wake-up time. +5. GGIR looks for the first valid data point in the time series and then selects the maximum integer number of recording days following this data point. +6. If Daylight Saving Time occurs in the time series, then duplicated timestamps when the clock moves backward are ignored and missing timestamps when the clock moves forward are inserted as missing values. +7. Cosinor models are fitted using functions `ActCosinor` and `ActExtendCosinor` from R package [ActCR](https://CRAN.R-project.org/package=ActCR). Here, `ActExtendCosinor` uses an anti-logistic function for the transformation. +8. The time offset between the start of the time series as used and the following midnight is used to reverse offset the ActCR results, to ensure that acrophase and acrotime can be interpreted relative to midnight. +9. Time series corresponding to the fitted models are stored inside the part 2 milestone data in output subfolder `meta/ms2.out` to facilitate visual inspection. For the moment they are not used in any GGIR visualisation, but you may want to look them up and try to plot them yourself. They are stored in object `SUM$cosinor_ts`. ## Intradaily Variability (IV) and Interdaily Stability (IS) -IV and IS were first proposed by [Witting W et al. 1990](https://doi.org/10.1016/0006-3223(90)90523-5) and [van Someren EJ, et al. 1996](https://doi.org/10.1016/0006-3223(95)00370-3). -The implementation within GGIR part 2 is as follows: +IV and IS were first proposed by [Witting W et al. 1990](https://doi.org/10.1016/0006-3223(90)90523-5) and [van Someren EJ, et al. 1996](https://doi.org/10.1016/0006-3223(95)00370-3). -- Acceleration values are averaged per minute, and then log transformation as `log(acceleration converted to _mg_ + 1)`. -- Invalid data points such as caused by non-wear are set to missing (`NA`) in order to prevent the imputation approach used elsewhere in GGIR to influence the Cosinor analysis. We do this because imputation technique generally come with some assumptions about circadian rhythm. -- GGIR looks for the first valid data point in the recording and then selects the maximum integer number of recording days following this data point and feeds these to the ActCosinor and ActExtendCosinor functions of ActCR. The time offset between the start and the following midnight is then used to reverse offset the ActCR results, to ensure that acrophase and acrotime can be interpreted relative to midnight. -- In relation to Day Saving Time: Duplicated time stamps when clock moves backward are ignored and missing time stamps when clock moves forward are inserted as missing values. -- Time series corresponding to the fitted models are stored inside the part 2 milestone data to facilitate visual inspection. For the moment they are not used in any GGIR visualisation, but you may want to look them up and try to plot them yourself. -The implementation within GGIR part 6 is as follows: -- Only the time series are used in the interval as defined with parameter `part6Window`.
-- Make sure to set `save_ms5raw_without_invalid = FALSE`, after the revision it will also handle setting this to TRUE but for the moment TRUE introduces errors. - +- IS measures how constant the routine of activity is over several days and ranges from 0 to 1; values close to 1 indicate a more constant routine. -## Intradaily Variability (IV) and Interdaily Stability (IS) +- IV measures the variability in activity hour by hour throughout the days. It ranges from 0 to +$\infty$; values close to 2 indicate a more fragmented rhythm, and \>2 indicates an ultradian rhythm (very uncommon). + +The GGIR implementation of IV and IS since GGIR release 3.1-6 has been described in [Danilevicz et al. 2024](https://doi.org/10.1186/s12874-024-02255-w). This implementation replaces the experimental implementation of IS and IV that was present in GGIR since release 1.5-1. In the experimental implementation we were not sure how to go from raw acceleration signal to an indicator of being active as this aspect was not documented in the original publications. Similarly, we were not sure how to deal with missing data. However, these issues were both resolved in release 3.1-6: + +- Being active is now defined as a mean acceleration metric value above the light physical activity threshold as specified with parameter `threshold.lig`. +- Missing values are left missing and are not imputed; the algorithm now accounts for this. + +The new implementation as documented by [Danilevicz et al. 2024](https://doi.org/10.1186/s12874-024-02255-w) is not compatible with the older experimental implementation. Parameters `IVIS.activity.metric`, `IVIS_windowsize_minutes`, `IVIS_epochsize_seconds`, and `IVIS_acc_threshold` that were used before are no longer needed and have been deprecated. + +**Cosinor analysis compatible IV and IS** + +IS is sometimes used as a measure of behavioural robustness when conducting Cosinor analysis. However, to work with the combination of the two outcomes it seems important that IS is calculated from the same time series. Therefore, when `cosinor = TRUE`, IV and IS are calculated twice: once as part of the default IV and IS analysis as discussed above, and once as part of the Cosinor analysis using the same log transformed time series. + +The Cosinor-compatible IV and IS estimates are stored as output variables `cosinorIV` and `cosinorIS`. + +## phi + +Phi indicates how correlated the multi-day acceleration time series is with itself when there is an hour shift, also known as first-order auto-correlation from the first-order autoregressive model AR(1). A higher phi value indicates a higher autocorrelation, while a phi close to zero or even negative indicates more fragmented behavior. For a detailed discussion of phi see [Dickey and Fuller (1979)](https://doi.org/10.1080/01621459.1979.10482531) and [Danilevicz et al. 2024](https://doi.org/10.1186/s12874-024-02255-w). Phi is calculated by default in GGIR part 2 and in part 6 only when parameter `part6CR` is set to TRUE. + +## Detrended fluctuation analysis (DFA) + +### Self-similarity parameter (SSP) + +The self-similarity parameter (SSP) is also known as the scaling exponent or alpha. SSP is a real number between zero and two. Values in the range (0, 1) indicate stationary motion behaviour. Values in the range (1, 2) indicate nonstationary motion behaviour. For details see [Mesquita et al. 2020](https://doi.org/10.1093/bioinformatics/btaa955) and [Danilevicz et al. 2024](https://doi.org/10.1186/s12874-024-02255-w).
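+
+To build intuition for what SSP captures, below is a didactic sketch of first-order DFA on an evenly spaced series. This is not GGIR's implementation (GGIR exports dedicated `DFA`, `SSP`, and `ABI` functions) and the window sizes are arbitrary.
+
+```{r, eval = FALSE}
+# Didactic DFA(1) sketch: integrate the series, detrend it within windows
+# of increasing size, and estimate the slope of log(fluctuation) versus
+# log(window size). That slope is the scaling exponent (SSP, alpha).
+# Assumes length(x) is at least max(scales).
+dfa_alpha = function(x, scales = 2^(4:8)) {
+  x = x[!is.na(x)]
+  y = cumsum(x - mean(x)) # integrated (profile) series
+  fluct = sapply(scales, function(s) {
+    nwin = floor(length(y) / s)
+    rms = sapply(seq_len(nwin), function(i) {
+      seg = y[((i - 1) * s + 1):(i * s)]
+      t = seq_len(s)
+      sqrt(mean(residuals(lm(seg ~ t))^2)) # detrended RMS per window
+    })
+    mean(rms)
+  })
+  unname(coef(lm(log(fluct) ~ log(scales)))[2]) # log-log slope
+}
+```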
-**Disclaimer: This functionality is currently (2024) being revised. Once the enhancements have been incorporated in a GGIR release this section will also be updated.** +### Activity Balance Index (ABI) -### IV and IS - Default +The Activity Balance Index (ABI) was introduced by [Danilevicz et al. 2024](https://doi.org/10.1186/s12874-024-02255-w) and is a transformation of SSP. ABI measures how balanced the activity over the observed period is; higher values reflect a more balanced pattern of activity. ABI is a real number between zero and one and is calculated directly from the acceleration metric time series, without the need for cut-points. -The original implementation (argument `IVIS.activity.metric = 1`) uses the continuous numeric acceleration values. However, as we later realised, this is not compatible with the original approach by van Someren and colleagues, which uses a binary distinction between active and inactive. Therefore, a second option was added (argument `IVIS.activity.metric = 2`), which needs to be used in combination with accelerometer metric ENMO, and collapses the acceleration values into a binary score of rest versus active. This is the current default. +## Related output -### IV and IS - Cosinor analysis compatible +- MXLX are derived per 24 hours (part 2) and available per day in part2_daysummary.csv, and summarised per recording in part2_summary.csv. -IS is sometimes used as a measure of behavioural robustness when conducting Cosinor analysis. However, to work with the combination of the two outcomes it seems important that IS is calculated from the same time series. Therefore, when `cosinor = TRUE` IV and IS are calculated twice: Once as part of the default IV and IS analysis as discussed -above, and once as part of the Cosinor analysis using the same log transformed time series. More specifically, the IV and IS algorithm is applied with `IVIS.activity.metric = 2` and a threshold `IVIS_acc_threshold = log(20 + 1)` to make the binary distinction between active and inactive, and `IVIS_per_daypair = TRUE`. The setting `IVIS_per_daypair` was specifically designed for this context to handle the potentially missing values in the time series as used for Cosinor -analysis. Applying the default IVIS algorithm would not be able to handle the missing values and would result in a loss of information if all non-matching epochs across the entire recording were excluded. Instead, IV and IS are calculated as follows: +- MXLX are derived per window (part 6) but only stored as recording summary in part6_summary.csv. -1. Per day pair based on matching valid epochs only IV and IS and calculated. Here, a log is kept of the number of valid epochs per day pair. -2. Omit day pairs where the fraction of valid epoch pairs is below 0.66 (0.66 is hard-coded at the moment). -3. Calculate average IS across days weighted by fraction of valid epochs per day pairs. +- All other circadian rhythm variables are only derived at recording level. IV, IS, phi, and cosinor analysis variables are stored in part2_summary.csv and part6_summary.csv, while SSP and ABI are only stored in part6_summary.csv. -The new Cosinor-compatible IV and IS estimates are stored as output variables `cosinorIV` and `cosinorIS`. +- For a more detailed variable dictionary see annex on [GGIR output](https://wadpac.github.io/GGIR/articles/GGIRoutput.html).
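+
+To close this chapter, below is a didactic sketch of the classic IV and IS definitions by Witting et al. 1990 and of phi as the AR(1) coefficient, applied to a toy hourly series. It is meant for intuition only: GGIR's own implementation additionally dichotomizes activity with `threshold.lig` and accounts for missing values as described above.
+
+```{r, eval = FALSE}
+x = rnorm(24 * 7, mean = 30, sd = 10) # toy hourly series spanning 7 days
+n = length(x)
+xbar_h = tapply(x, rep(1:24, times = n / 24), mean) # average 24 h profile
+# IS compares the variance of the average 24 h profile with the overall
+# variance; IV compares hour-to-hour changes with the overall variance.
+IS = (n * sum((xbar_h - mean(x))^2)) / (24 * sum((x - mean(x))^2))
+IV = (n * sum(diff(x)^2)) / ((n - 1) * sum((x - mean(x))^2))
+# phi as the first-order autoregressive coefficient:
+phi = unname(coef(arima(x, order = c(1, 0, 0)))["ar1"])
+```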
diff --git a/vignettes/chapter14_BehaviouralFragmentation.Rmd b/vignettes/chapter14_BehaviouralFragmentation.Rmd
new file mode 100644
index 000000000..9c8394d70
--- /dev/null
+++ b/vignettes/chapter14_BehaviouralFragmentation.Rmd
@@ -0,0 +1,88 @@
+---
+title: "14. Behavioural fragmentation"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{14. Behavioural fragmentation}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>"
+)
+```
+
+In chapters 8, 9, and 10 we discussed the classification of sleep, and in chapter 11 we discussed the classification of daytime physical behavioural classes. These are typically reported as time spent per behavioural class. A complementary way of describing the data is by looking at the fragmentation of these behaviours over time.
+
+## Defining fragments
+
+In GGIR, a fragment for daytime is defined as a sequence of epochs that belong to one of the four categories:
+
+1. Inactivity
+2. Light Physical Activity (LIPA)
+3. Moderate or Vigorous Physical Activity (MVPA)
+4. Physical activity (can be either LIPA or MVPA)
+
+Each of these categories represents the combination of bouted and unbouted time in the respective categories. Inactivity and physical activity add up to a full day (outside SPT), as do inactivity, LIPA and MVPA.
+
+A fragment of SPT is defined as a sequence of epochs that belong to one of the four categories:
+
+1. Estimated sleep
+2. Estimated wakefulness
+3. Inactivity
+4. Physical activity (can be either LIPA or MVPA)
+
+With parameter `frag.metrics = "all"` we can instruct GGIR part 5 to derive behavioural fragmentation metrics. You may want to consider combining this with parameter `part5_agg2_60seconds = TRUE`, as that will aggregate the time series to 1 minute resolution, as is common in the behavioural fragmentation literature. GGIR part 6 performs fragmentation analysis when `part6CR` is set to `TRUE`. For this it uses the time series output generated in part 5, as discussed in the [previous chapter](https://wadpac.github.io/GGIR/articles/chapter12_TimeUseAnalysis.html#exporting-time-series).
+
+GGIR derives fragmentation metrics in two ways:
+
+- In part 5 fragmentation is quantified per waking hours of the day and reported per day and as recording average of the daily estimates.
+- In part 6 fragmentation is quantified based on all data in the recording within the window as specified by parameter `part6Window`.
+
+Calculation per day allows us to explore and possibly account for behavioural differences between days of the week. However, a day-level estimate could be considered less robust than the recording-level estimates as generated in part 6.
+
+The internal function `g.fragmentation` is used for the fragmentation metric calculation in both part 5 and part 6, ensuring that the calculations are otherwise consistent.
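+The parameters introduced above can be combined in a single call; a minimal sketch (the paths are hypothetical placeholders):
+
+```{r, eval = FALSE}
+library(GGIR)
+GGIR(datadir = "C:/mystudy/mydata",  # hypothetical path to the raw data
+     outputdir = "C:/mystudy",       # hypothetical output folder
+     mode = 1:6,                     # run GGIR parts 1 to 6
+     frag.metrics = "all",           # day-level fragmentation metrics in part 5
+     part5_agg2_60seconds = TRUE,    # aggregate time series to 1 minute resolution
+     part6CR = TRUE)                 # recording-level fragmentation in part 6
+```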
+## Fragmentation metrics
+
+Note that from the fragmentation metrics discussed below only fragmentation metrics `TP` and `NFrag` are calculated for the SPT fragments.
+
+- Coefficient of Variance (`CoV`) is calculated according to [Blikman et al. 2014](https://doi.org/10.1016/j.apmr.2014.08.023), which entails dividing the standard deviation by the mean lognormal transformed fragment length (minutes).
+
+- Transition probability (`TP`) from Inactivity (IN) to Physical activity (IN2PA), from Physical activity to inactivity (PA2IN), and from IN to LIPA or MVPA are all calculated according to [Danilevicz et al. 2024](https://doi.org/10.1186/s12874-024-02255-w).
+
+- Gini index is calculated with function `Gini` from the `ineq` R package, with `ineq` argument `corr` set to TRUE.
+
+- Power law exponent metrics: Alpha, x0.5, and W0.5 are calculated according to [Chastin et al. 2010](https://doi.org/10.1016/j.gaitpost.2009.09.002). Note that compared with the R package ActFrag, as described in [Junrui Di et al. 2017](https://www.biorxiv.org/content/10.1101/182337v1), we use the theoretical minimum fragment duration instead of the observed minimum fragment duration.
+
+- Number of fragments per minute (`NFragPM`) is calculated identically to the metric `fragmentation index` in [Chastin et al. 2012](https://academic.oup.com/ageing/article/41/1/111/46538), but it is renamed here to be a more specific reflection of the calculation. The term `fragmentation index` appears too generic given that all fragmentation metrics inform us about fragmentation. Please note that this metric is close to the transition probability metrics, because the total number of fragments divided by the total sum of durations equals 1 divided by the average duration, although the exact math is slightly different.
+
+## Conditions for calculation
+
+- Metrics `Gini` and `CoV` are only calculated if there are at least 10 fragments (e.g. 5 inactive and 5 active). If this condition is not met the metric value will be set to missing.
+
+- Metrics related to power law exponent alpha are also only calculated when there are at least 10 fragments, but with the additional condition that the standard deviation in fragment duration is not zero. If these conditions are not met the metric value will be set to missing.
+
+- Other metrics related to binary fragmentation (`mean_dur_PA` and `mean_dur_IN`) are calculated when there are at least 2 fragments (1 inactive, 1 active). If this condition is not met the value is set to zero.
+
+- Metrics related to `TP` are calculated if there is at least 1 inactivity fragment AND (1 LIPA OR 1 MVPA fragment). If this condition is not met the `TP` metric value is set to zero.
+
+To keep an overview of which recording days met the criteria for non-zero standard deviation and at least ten fragments, GGIR part 5 stores variable `Nvaliddays_AL10F` at person level (i.e., number of valid days with at least 10 fragments), and `SD_dur` (i.e., standard deviation of fragment durations) at day level as well as aggregated per person.
+
+## Key parameters
+
+The parameters related to cut-points and bout detection are mainly the parameters listed under ["Physical activity parameters"](https://wadpac.github.io/GGIR/articles/GGIRParameters.html#physical-activity-parameters).
+
+## Related output
+
+In the GGIR part 5 csv reports you will find:
+
+- Fragmentation metrics at day level per waking hours of the day
+
+In the GGIR part 6 csv report you will find:
+
+- Fragmentation metrics
+
+For an overview of output variables see the [GGIR output annex](https://wadpac.github.io/GGIR/articles/GGIRoutput.html).
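+Finally, as a toy illustration of how fragment counts, fragment durations, and transition probabilities relate (a conceptual sketch only, based on the note above that the total number of fragments divided by the total duration equals 1 divided by the average duration; GGIR's actual implementation lives in the internal function `g.fragmentation`):
+
+```{r, eval = FALSE}
+x = c(0, 0, 0, 1, 1, 0, 0, 1, 0, 0)      # toy 1-minute epochs: 1 = active, 0 = inactive
+runs = rle(x)                            # fragments as runs of identical values
+in_dur = runs$lengths[runs$values == 0]  # durations of the inactivity fragments
+# Transitions out of inactivity per minute of inactivity,
+# i.e. 1 / mean inactivity fragment duration:
+TP_IN2PA = length(in_dur) / sum(in_dur)
+```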
diff --git a/vignettes/chapter1_WhatIsGGIR.Rmd b/vignettes/chapter1_WhatIsGGIR.Rmd
index 6207e86fd..6b0310030 100644
--- a/vignettes/chapter1_WhatIsGGIR.Rmd
+++ b/vignettes/chapter1_WhatIsGGIR.Rmd
@@ -65,18 +65,18 @@ Further, we hope GGIR is of use to those without the financial resources for com

 ### Algorithm design

-The philosophy behind the algorithms as implemented in GGIR is that biomechanical explainable (heuristic or knowledge driven) approaches to measurement in science are preferable over purely data-driven approaches. Only when a knowledge driven approach is unrealistic we can consider a data-driven approach.
+The philosophy behind the algorithms as implemented in GGIR is that biomechanically explainable (heuristic or knowledge driven) approaches to measurement in science are preferable over purely data-driven approaches. Please note that this statement is specific to measurement in science rather than measurement in general, e.g. consumer wearables. Only when a knowledge driven approach is unrealistic should we consider a data-driven approach.

-The idea of a knowledge driven approach is that in order to advance insight in our field of research, it is essential to have an understanding of the causal relation between the phenomena being observed (e.g. acceleration of one body part), the way the (acceleration) sensor works, what we do with the data produced, and how we interpret the data. For example, we know that body acceleration relates to energy expenditure because of physics and human physiology. The abundance of scientific publications that have reported a positive correlation between accelerometer data and energy expenditure only served to confirm that existing knowledge was correct.
+The idea of a knowledge driven approach is that in order to advance insight, it is essential to have an understanding of the causal relation between the phenomena being observed (e.g. acceleration of one body part), the way the (acceleration) sensor works, what we do with the data produced, and how we interpret the data. For example, we know that body acceleration relates to energy expenditure because of physics and human physiology. The abundance of scientific publications that have reported a positive correlation between accelerometer data and energy expenditure only served to confirm prior knowledge.

-In contrast, data-driven methods focus on optimal correlation between sensor data and reference labels or values, and are much less concerned with causal associations that are the focus of knowledge driven approaches, as defined above. Identical to how correlation is not necessarily equal to causation in health research, the process of measurement can also be confounded. Some examples: We may see differences in body acceleration patterns that correlate with different activity types or different levels of energy expenditure, but that does not mean that we actually measure those activity types or energy expenditure levels. Ignoring such aspects can easily lead to overestimating the value of an accelerometer for measuring those constructs (activity type, etc) and to underestimate the value of an accelerometer of capturing acceleration as a useful measure of behaviour, if appropriately used and interpreted.
+In contrast, data-driven methods focus on optimal correlation between sensor data and reference labels or values, and are much less concerned with causal associations that are the focus of knowledge driven approaches, as defined above. Identical to how correlation is not necessarily equal to causation in health research, the process of measurement can also be confounded. Some examples: We may see differences in body acceleration patterns that correlate with different activity types or different levels of energy expenditure, but that does not mean that we actually measure those activity types or energy expenditure levels. Ignoring this distinction can easily lead to overestimating the value of an accelerometer for measuring those constructs (activity type, etc.) and to underestimating the value of an accelerometer for capturing acceleration as a useful measure of behaviour, if appropriately used and interpreted.

 A second problem with data-driven methods is that they heavily depend on the availability of reliable criterion methods. We argue that such reliable criterion methods do not exist for physical behaviour measurement:

 1. Indirect calorimetry and the indicators of energy metabolism that can be derived from it are unable to account for the activity type specific role of body weight on energy metabolism. This makes it impossible to make a standardised comparison of the energy cost of different activity types across individuals that differ in body weight. See also reflections in [this blog post](https://www.accelting.com/updates/why-does-ggir-facilitate-cut-points/).

-2. Polysomnography (PSG) is the standard in sleep research. PSG offers a physiological definition of sleep that is impossible to capture with a movement sensor. Therefore, we are forced to simplify our definition of ‘sleep’ towards a definition that can be captured by a movement sensor. As a result, the act of evaluating an accelerometer on its ability to classify sleep with PSG becomes somewhat meaningless as we already know that we are not measuring the same construct as PSG.
+2. Polysomnography (PSG) is the standard in sleep research. PSG offers a physiological definition of sleep that is impossible to capture directly with a movement sensor. Therefore, we are forced to simplify our definition of ‘sleep’ towards a definition that can be captured by a movement sensor. As a result, the act of evaluating an accelerometer on its ability to classify sleep with PSG becomes somewhat meaningless as we already know that we are not measuring the same construct as PSG.

 3. Activity types are ambiguous to define given the high number of ways they can be performed. This introduces a fundamental level of uncertainty about the robustness of models outside the datasets and context they were developed in.

 As a result, it is essential to put strong emphasis on algorithms that have descriptive value on their own regardless of whether they offer a high correlation with supposed criterion methods.

@@ -99,5 +99,6 @@ Everything you need to type in your R script is `highlighted like this`.

 This documentation is not intended as an academic review: We only cite publications to clarify the origin of algorithms and we only discuss what is part of GGIR.
+
 Finally, the first version of this documentation was sponsored by Accelting with the commitment that this will remain available as free open-access documentation.
-However, things like this are much easier to maintain as a community: We would be grateful for your help to improve the documentation either by giving feedback, pull requests (for those who know how to do it), or financially.
+However, open documentation is much easier to maintain as a community: We would be grateful for your help to improve the documentation either by giving feedback (e.g. via v.vanhees at accelting dot com), pull requests (for those who know how to do it), or financially.
 For example, it would be great if we had funding for creating high quality complementary info graphics and videos.
diff --git a/vignettes/chapter7_DescribingDataWithoutKnowingSleep.Rmd b/vignettes/chapter7_DescribingDataWithoutKnowingSleep.Rmd
index 5fe9783a5..161df210a 100644
--- a/vignettes/chapter7_DescribingDataWithoutKnowingSleep.Rmd
+++ b/vignettes/chapter7_DescribingDataWithoutKnowingSleep.Rmd
@@ -16,7 +16,7 @@ knitr::opts_chunk$set(
   collapse = TRUE,
   comment = "#>"
 )

 With the metric values imputed in the previous chapter, GGIR part 2 offers a first descriptive analysis of the data. Although we will have to wait till later GGIR parts (chapters) to see the segmentation of days in waking and sleep hours, there is already enough that can be quantified at this point.

-In this chapter we focus on descriptives of the data that are informative even without the knowledge on when the participant was sleeping or awake.
+In this chapter we focus on descriptive analyses of the data that are informative even without knowledge of when the participant was sleeping or awake.

 ## Data quality indicators

@@ -29,7 +29,7 @@ Descriptive variables are calculated and reported for valid days only, where the

 ### Average acceleration

-Average acceleration is known to be correlated with the activity-related energy expenditure.
+Average acceleration is known to be correlated with the total activity-related energy expenditure.
 GGIR part 2 provide two types of average acceleration:

 - Average per day, only stored when the day was considered valid.

@@ -39,9 +39,9 @@ GGIR part 2 provide two types of average acceleration:

 ### Acceleration distribution

-To get a more detailed description of the data, looking at the distribution of acceleration values can be informative. GGIR facilitates this in two ways:
+The distribution of acceleration values can be informative too. GGIR facilitates this in two ways:

-1. By specifying the quantiles of the distribution with parameter `qlevels`, which are fed to the build-in R function `quantile`, GGIR gives us the metric values corresponding to such quantiles (a quantile multiplied by 100 is the same as a percentile).
+1. By specifying the quantiles of the distribution with parameter `qlevels`, which are fed to the base R function `quantile`, GGIR gives us the acceleration metric values corresponding to such quantiles (a quantile multiplied by 100 is the same as a percentile).

 2. By describing the time spent in acceleration ranges, which are defined by parameter `ilevels`.

@@ -58,17 +58,17 @@ M10, as used in circadian rhythm research that also can be derived with GGIR (se

 To use the MX metrics as proposed by Rowlands et al., specify the durations of the 24h day during which you want to identify the accelerations values. For example, to generate the minimum acceleration value for the most active accumulated 30 minutes, you can call `qlevels = (1410/1440)`.

-This parameter also accepts nested terms to generate multiple MX metrics. For example, to call M60, M30, and M10, you can specify the following:
+This parameter also accepts a vector to generate multiple MX metrics. For example, to call M60, M30, and M10, you can specify the following:

 `qlevels = c(c(1380/1440), c(1410/1440), c(1430/1440))`.

 Note: If time segments shorter than 24 hours are specified in parameter `qwindow`, such as the 8-hour school day (as described in [Fairclough et al 2020](https://pubmed.ncbi.nlm.nih.gov/31593604/)), the denominator in `qlevels` should change from 1440 (24h) to the specific segment length. In this example, we would use 480 (8h). Accordingly, the argument to call M60, M30, and M10 would be:

-`qlevels = c(c(1380/1440), c(1410/1440), c(1430/1440))`.
+`qlevels = c(c(420/480), c(450/480), c(470/480))`.
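+To make the arithmetic explicit, a small helper could compute these quantile levels (`mx_qlevel` is a hypothetical function for illustration, not part of GGIR):
+
+```{r, eval = FALSE}
+# Quantile level for the most active `mx` minutes of a segment of `daylen` minutes
+mx_qlevel = function(mx, daylen = 1440) (daylen - mx) / daylen
+mx_qlevel(c(60, 30, 10))        # 24-h day: 1380/1440, 1410/1440, 1430/1440
+mx_qlevel(c(60, 30, 10), 480)   # 8-h school day: 420/480, 450/480, 470/480
+```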

 At the moment, this only works for one segment length and GGIR does not facilitate the generation of MX metrics for multiple unequal time segments within the same GGIR function call.

-The output in the part 2 summary files will refer to this as a percentile of the day.
+The output in the part 2 summary report file will refer to this as a percentile of the day.
 Thus, for a 24-h day, M30 will appear as “p97.91666_ENMO_mg_0.24hr”.

 To create the radar plots of these MX metrics as first described by [Rowlands et al.](https://pubmed.ncbi.nlm.nih.gov/31808014/), this [GitHub repository](https://github.com/Maylor8/RadarPlotGenerator) provides the R code and detailed instructions on how to make the radar plots using your own data.
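+As a toy illustration of the correspondence between an MX metric, its quantile, and the percentile in the variable name (fake data; not GGIR's internal code):
+
+```{r, eval = FALSE}
+set.seed(1)
+acc = pmax(rnorm(1440, mean = 30, sd = 20), 0)  # fake 1-minute ENMO values in mg
+M30 = quantile(acc, probs = 1410/1440)          # minimum of the most active 30 minutes
+100 * 1410/1440                                 # = 97.91666..., hence "p97.91666_ENMO_mg_0.24hr"
+```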