-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
function estimate_oc #38
Comments
…xtra columns with the estimated oc and the model used to estimate each sample, to a list with tho elements. The data frame that was the former output and a aditional object with the parameters of the models. #38
@costavale |
Craft et al 1991 function already changed for Maxwell et al 2023 |
updated tasks: -check function with @costavale data. The function does work if the oc column is empty. This was not the problem. -Change 10 sample limit for a warning (see issue #44 ). Right now the function does not fit models if there are less than 10 samples. We can allow it and give a warning. Not all models will be used (models with low r2 or p value are deleted). Need to save which models had less than 10 samples and check with final database if any of this model were used and provide the names of the cores in the warning. -Check if the om values used to estimate oc are within the range of the samples used to fit the models (see issue #44 ). Possible approach: add the minimum and maximum input om value of each model in a df and then check if the values used to estimate the oc in each core are between them. -write more test (only a few to check that the function checks type of data provided (numeric vs not numeric), but not about the funtionality of the function itself -check documentation enhancement 1: the function divide the df by ecossytem, dominant species and station. Due to this, it is compulsory to provide three columns with ecosystem, species and station. The user may not have diferent species or stations. Allow user to chose if it want this groups or not. Ecosystem column need to be there anyway just in case the models are no good and need to use published functions. |
… to 3 #38 Also added a warning if the model used had less than 10 samples and if the oc estimated is below or abouve the initial oc range to estimate the mode. Second and third task in issue #38. Seeing the warning is dificult because the warning for mangrove data with less than 5% of om prints for every sample insted of ones. I will try to fix this or add it to the issue tasks.
updated tasks: -check function with @costavale data. The function does work if the oc column is empty. This was not the problem. -write more test (only a few to check that the function checks type of data provided (numeric vs not numeric), but not about the funtionality of the function itself -check documentation enhancement 1: the function divide the df by ecossytem, dominant species and station. Due to this, it is compulsory to provide three columns with ecosystem, species and station. The user may not have diferent species or stations. Allow user to chose if it want this groups or not. Ecosystem column need to be there anyway just in case the models are no good and need to use published functions. |
I have solved three bugs I found while using this function with new data: fit_models would not fit site models if there was only one specie per station I have already fixed all three. Maybe those were the one @costavale found. |
Hi all, I am going to check these updates and test again the functions with that dataset. |
Yes sorry, I'm a little lost right now. Nerea, if you think I can help you with a specific job let me know. I think I'll be more helpful if the issues you assign me are code related - if you have any |
This function is huge and writing test that would test specific steps is a bit messi. I have decided to write test for each of the small functions within it. what do you think @Pakillo ? Maybe I should put all test in the same script insted of opening a new script for each function? |
The function crashes |
choose model function tested if the species of that sample was included in the species model, but didnt give and alternative if not. is fixed, now it skip the species model and select ecosystem model :) #38
Estimation of organic Carbon is done by means of linear regressions on log(organic carbon) ~ log(organic matter). It gives back a organic carbon value for each organic matter value provided. If there is a organic carbon value for that sample it return the same value, else, generates a model for that site, else, model for specie, else, model for Ecosystem. If a model can not be created due to the low number of samples (<10) it uses the equations in Howard et al. 2014 to estimate the organic carbon.
outputs:
The initial tibble or data.frame with three new columns:
In addition, a plot with the relationship between organic matter and estimated organic carbon
Works with example data but there is at least one bug.
tasks:
-the function does not work if the oc column is empty. The function estimate several models and then check p values and r2 to see if they are good to use. If there are not good models utilize published equations to estimate oc. But if the oc column is empty it can not estimate any model and it breaks. In this case it should go directly to to the published equations but it does not.
-change equation to estimate oc in salt marsh data if models can not be built. Now it comes from Craft et al. 1991. New equation in Maxwell et al. (2023). https://doi.org/10.1038/s41597-023-02633-x
-add a new output: df with the parameters of the models estimated
-write more test (only a few to check that the function checks type of data provided (numeric vs not numeric), but not about the funtionality of the function itself
-check documentation
enhancement 1: the function divide the df by ecossytem, dominant species and station. Due to this, it is compulsory to provide three columns with ecosystem, species and station. The user may not have diferent species or stations. Allow user to chose if it want this groups or not. Ecosystem column need to be there anyway just in case the models are no good and need to use published functions.
The text was updated successfully, but these errors were encountered: