Project 4: Develop an interactive application to help understand alpha and beta diversity metrics choices #4
So, I may have found an existing solution to my proposed project: MultiQC (GitHub link). It was published just over a year ago, is more robust, and has more functionality than my 16S rRNA use case needs. A quick Biostars/Google search could have saved me time 😅
@abaghela if you allow me, I have another proposition for a project I could lead that is specific to microbiome analysis. Let me know whether you have any concerns with this new proposed project. Thanks.
Title: Develop an interactive application to help understand alpha and beta diversity metrics choices
Problem: There are many alpha and beta diversity metrics for analyzing microbial ecology or microbiome data. Alpha diversity describes an estimate of the total number of species in a sample. Beta diversity describes the differences between samples. Below are some examples of the number of metrics you can use.
Plot from the "Alpha diversity graphics" page for phyloseq showing various alpha diversity metrics to choose from: http://joey711.github.io/phyloseq/plot_richness-examples
Below are just a few of the beta diversity metrics to choose from:
> library(phyloseq)
> unlist(distanceMethodList)
UniFrac1 UniFrac2 DPCoA JSD vegdist1 vegdist2
"unifrac" "wunifrac" "dpcoa" "jsd" "manhattan" "euclidean"
vegdist3 vegdist4 vegdist5 vegdist6 vegdist7 vegdist8
"canberra" "bray" "kulczynski" "jaccard" "gower" "altGower"
vegdist9 vegdist10 vegdist11 vegdist12 vegdist13 vegdist14
"morisita" "horn" "mountford" "raup" "binomial" "chao"
vegdist15 betadiver1 betadiver2 betadiver3 betadiver4 betadiver5
"cao" "w" "-1" "c" "wb" "r"
betadiver6 betadiver7 betadiver8 betadiver9 betadiver10 betadiver11
"I" "e" "t" "me" "j" "sor"
betadiver12 betadiver13 betadiver14 betadiver15 betadiver16 betadiver17
"m" "-2" "co" "cc" "g" "-3"
betadiver18 betadiver19 betadiver20 betadiver21 betadiver22 betadiver23
"l" "19" "hk" "rlb" "sim" "gl"
betadiver24 dist1 dist2 dist3 designdist
"z" "maximum" "binary" "minkowski" "ANY"
With so many metrics to choose from, how do you know which is the "best", and how will your data affect the calculation of these metrics?
Proposed Project: Create an interactive Shiny application that shows how your chosen alpha or beta diversity metrics change based on simulated or real data. Some of these metrics are sensitive to species that are counted only once or twice, so it will be useful to see how different distributions of counts change these metrics and your interpretation of them. The application should be designed to give an intuitive understanding of how these metrics work.
Possible Requirements:
|
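To make the proposal's point about single and double counts concrete, here is a minimal sketch in plain Python. It is not phyloseq's or vegan's implementation, and the sample counts are made up for illustration. It assumes the standard textbook definitions of observed richness, the Shannon index, bias-corrected Chao1, Bray-Curtis dissimilarity, and a presence/absence Jaccard distance.

```python
import math


def observed_richness(counts):
    """Number of species with a nonzero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index: -sum(p * ln p) over species with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singletons and doubletons."""
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return observed_richness(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def jaccard_presence_absence(a, b):
    """Jaccard distance on presence/absence: 1 - shared / union."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)


# Hypothetical samples with the same total count (100 reads each).
# sample_b moves two reads from the fourth species into two singletons.
sample_a = [50, 30, 15, 5, 0, 0]
sample_b = [50, 30, 15, 3, 1, 1]

for name, counts in (("a", sample_a), ("b", sample_b)):
    print(name, observed_richness(counts), round(shannon(counts), 3), chao1(counts))

print("Bray-Curtis:", bray_curtis(sample_a, sample_b))
print("Jaccard (presence/absence):", round(jaccard_presence_absence(sample_a, sample_b), 3))
```

Moving just two reads raises observed richness from 4 to 6 and Chao1 from 4 to 7, while the abundance-weighted Bray-Curtis dissimilarity between the two samples stays small and the presence/absence Jaccard distance is much larger. This is the kind of contrast an interactive application could let users explore.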
@erictleung Hi Eric, we approve your change in project. We are looking forward to this new one! |
Assignments are out, really looking forward to collaborating in this 👍 |
@ampatzia thanks for your interest! I've created a bare repository to hold this project. I plan on getting a base Shiny application up later this week so people can get up and running, along with some ideas of what could be in the application itself. If I come up with anything else, I'll let you know! 😄 |
Some good articles to use while working on this project will be http://shiny.rstudio.com/articles/. It has lots of content on getting started, building the structure, frontend and backend sides of the application, and improving it. |
Hey team lead, we've been gathering GitHub IDs for your team members. We see that you've already started a repo for this project. So could you please add the following people as collaborators to that project?
aimirza
Once the people are added, it'd be a great idea to start a discussion on that repo with information to get your team members started (e.g. some small suggested reading, things to look up, etc). We will also be adding everyone to Slack and creating a specific channel for each project. This may be an easier way to communicate. We'll forward on any remaining GitHub IDs through this issue.
Thanks, Jake |
@jakelever thanks! |
Hi, one more Github ID for you: cabrerad Thanks, Jake |
And one last one: scatcher125 Cheers, Jake |
@jakelever both added! Thanks for the update. |
And actually one more Github ID: szhan |
Develop an interactive application to facilitate informed sequencing quality control decisions for downstream analysis on many samples
There's a saying in computer science, "garbage in, garbage out": the quality of your input influences downstream analyses. Genome sequencing has decreased in cost, so experiments can have many more samples. Manually checking each sample can be time consuming and less precise. So I propose the development of a web application or tool where you can drop in your samples and interactively explore their quality.
This tool could be built by various means. One option would be to develop a Shiny R application, which would require knowledge of R, the Shiny package, and possibly HTML/CSS/JavaScript. Another would be to rely on web development standards (HTML/CSS/JS) to build something like an Electron application that is cross-platform and user friendly.
This idea stems from my experience dealing with 16S rRNA sequencing samples. I had a single experiment collect about 200 samples, with a total of about 400 samples for paired-end sequencing. Manually viewing all 400 samples is time consuming. Additionally, further analysis of sequencing reads typically requires some trimming, because quality diminishes with longer reads. This tool could also be designed to recommend an ideal trim length based on your specifications: either a hard threshold that trims all samples to the same length, or a dynamic threshold on a per-sample basis. The choice of trimming parameter will depend on whether the downstream tools can handle varying read lengths.
Team Lead: Eric Leung | [email protected] | @erictleung | Grad Student | Oregon Health & Science University, USA |
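The trim-length recommendation described in the proposal could be sketched roughly as follows. This is only an illustration under stated assumptions: reads are given as lists of Phred quality scores that have already been parsed (FASTQ parsing is not shown), the cutoff `min_quality=25` is an arbitrary example value, and the "hard" threshold is assumed to be the shortest per-sample recommendation so that every sample can be trimmed to the same length. The function and sample names are hypothetical.

```python
def mean_quality_by_position(reads):
    """Mean Phred score at each position, across reads of possibly different lengths."""
    longest = max(len(r) for r in reads)
    means = []
    for pos in range(longest):
        scores = [r[pos] for r in reads if len(r) > pos]
        means.append(sum(scores) / len(scores))
    return means


def recommend_trim_length(reads, min_quality=25):
    """Number of leading positions to keep: everything before the first
    position whose mean quality drops below min_quality."""
    means = mean_quality_by_position(reads)
    for pos, quality in enumerate(means):
        if quality < min_quality:
            return pos
    return len(means)


def trim_lengths(samples, min_quality=25):
    """Return (per-sample trim lengths, one hard trim length for all samples).

    Assumption: the hard threshold is the minimum of the per-sample values.
    """
    per_sample = {
        name: recommend_trim_length(reads, min_quality)
        for name, reads in samples.items()
    }
    return per_sample, min(per_sample.values())


# Made-up quality scores for two hypothetical samples, two reads each.
samples = {
    "sample_1": [[38, 38, 37, 30, 22, 12], [38, 37, 36, 31, 20, 10]],
    "sample_2": [[38, 36, 28, 18, 12, 8], [37, 35, 26, 20, 14, 6]],
}

per_sample, hard = trim_lengths(samples)
print("dynamic (per sample):", per_sample)
print("hard (all samples):", hard)
```

The dynamic result keeps more bases for the higher-quality sample, while the hard result trades some of those bases for uniform read lengths, which matters if downstream tools cannot handle varying lengths. Samples with no reads are not handled in this sketch.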