Project/Package Idea: Multivariate outliers in genome scans #9

Open
DrK-Lo opened this issue Mar 8, 2015 · 7 comments

Comments

@DrK-Lo

DrK-Lo commented Mar 8, 2015

I have some code that can be used to identify outliers in multivariate space. I could see this as a way to combine results from multiple test statistics in genome scans (e.g. FST, genetic-environment association, genotype-phenotype association) to get multivariate outliers, rather than relying on outliers from individual statistics (note that I also think there are some caveats to this approach). I think this could feasibly be developed into a package during the hackathon, and I would love to try it on some simulated or real datasets. If anyone thinks this is a good idea, let me know.
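
To make the idea concrete, here is a minimal sketch of one standard way to score multivariate outliers: the squared Mahalanobis distance of each locus from the centroid of all loci. This thread does not say which methods the existing code uses, so this is an illustration under stated assumptions rather than the proposed method. The three statistics, the simulated values, and all function names are hypothetical. Python is used only for illustration; the eventual package would presumably be in R.

```python
import random


def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, means):
    """Sample covariance matrix (n - 1 denominator)."""
    n, p = len(rows), len(means)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        dev = [x - m for x, m in zip(row, means)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += dev[i] * dev[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    p = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * p
    for i in reversed(range(p)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, p))
        x[i] = (aug[i][p] - s) / aug[i][i]
    return x


def mahalanobis_sq(rows):
    """Squared Mahalanobis distance of every row from the centroid of all rows."""
    means = mean_vector(rows)
    cov = covariance_matrix(rows, means)
    distances = []
    for row in rows:
        dev = [x - m for x, m in zip(row, means)]
        z = solve(cov, dev)  # z = cov^{-1} dev, without forming the inverse
        distances.append(sum(d * zi for d, zi in zip(dev, z)))
    return distances


def simulate_loci(n_neutral=500, seed=1):
    """Hypothetical toy data: three standardized statistics per locus.

    Statistics 1 and 2 are strongly positively correlated among neutral loci.
    The last locus is planted at (2, -2, 0): about 2 SD from the mean on the
    first two axes, but against the correlation between them.
    """
    rng = random.Random(seed)
    loci = []
    for _ in range(n_neutral):
        s1 = rng.gauss(0, 1)
        s2 = 0.9 * s1 + (1 - 0.9 ** 2) ** 0.5 * rng.gauss(0, 1)
        s3 = rng.gauss(0, 1)
        loci.append([s1, s2, s3])
    loci.append([2.0, -2.0, 0.0])
    return loci


loci = simulate_loci()
d2 = mahalanobis_sq(loci)
top_index = max(range(len(d2)), key=lambda i: d2[i])
print("locus with the largest distance:", top_index)
```

In this toy setup the planted locus is only about two standard deviations from the mean on each of the first two statistics, so a strict per-statistic cutoff could miss it, yet it gets the largest multivariate distance because it breaks the correlation between them. One caveat of this particular sketch is that the mean and covariance are estimated from all loci, outliers included, so a robust estimator may be preferable on real data.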

@darencard

I would certainly be interested in something like this and haven't really seen anything that attempts it. It would be nice to have some way of summarizing the various genetic statistics (Fst, Tajima's D, etc.) we use in sliding-window analyses. Usually you'll see a separate plot for each statistic, and the number of plots grows fast when Fst is calculated between multiple populations.

@zkamvar

zkamvar commented Mar 9, 2015

Is there a link to the description of the method?

@thibautjombart
Contributor

Cool stuff. I have had requests for exactly that kind of thing on the
adegenet forum. You're most welcome to join the adegenet development team
and put this there. I'm also happy to help if you go for a separate package,
e.g. with ensuring data class compatibility.

Cheers
Thibaut


@DrK-Lo
Author

DrK-Lo commented Mar 9, 2015

I just wrote a blog post outlining two approaches to identify multivariate outliers.

Thibaut - I am an intermediate R user and really have no idea about data classes. I'm all about compatibility though!

@thibautjombart
Contributor

Katie - I like the material you put on the blog!
Very happy to discuss stuff with you next week.
If you have any toy dataset with some outliers, that will probably be super
useful.

On Mon, Mar 9, 2015 at 1:06 PM, Katie Lotterhos wrote:

I just wrote a blog post outlining two approaches to identify multivariate outliers:
https://sites.google.com/site/katielotterhos/opennotebooks/k-lo/multivariateoutliersingenomescans

Thibaut - I am an intermediate R user and really have no idea about data classes. I'm all about compatibility though!

@DrK-Lo
Author

DrK-Lo commented Mar 9, 2015

Thanks. I do have some published simulations (Lotterhos and Whitlock 2014, 2015) that I tested these approaches on at one point.
I would like to try them on some other simulations or real data if people are interested.

@smanel

smanel commented Mar 9, 2015

I can provide real data to test it on (already on the wiki), or we could work on it in collaboration. Also happy to discuss more.
