build ECM model #22
Since Mike hasn't responded...
One approach would be to use the PhyloCSF Omega method, which does not require a precomputed ECM model: https://github.com/mlin/PhyloCSF/wiki/Omega-Test
This is slower and not as accurate as using PhyloCSF with a precomputed ECM model, but might be adequate.
Another alternative is that I might be able to (eventually) compute an ECM model for you.
- On what plant clade are you working?
- Do you have multiple alignment files for the whole genome in MAF format?
- What is your time frame?
On Apr 11, 2018, at 4:45 AM, AlexWanghaoming wrote:
Dear Dr. Lin:
If I want to use PhyloCSF in plants, how do I build an ECM model for my species?
PhyloCSF is intended to be used on many-species alignments, such as 12-flies or 29-mammals. As far as I know, it has never been used on alignments of only two species, particularly two as far apart as Arabidopsis and Zea (which diverged about 125 mya: https://en.wikipedia.org/wiki/Monocotyledon#Evolution); it needs to have species that are far enough apart to have many synonymous substitutions in the region you are looking at, but not so far apart that the same site has changed more than once since divergence. I doubt that an alignment of just those two species would be adequate to distinguish sORFs.
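The trade-off between "enough substitutions" and "not the same site changing more than once" can be illustrated with a small calculation. This is only a sketch under an assumed simple Poisson model of substitutions per site (not something PhyloCSF itself computes this way); the function name `hit_probabilities` and the distances in the loop are hypothetical.

```python
import math
from typing import Tuple


def hit_probabilities(d: float) -> Tuple[float, float]:
    """Return (P(site changed at least once), P(site changed more than once)).

    Assumes the number of substitutions at a site is Poisson-distributed
    with mean d (expected neutral substitutions per site). This is an
    illustrative assumption, not PhyloCSF's model.
    """
    if d < 0:
        raise ValueError("distance must be non-negative")
    p_none = math.exp(-d)
    p_one = d * math.exp(-d)
    return 1.0 - p_none, 1.0 - p_none - p_one


# Hypothetical distances, in expected neutral substitutions per site.
for d in (0.05, 0.2, 1.0, 2.0):
    p_any, p_multi = hit_probabilities(d)
    print(f"d={d:4.2f}  changed: {p_any:.3f}  changed more than once: {p_multi:.3f}")
```

At small distances few sites carry any signal, while at large distances a growing share of sites has been hit repeatedly, which is the range problem described above.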
If you can't get alignments with more species, but are determined to try to use Arabidopsis and Zea, my suggestion is to test the PhyloCSF omega method on a set of known coding and non-coding regions of the approximate length you are looking for, to see how well it can distinguish them. Make sure to use coding and non-coding regions that are covered by the alignment -- the results will definitely not be informative on regions for which there is no alignment, and only 2% of the Zea genome and 16% of the Arabidopsis genome are covered (http://plants.ensembl.org/mlss.html?mlss=9461).
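One possible way to summarize such a test is the area under the ROC curve (AUC): the probability that a randomly chosen known coding region scores higher than a randomly chosen known non-coding region. The sketch below assumes you have already collected one score per region by whatever means; the function name `auc` and all score values are hypothetical, made up for illustration.

```python
from typing import Sequence


def auc(coding_scores: Sequence[float], noncoding_scores: Sequence[float]) -> float:
    """Fraction of (coding, non-coding) pairs in which the coding region
    scores higher; ties count as half. 0.5 means no separation, 1.0 means
    perfect separation."""
    if not coding_scores or not noncoding_scores:
        raise ValueError("need at least one score in each set")
    wins = 0.0
    for c in coding_scores:
        for n in noncoding_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(coding_scores) * len(noncoding_scores))


# Hypothetical scores for regions that are covered by the alignment.
coding = [12.3, 4.1, -0.5, 8.8]
noncoding = [-6.2, -1.0, 0.7, -3.4]
print(f"AUC = {auc(coding, noncoding):.4f}")
```

An AUC close to 0.5 on regions of the target length would suggest the two-species alignment cannot separate coding from non-coding regions.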
On Apr 18, 2018, at 12:20 PM, AlexWanghaoming wrote:
Thanks for your reply.
I have downloaded the pairwise alignment of Arabidopsis and Zea mays from Ensembl Plants (MAF format). My goal is to find conserved sORFs in these two plants. Could you please help me build the ECM? What language do you use? Could you please share the script?
Thank you again!
PhyloCSF requires a multi-species alignment of species in the right range of genomic distance (for example, 29 placental mammals). I am not aware of such an alignment for sharks.
Do you have such an alignment?
If not, you will not be able to use PhyloCSF. In that case you might try CPAT (Wang et al., Nucleic Acids Research, 2013). It does not require an alignment, but I would not expect it to do as well at finding short ORFs or ORFs in transcripts with incorrect transcript structure (missing exons, etc.).
On May 20, 2019, at 4:17 AM, lanyuchunmo wrote:
Dear Dr. Jungreis:
I have a similar question. I have been analysing RNA-seq data and have identified many candidate lncRNAs (40000+). Now I want to filter this data set using PhyloCSF. However, my species is the bamboo shark, which is not included in any of the reference data sets provided with the current release of PhyloCSF. Does this mean that I can't use this software to complete the task? Is there any other solution?
Creating your own alignments is theoretically possible, though it would be a lot of work (and I'm not an expert in that so I couldn't help you). The smallest tree that I've used for PhyloCSF has 7 species. You will want a total phylogenetic branch length more than 1 neutral substitution per site, but individual branches a lot less than 1 neutral substitution per site. You can get a sense of the kind of phylogenetic tree that would be adequate by clicking on the links under "Available phylogenies" here: https://github.com/mlin/PhyloCSF/wiki; the trees are labeled with a scale in neutral substitutions per site.
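As a rough way to check a candidate tree against the guidance above, the sketch below sums the branch lengths in a Newick string and reports the longest single branch. The tree, the function names, and the `max_branch_limit` value of 0.5 are all hypothetical; the only figures taken from the advice above are "total more than 1" and individual branches "a lot less than 1", and what counts as "a lot less" is a judgment call. The regex parser is minimal and assumes unquoted labels without colons.

```python
import re
from typing import Dict, List, Union


def branch_lengths(newick: str) -> List[float]:
    """Extract every ':length' value from a Newick string."""
    pattern = r":\s*([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)"
    return [float(x) for x in re.findall(pattern, newick)]


def tree_summary(newick: str, max_branch_limit: float = 0.5) -> Dict[str, Union[float, bool]]:
    """Total and longest branch length, in the tree's own units
    (assumed here to be neutral substitutions per site).

    max_branch_limit is an assumed stand-in for "a lot less than 1".
    """
    lengths = branch_lengths(newick)
    if not lengths:
        raise ValueError("no branch lengths found in tree")
    total = sum(lengths)
    longest = max(lengths)
    return {
        "total": total,
        "longest": longest,
        "total_ok": total > 1.0,
        "longest_ok": longest < max_branch_limit,
    }


# Hypothetical 7-species tree, not taken from any real PhyloCSF phylogeny.
tree = "((A:0.10,B:0.12):0.05,(C:0.20,D:0.18):0.07,(E:0.30,(F:0.15,G:0.16):0.04):0.02);"
print(tree_summary(tree))
```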
I am not an expert in CPAT, so I'd suggest you direct any questions about it to its authors.
On May 21, 2019, at 3:17 AM, lanyuchunmo wrote:
Dear Dr. Jungreis:
I have no such alignment. If I can find transcriptome data for some closely related species, can I make one myself?
I tested CPAT online, but the web server only supports Human (hg19), Mouse (mm9 and mm10), Fly (dm3) and Zebrafish (Zv9). Can I use zebrafish as a reference for my own species to complete the analysis?
I noticed that CPAT's analysis parameters can be customized for any species, but this requires coding and non-coding gene sequences from that species. However, my purpose is to identify non-coding RNAs, so I don't have any shark non-coding RNAs yet.