Question: Parsing a Megan (RMA) File
irazoqui.matias wrote (2.8 years ago):

Hi,

I'm having a problem with MEGAN 5. I'm working with some quite large RMA files (approx. 40 GB), built from the trimmed reads and a BLAST run. The problem is that my (poor little) PC always hangs when I try to open them. So I was wondering if there's a way to parse a file via the MEGAN command line, dividing it into "Bacteria", "Archaea" & "Viruses" or whatever, so the files become a little bit smaller. Thanks!

Charles Warden (Duarte, CA) wrote (2.8 years ago):

I'm not very familiar with MEGAN, but I would imagine a BLAST search could get quite time-consuming if searching a database of all known metagenomics sequences, especially if you have over 1 million reads.

Did you amplify ribosomal RNA sequences? If so, these are some programs that should provide less computationally intensive options:

1) RDPclassifier (web-based or local .jar file) - https://rdp.cme.msu.edu/classifier/classifier.jsp

2) mothur - http://www.mothur.org/

3) QIIME - http://qiime.org/

They don't work with RMA files, but you must have had some sort of sequence file to produce the RMA file. If you have .fastq files, mothur and QIIME can take those as input (and you can convert them to .fasta files if you want to try the RDPclassifier as a standalone tool).
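The .fastq to .fasta conversion mentioned above can be sketched in a few lines of plain Python. This is only a minimal illustration, assuming the common four-line FASTQ layout (header, sequence, `+` line, qualities, with no wrapped sequences); the function name `fastq_to_fasta` is made up for this example and is not part of mothur, QIIME, or the RDPclassifier.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Convert four-line FASTQ records to FASTA; return the number of records written.

    Assumption: each record is exactly four lines (no wrapped sequences).
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = fastq_handle.readline().rstrip("\r\n")
        plus = fastq_handle.readline().rstrip("\r\n")
        fastq_handle.readline()  # quality line, not needed for FASTA
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near: {header!r}")
        # FASTA header is the FASTQ header with '@' swapped for '>'
        fasta_handle.write(f">{header[1:]}\n{seq}\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory demo; real use would pass open file handles instead.
    demo = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\n!!!!\n")
    out = io.StringIO()
    fastq_to_fasta(demo, out)
    print(out.getvalue(), end="")
```

For real data you would open the input and output files with `open(...)` and pass the handles in; because records are read one at a time, memory use stays small even for large files.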


Thanks Charles for replying. Yes, I have the 16S sequences and I've already used QIIME. But now I'm working on the WGS reads and I wanted to do another taxonomic analysis (because by using 16S, you leave behind viruses and eukaryotes). That's why I tried MEGAN. For the BLAST part, I used DIAMOND, which is waaay faster than regular BLAST (although each search takes almost a day). I got that one covered, but the results I get are killing my PC (still waiting for budget approval to buy a new one...)

Reply written 2.8 years ago by irazoqui.matias
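One generic way to make the search results easier to handle on a small machine is to cut the hit table into smaller pieces before doing anything else with it. The sketch below is only an illustration under stated assumptions: the DIAMOND results are in BLAST tabular format (read ID in the first tab-separated column) and all hits for a read sit on consecutive lines. The function name `split_hits_by_read` is hypothetical, and whether smaller inputs actually help MEGAN on a given PC is not something this thread tested.

```python
def split_hits_by_read(lines, reads_per_chunk):
    """Yield lists of tabular hit lines, each covering at most reads_per_chunk reads.

    Assumptions: column 1 is the read (query) ID, and all hits for one read
    are on consecutive lines, so a read is never split across two chunks.
    """
    if reads_per_chunk < 1:
        raise ValueError("reads_per_chunk must be at least 1")
    chunk = []
    reads_in_chunk = 0
    last_read = None
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        read_id = line.split("\t", 1)[0]
        if read_id != last_read:
            # A new read starts here; close the chunk first if it is full.
            if reads_in_chunk == reads_per_chunk:
                yield chunk
                chunk, reads_in_chunk = [], 0
            reads_in_chunk += 1
            last_read = read_id
        chunk.append(line)
    if chunk:
        yield chunk


if __name__ == "__main__":
    # Toy input: three reads, the first with two hits.
    demo = ["r1\thitA\n", "r1\thitB\n", "r2\thitC\n", "r3\thitD\n"]
    for i, part in enumerate(split_hits_by_read(demo, 2), start=1):
        print(f"chunk {i}: {len(part)} lines")
```

With a real file you would iterate over the open file handle and write each yielded chunk to its own output file, choosing `reads_per_chunk` so that one chunk fits comfortably in memory.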

Ok - I haven't tested any of the following programs, but it is possible that it might help you to use a different method to quantify species abundances that doesn't depend upon your BLAST file:

http://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0997-x

http://www.ccb.jhu.edu/software/centrifuge/

This one is really for transcriptomes, but it might still work:

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0969-1

http://taxonomer.iobio.io/

Reply written 2.8 years ago by Charles Warden