Extracting rRNA sequences from an Illumina metagenomic dataset
6.7 years ago
marcelelaux ▴ 20

Hello,

I'm trying to extract rRNA sequences from my metagenomic datasets in order to run a diversity analysis in QIIME. The overall goal is to compare the microdiversity of one target species as estimated by MUMmer on the entire dataset with the microdiversity estimated by the QIIME diversity analysis.

So I chose rRNASelector and Metaxa to extract the sequences.

I'm having trouble running rRNASelector on a Unix server over ssh.
It starts, the dialog box opens on my screen, and I choose the reads file (FASTA), but then I get this error message:
"Starting hmmsearch for forward using lib/16s_bact_for3.hmm
Not right file format! -> Abort"

My HMMER installation seems fine: when I type hmmsearch -h, it prints the help information.
I've tried pointing it to every possible path for hmmsearch, but it just crashes. On top of that, the dialog box is so slow that I can't even change the length parameter...
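One thing that can be ruled out quickly is the input file itself. The snippet below is a minimal sketch, assuming the reads are meant to be plain nucleotide FASTA. I don't know what rRNASelector actually checks before it prints "Not right file format!", so this only confirms that the file starts with a '>' header and contains nothing but IUPAC nucleotide codes. The function name check_fasta and the max_records limit are my own, not part of any tool.

```python
import io

# IUPAC nucleotide codes plus the gap character; anything else is flagged.
ALLOWED = set("ACGTUNRYKMSWBDHV-")


def check_fasta(handle, max_records=1000):
    """Scan up to max_records records and return (ok, message)."""
    records = 0
    has_sequence = False
    for line_no, raw in enumerate(handle, start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are tolerated
        if line.startswith(">"):
            # A new header must not directly follow another header.
            if records > 0 and not has_sequence:
                return False, f"line {line_no}: previous header has no sequence"
            if records == max_records:
                break  # enough records inspected
            records += 1
            has_sequence = False
        else:
            if records == 0:
                return False, f"line {line_no}: data before the first '>' header"
            bad = set(line.upper()) - ALLOWED
            if bad:
                return False, f"line {line_no}: unexpected characters {sorted(bad)}"
            has_sequence = True
    if records == 0:
        return False, "no '>' header found"
    if not has_sequence:
        return False, "last header has no sequence"
    return True, f"{records} record(s) look like nucleotide FASTA"


if __name__ == "__main__":
    # In-memory examples: a small FASTA and a FASTQ record passed by mistake.
    print(check_fasta(io.StringIO(">read1\nACGTTGCA\n>read2\nGGNNCC\n")))
    print(check_fasta(io.StringIO("@read1\nACGT\n+\nIIII\n")))
```

For a real file, pass an open handle instead of the in-memory examples, e.g. `with open("reads.fasta") as fh: print(check_fasta(fh))`, where reads.fasta is a placeholder name. If the file passes, the problem is more likely on the rRNASelector or HMMER side than in the reads.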

In parallel, I'm running Metaxa, and it seems to work well, but it is finding too few sequences in my metagenomic dataset. It is a low-richness dataset, in fact, but even so I was expecting more. My output for bacterial sequences is 61 reads out of a total of 110504 sequences. These sequences are Illumina paired-end reads merged with FLASH, with an average of 81% of read pairs combined and an average length of 293.

The sequences used for rRNASelector are the same.

If someone has experience with this kind of workflow, I would appreciate any suggestions. I don't have much experience, and maybe I am missing something or doing something wrong.

Thank you so much!

rRNASelector Metaxa qiime • 2.4k views

If you copy/paste your exact rRNASelector commands, there is a higher chance of getting help.

Are you using the command line or a graphical session over ssh? A graphical X server session over ssh can be really slow.


61 reads out of 110,000 might not be too far off what you expect. I have had 1 in 1,000 reads from a metagenome map to SSU rRNA, and that was after removal of the human sequences from an oral sample. Although maybe you left off a digit or two from the number of starting reads?

6.7 years ago
marcelelaux ▴ 20

Hello, yes, I am trying to use the graphical session over ssh. I just typed java -jar RNAselector.jar and then the dialog box opens, but it's really slow and it's not working. Could I run it from the command line alone? That would be great!

Thank you


You could try EMIRGE. Indeed, it seems rRNASelector is a GUI-only application.

6.7 years ago
marcelelaux ▴ 20

Thank you for all the suggestions!

I am working with the output from Metaxa2, since it seems pretty accurate.

This is my extraction summary. I would be grateful if you could comment and tell me what you think of these ratios:

I have two datasets of metagenomic sequences (Illumina paired-end, demultiplexed). Each dataset has 2 samples. The input for the Metaxa2 pipeline was the original fastq files (R1, R2) for each sample. My extraction summary for bacterial SSU rRNA:

For the first dataset, an oligotrophic freshwater sample (very low DNA content), I got 149 SSU rRNA sequences from a total of 135069 sequences for one sample and 399 SSU rRNA sequences from 416595 for the other sample. For the second dataset, a hypereutrophic freshwater sample (a lot of DNA), I got 2070 rRNA sequences from a total of 3008033 for one sample and 1448 from a total of 2405944 for the other sample.
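To make these counts easier to compare, here is a small sketch that turns them into fractions. The counts are the ones quoted in this thread; the sample labels ("sample 1", "sample 2") and the function name ssu_fraction are my own shorthand, not Metaxa2 output.

```python
def ssu_fraction(hits, total):
    """Fraction of sequences classified as SSU rRNA."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total


# (label, SSU rRNA hits, total sequences), as reported in the posts above.
samples = [
    ("FLASH-merged reads (first post)", 61, 110504),
    ("oligotrophic, sample 1", 149, 135069),
    ("oligotrophic, sample 2", 399, 416595),
    ("hypereutrophic, sample 1", 2070, 3008033),
    ("hypereutrophic, sample 2", 1448, 2405944),
]

for label, hits, total in samples:
    frac = ssu_fraction(hits, total)
    # Print as a percentage and as "1 in N" for comparison with the
    # "1 in 1,000" figure mentioned in the reply above.
    print(f"{label}: {hits}/{total} = {frac:.4%} (about 1 in {1 / frac:.0f})")
```

All five values come out between roughly 0.05% and 0.11%, so they are in the same order of magnitude as the 1 in 1,000 mentioned in the reply above.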

Thank you for any comment!!
