sortmerna terminates without generating the file with rRNA being removed
3.3 years ago

Hello everyone

I need help with the sortmerna tool. I ran sortmerna to remove rRNA from my fastq files. However, when the walltime of 1-12:00:00 ran out, the command stopped without generating the aligned file or the cleaned files. The reference file for human rDNA was downloaded from the link below. The command I used is as follows:

sortmerna --ref rRNA.fasta --reads $sample --aligned "${file}_rna" --other "${file}_clean" --threads 20 -v --fastx

human rDNA

I would highly appreciate any suggestions.

Thank you so much


Was 1-12:00:00 the time limit for your job? sortmerna can easily take longer than that in some cases. What were the last messages it printed to stderr and stdout?


Here is stderr

[Screenshot of stderr; image not available]


Here is stdout

[Screenshot of stdout; image not available]


Don't post screenshots of errors; they are impossible to decipher. Please use pastebin.com to post the logs if they are long.


I am sorry about the screenshots. Please find the uploaded log file at this link: log-file-sortmerna


Thank you for the logs and the screenshots. This is the log for the key-value database, which isn't very informative here.

Could you please run sortmerna like so and share smrlog.txt with us?

sortmerna --ref rRNA.fasta --reads "$sample" --aligned "${file}_rna" --other "${file}_clean" --threads 20 -v --fastx --workdir "${PWD}" | tee -a "${PWD}/smrlog.txt"

(Please note, this will put all the working files in the directory you are currently in, so switch to an appropriate, perhaps empty, directory first.)

Just to hasten the debugging process, I suggest you subset roughly 100 reads from your fastq file (e.g., just take the first 100 reads; you might find SeqKit helpful here), and run sortmerna on those reads only.
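If SeqKit is not available on the cluster, taking the first 100 reads can also be done with standard tools. The following is only a minimal sketch: it assumes plain 4-line FASTQ records (no wrapped sequence lines), and the function name and file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: write the first N reads of a FASTQ file to a new file.
# Assumption: every record is exactly 4 lines (header, sequence, '+', qualities).
# first_n_reads and the file names below are hypothetical, not part of sortmerna.
first_n_reads() {
    local in_fastq="$1" n_reads="$2" out_fastq="$3"
    local n_lines=$(( n_reads * 4 ))
    case "$in_fastq" in
        # Gzipped input: stream-decompress, then keep the first N records.
        *.gz) gzip -dc "$in_fastq" | head -n "$n_lines" > "$out_fastq" ;;
        # Plain-text input: just keep the first N records.
        *)    head -n "$n_lines" "$in_fastq" > "$out_fastq" ;;
    esac
}

# Example (hypothetical file names):
# first_n_reads sample.fastq.gz 100 subset_100.fastq
```

The output is always uncompressed, so the subset can be passed to sortmerna directly with --reads.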


I ran the command as suggested and the log file was generated. The command stopped when the specified wall time of 2 days ran out, without generating any results. I used 56 processors to run the command. Please find the log file at the link below; I look forward to any further suggestions. log-file

3.3 years ago
Dunois ★ 2.5k

Some questions:

  • So you ran this on a compute cluster? Are you executing via srun or sbatch?
  • How is sortmerna installed?
  • Are your output paths actually accessible and writable?
  • Also try pointing --workdir to /data/shilpia2/NOR.sequecnes/temp/. (Could be that sortmerna has no access to ${PWD}.)
  • Please try the run with this test read set. With your rRNA reference, it should indicate 3 or 4 matches in aligned.log once sortmerna is done. The run itself shouldn't take any longer than a few seconds.

It doesn't make any sense that it does not execute at all. Since you're running this on slurm it would be nice if you could share both stderr and stdout.
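For the question about output paths, one way to check from inside the job script is sketched below. This is only an illustration: the function name is hypothetical, and it simply tests that the directory exists and that a file can really be created in it.

```shell
#!/usr/bin/env bash
# Sketch: verify that a directory exists and that files can be created in it.
# check_writable_dir is a hypothetical helper, not part of sortmerna or Slurm.
check_writable_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 1
    fi
    # Permission bits can be misleading on network filesystems,
    # so also try to create and remove a small probe file.
    local probe
    probe=$(mktemp "$dir/.write_test.XXXXXX") || {
        echo "cannot create files in: $dir"
        return 1
    }
    rm -f "$probe"
    echo "ok: $dir"
}

# Example: run this in the sbatch script before calling sortmerna.
# check_writable_dir "${PWD}" || exit 1
```

Running it on the compute node rather than the login node matters, since the two may not see the same filesystems.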


In answer to your questions:

  1. I did run it on an HPC cluster, and we use sbatch to run the command.
  2. I do not know how it was installed.
  3. Yes, the output paths are accessible and writable.
  4. I am pointing my --workdir to /data/shilpia2/NOR.sequecnes/temp/
  5. I will work on the test read set.
  6. I have a question which I forgot to mention. Can we use sortmerna for DNA sequencing data as well? Besides RNA-seq, I also have DNA-sequencing data with rDNA contamination.
  7. I am using reference DNA from the following link. rDNA reference. Do we also need to create an index file? I could not create one because the indexdb_rna command was not working with sortmerna.

Thank you so much.


I ran the test file as suggested and have attached the results at this link: test_rna_results. I have also included both stderr and stdout in the uploaded file. I would like to know whether the command executed correctly.

My next question: can we use sortmerna to remove rDNA from DNA-sequencing data?


Hmm, it looks like the test run executed properly without a hitch (take a look at the log file). I can only speculate, then, that there are issues with your input fasta/fastq file. I noticed that it is compressed. Perhaps try decompressing the file before feeding it to sortmerna? It is plausible that the tool is not handling the compressed file properly (even though it should).
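A minimal sketch of the decompression step, assuming the input is gzip-compressed and ends in .gz (the function name is hypothetical). It checks the archive first and writes the decompressed copy next to the original, which is left untouched.

```shell
#!/usr/bin/env bash
# Sketch: decompress a gzipped FASTQ to a new file, keeping the original.
# decompress_fastq is a hypothetical helper; it prints the output path.
decompress_fastq() {
    local gz_file="$1"
    local out_file="${gz_file%.gz}"
    # Refuse inputs without a .gz suffix, otherwise input and output would collide.
    if [ "$out_file" = "$gz_file" ]; then
        echo "expected a .gz file: $gz_file" >&2
        return 1
    fi
    # Integrity check; a truncated or corrupt archive fails here.
    gzip -t "$gz_file" || return 1
    gzip -dc "$gz_file" > "$out_file" || return 1
    printf '%s\n' "$out_file"
}

# Example, reusing the command from this thread:
# reads=$(decompress_fastq "$sample") || exit 1
# sortmerna --ref rRNA.fasta --reads "$reads" --aligned "${file}_rna" \
#     --other "${file}_clean" --threads 20 -v --fastx
```

The gzip -t check is there because a corrupt archive is another possible cause that is cheap to rule out at the same time.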

As for your point 6: I think it'll work on any data as long as the alphabet in the reference and the input match. (You should confirm this with the developers, but I don't see any reason why this wouldn't be the case.)

Regarding point 7: I don't think you need to create any index files. The execution syntax I had indicated in the test run should suffice for all cases.

And I think your other question (in the comment I am replying to) is addressed by my response to point 6.


Thank you so much for your response.


You're welcome. Let me know how it goes!!


Hi

I was able to run sortmerna to filter rDNA from DNA-sequencing data. The fastq file has to be unzipped before running sortmerna. I took 100 reads and was able to filter out rDNA from the sequencing data.

Thanks


So it was just the compressed file then? I hope everything goes smoothly from here on.


Yes, the problem was just with the compressed file. Thank you so much.


I have moved a comment (one that preserves the flow of the thought process) to an answer. You can accept it to provide closure to this thread.
