Mothur alignment issues
6.7 years ago
BrINClHOF • 0

Hi all, I'm currently using mothur for my first bioinformatics project and have run into a problem when aligning to the SILVA database with the align.seqs command. When I enter the command, it seems to run for an eternity, 15 hours before giving up, and gives no results on a 213k base pair sequence. Has anybody else had this problem, or does anyone have tips for getting around it? Thanks in advance for the help.

alignment • 2.6k views

Your query sequence is 213k long?


I believe so; I'm still new, so my terminology may be shaky. The initial sequence length in my contigs was 460k, and I trimmed it down to 213k. I'll post what it gives me to clarify:

Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... DONE.
It took 2044 to read  213119 sequences.
DONE.
It took 2050 to read  213119 sequences.
DONE.
It took 2058 to read  213119 sequences.
DONE.
It took 2075 to read  213119 sequences.
DONE.
It took 2185 to read  213119 sequences.
DONE.
It took 2187 to read  213119 sequences.
DONE.
It took 2194 to read  213119 sequences.
DONE.
It took 2199 to read  213119 sequences.
DONE.
It took 2238 to read  213119 sequences.

But after this it continues to spit out numbers for hours.


OK, you have 213k sequences which is fine. Can you print the command line you used?
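For anyone unsure whether a number like 213k refers to a sequence count or a sequence length, a quick check outside mothur can help. Below is a minimal Python sketch, not part of mothur; the function names `fasta_lengths` and `describe` are made up for illustration. It counts the records in a FASTA file and reports their lengths, skipping the `.` and `-` gap characters so it can be pointed at an aligned file as well.

```python
from statistics import median


def fasta_lengths(path):
    """Return the number of bases in every record of a FASTA file."""
    lengths = []
    current = None  # stays None until the first header line is seen
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                # Do not count alignment gap characters as bases.
                current += len(line) - line.count("-") - line.count(".")
    if current is not None:
        lengths.append(current)
    return lengths


def describe(path):
    """Summarize a FASTA file: record count plus min/median/max length."""
    lengths = fasta_lengths(path)
    if not lengths:
        return {"num_seqs": 0, "min": None, "median": None, "max": None}
    return {
        "num_seqs": len(lengths),
        "min": min(lengths),
        "median": median(lengths),
        "max": max(lengths),
    }
```

Calling `describe("silva.v4.align")` should give a `num_seqs` value comparable to what mothur prints as the number of sequences read, which makes it easier to tell the two meanings apart.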


I copied and pasted the command line along with the previous 3 commands and their outputs. The output of the align.seqs command begins in my previous comment. Thanks for reaching out, I appreciate it.

mothur > pcr.seqs(fasta=silva.nr-v132.align, start=8000, end=27000, keepdots=F, processors=10)
Using 10 processors.
mothur > system(copy silva.nr_v132.pcr.align silva.v4.align)
1 file(s) copied.
mothur > summary.seqs(fasta=silva.v4.align)

Using 10 processors.

             Start    End      NBases    Ambigs      Polymer   NumSeqs
Minimum:     32       13967    350       0           3         1
2.5%-tile:   183      18996    456       0           4         5328
25%-tile:    183      18996    465       0           4         53280
Median:      183      18996    483       0           5         106560
75%-tile:    183      18996    489       0           6         159840
97.5%-tile:  183      18996    643       1           7         207792
Maximum:     2264     18997    1705      5           24        213119
Mean:        183.05   18996    492.466   0.0669391   5.03908
# of Seqs:   213119

Output File Names: 
silva.v4.summary

It took 49 secs to summarize 213119 sequences.
mothur > align.seqs(fasta=dnr.trim.contigs.good.unique.fasta, reference=silva.v4.align)

Using 10 processors.
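
One way to narrow down a hang like this is to try align.seqs on a small subset of the query file first: if a few hundred sequences align in seconds, the input itself is probably fine and the problem lies elsewhere. The following is a rough Python sketch for making such a subset; `write_first_records` and the file name `subset.fasta` are hypothetical and not part of mothur.

```python
def write_first_records(src_path, dst_path, n):
    """Copy the first n FASTA records from src_path to dst_path.

    Returns the number of records actually copied, which is smaller
    than n when the source file holds fewer records.
    """
    copied = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                # Stop at the header that would start record n + 1.
                if copied == n:
                    break
                copied += 1
            # Lines before the first header are skipped.
            if copied > 0:
                dst.write(line)
    return copied
```

For example, `write_first_records("dnr.trim.contigs.good.unique.fasta", "subset.fasta", 500)` writes a 500-record file, which could then be passed to mothur as align.seqs(fasta=subset.fasta, reference=silva.v4.align).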
6.6 years ago
BrINClHOF • 0

I tried a few different options to get around this error. The only one that worked was running mothur on Linux instead of Windows, and I have been able to process the entire pipeline smoothly ever since.
