Question: Mothur alignment issues
BrINClHOF (United States) wrote 9 months ago:

Hi all, I'm currently using mothur for my first bioinformatics project and have run into a problem when aligning to the SILVA database with the align.seqs command. When I enter the command it seems to run for an eternity, 15 hours before giving up, and gives no results on a 213k base pair sequence. I was wondering whether anybody else has had this problem or has any tips for getting around it. Thanks in advance for the help.


Your query sequence is 213k long?

written 9 months ago by Asaf

I believe so, but I'm still new, so my terminology may be shaky. The initial sequence length in my contigs was 460k and I trimmed it down to 213k. I'll post what it gives me to clarify:

Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... Reading in the silva.v4.align template sequences... DONE.
It took 2044 to read  213119 sequences.
DONE.
It took 2050 to read  213119 sequences.
DONE.
It took 2058 to read  213119 sequences.
DONE.
It took 2075 to read  213119 sequences.
DONE.
It took 2185 to read  213119 sequences.
DONE.
It took 2187 to read  213119 sequences.
DONE.
It took 2194 to read  213119 sequences.
DONE.
It took 2199 to read  213119 sequences.
DONE.
It took 2238 to read  213119 sequences.

But after this it continues to spit out numbers for hours.

written 9 months ago by BrINClHOF (modified 8 months ago by genomax)

OK, you have 213k sequences which is fine. Can you print the command line you used?

written 9 months ago by Asaf
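
For anyone unsure whether "213k" refers to the number of sequences or to the length of a single sequence, a quick check along these lines may help. This is only a minimal sketch, assuming a plain FASTA file; the function name `fasta_stats` is hypothetical and not part of mothur, and mothur's own summary.seqs command reports the same kind of information.

```python
from statistics import median


def fasta_stats(lines):
    """Return (number of sequences, min, median, max length) for FASTA lines.

    Hypothetical helper for illustration only; gap characters ('.' and '-')
    are not counted, so it also works on aligned FASTA files.
    """
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += sum(1 for ch in line if ch not in ".-")
    if current is not None:
        lengths.append(current)
    if not lengths:
        return 0, 0, 0, 0
    return len(lengths), min(lengths), median(lengths), max(lengths)


# Small in-memory example: two records, gaps ignored.
example = [">seq1", "ACGT-ACGT", ">seq2", "AC..GT"]
print(fasta_stats(example))
```

To run it on a real file, pass an open file handle instead of the list, for example `with open("dnr.trim.contigs.good.unique.fasta") as handle: print(fasta_stats(handle))`. A count near 213k with lengths of a few hundred bases would mean 213k sequences, not one 213k bp sequence.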

I copied and pasted the command line, along with the previous 3 commands and their outputs. The output of the align.seqs command begins in my previous comment. Thanks for reaching out, I appreciate it.

mothur > pcr.seqs(fasta=silva.nr_v132.align, start=8000, end=27000, keepdots=F, processors=10)
Using 10 processors.
mothur > system(copy silva.nr_v132.pcr.align silva.v4.align)
1 file(s) copied.
mothur > summary.seqs(fasta=silva.v4.align)

Using 10 processors.

              Start     End       NBases    Ambigs      Polymer   NumSeqs
Minimum:      32        13967     350       0           3         1
2.5%-tile:    183       18996     456       0           4         5328
25%-tile:     183       18996     465       0           4         53280
Median:       183       18996     483       0           5         106560
75%-tile:     183       18996     489       0           6         159840
97.5%-tile:   183       18996     643       1           7         207792
Maximum:      2264      18997     1705      5           24        213119
Mean:         183.05    18996     492.466   0.0669391   5.03908
# of Seqs:    213119

Output File Names: 
silva.v4.summary

It took 49 secs to summarize 213119 sequences.
mothur > align.seqs(fasta=dnr.trim.contigs.good.unique.fasta, reference=silva.v4.align)

Using 10 processors.

written 9 months ago by BrINClHOF (modified 8 months ago by genomax)
BrINClHOF (United States) wrote 8 months ago:

I tried a few different options to get around this error. The only one that worked was running mothur on Linux instead of Windows, and I have been able to process the entire pipeline smoothly ever since.
