Bowtie alignment runs too long and never finishes
3.6 years ago
xiaoleiusc ▴ 140

Hi, All,

I have 6 small RNA library datasets from Illumina NextSeq sequencing. I use the Bowtie aligner (not Bowtie2) to align the reads to the hg19 human genome. Five of the six datasets finished alignment within 30 minutes, but one has been running for 6 hours and never seems to finish. I would appreciate any input on how to fix this problem. My command is below; I use the same command for all my datasets.

bowtie -f -v 2 -m 2 --best --strata -p 8 -S /PATH/hg19_index input.data output.sam 2>&1 | tee output.log
RNA-Seq ChIP-Seq • 3.5k views

There is no good answer for this question. If the process is producing output then obviously let it go. If you think it is hung then kill/restart.


Thanks, Genomax. At first the run produced an output SAM file that kept growing, but once the file reached about 1.2 GB it stopped growing, and the program never exits.


If you feel that the job is hung, restarting it is the sensible thing to do, certainly if nothing has been added to the output file in a couple of hours. Verify that no error was thrown in the log before you restart.
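One way to judge "nothing has been added in a couple of hours" is to look at the output file's modification time rather than guess. The following is only a minimal sketch: the function names and the two-hour default are illustrative assumptions, not anything Bowtie provides.

```python
import os
import time


def seconds_since_last_write(path, now=None):
    """Return how many seconds have passed since `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def looks_stalled(path, max_idle_seconds=2 * 60 * 60, now=None):
    """Return True if `path` has not been written to for longer than the threshold.

    The two-hour default mirrors the "couple of hours" rule of thumb above;
    it is an assumption, so adjust it to your own data.
    """
    return seconds_since_last_write(path, now) > max_idle_seconds


# Example (hypothetical file name from the command in the question):
# if looks_stalled("output.sam"):
#     print("output.sam has not grown recently; check the log, then restart")
```

A stale modification time only shows that the aligner stopped writing; it does not say why, so the log is still worth checking first.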


Hi Genomax, the run did not generate a log file while it was running (it only writes a log of mapped-read statistics after the run finishes), so I don't know what is going wrong. However, if I cancel the job and run the following command on the SAM file:

samtools view -bS output.sam > output.bam

I see the following error messages:

[E::sam_parse1] SEQ and QUAL are of different length
[W::sam_read1] Parse error at line 7662831
[main_samview] truncated file.

It still produces an output BAM file, but that file is incomplete and alignments to certain genomes are missing.

Thanks.


Sounds like you have a problem read in that file where the SEQ/QUAL are not the same length. You could use a program like this to find the problem read.
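As a rough illustration of such a check (not the program linked above), the sketch below scans SAM text and reports records whose SEQ and QUAL columns differ in length, or that have fewer than the 11 mandatory SAM fields, which is what a record cut off by a killed job looks like. The function name and the message wording are assumptions made for this example.

```python
def find_seq_qual_mismatches(sam_lines):
    """Yield (line_number, read_name, reason) for malformed alignment records.

    `sam_lines` is any iterable of SAM text lines, e.g. an open file handle.
    """
    for line_number, line in enumerate(sam_lines, start=1):
        # Skip header lines and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            # Typical for a record that was only partly written.
            yield (line_number, fields[0],
                   "only %d of 11 mandatory fields" % len(fields))
            continue
        seq, qual = fields[9], fields[10]
        # QUAL may be "*" (not stored); otherwise it must match SEQ in length.
        if qual != "*" and len(seq) != len(qual):
            yield (line_number, fields[0],
                   "SEQ length %d != QUAL length %d" % (len(seq), len(qual)))


# Example (hypothetical file name from the commands above):
# with open("output.sam") as handle:
#     for line_number, read_name, reason in find_seq_qual_mismatches(handle):
#         print(line_number, read_name, reason)
```

If the only hit is the very last line of the file, the record was most likely truncated when the job was cancelled rather than being a bad read in the input.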


Thanks, Genomax. I tried the Bowtie hg19 index downloaded from iGenomes and the problem was solved. I also rebuilt the index from my hg19 FASTA file with bowtie-build, which solved the problem as well. Apparently my original Bowtie index was defective.
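The thread does not say what was wrong with the original index. One inexpensive sanity check, sketched here under that caveat, is to compare the sequence names in the reference FASTA with the names stored in the index, which `bowtie-inspect -n` prints one per line. The helper names are hypothetical, and names are compared on their first whitespace-delimited token, which is an assumption about how the headers should be matched.

```python
import subprocess


def fasta_sequence_names(fasta_path):
    """Return the first token of every '>' header line in a FASTA file."""
    names = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                names.append(parts[0] if parts else "")
    return names


def index_sequence_names(index_prefix):
    """Return reference names stored in a Bowtie index via `bowtie-inspect -n`.

    Requires bowtie-inspect on PATH; raises CalledProcessError if it fails.
    """
    result = subprocess.run(
        ["bowtie-inspect", "-n", index_prefix],
        capture_output=True, text=True, check=True,
    )
    return [ln.split()[0] for ln in result.stdout.splitlines() if ln.strip()]


def names_missing_from_index(fasta_names, index_names):
    """Return FASTA sequence names that do not appear in the index, in order."""
    present = set(index_names)
    return [name for name in fasta_names if name not in present]


# Example (hypothetical paths; the index prefix is elided in the question):
# missing = names_missing_from_index(
#     fasta_sequence_names("hg19.fa"),
#     index_sequence_names("/PATH/hg19_index"),
# )
# print(missing or "all FASTA sequences are present in the index")
```

A clean result here does not prove the index is sound; rebuilding with bowtie-build, as described above, remains the more reliable fix.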
