Question: Bowtie2 end-to-end alignment does not seem to work for FASTA files?
Hi, I use bowtie2-2.2.9 to align FASTA reads against some genes. I am not sure whether I understand end-to-end alignment in Bowtie2 correctly. As I understand it, suppose we have a read like the one below:

>Read1
TGCGGAATTTGATACACGTACATAAGTACGTGTTGGCTTATGCTTGCGTACGCTGAAACATGCTGACCTTTTTTTAAAACGCCCTTGTC

If we use end-to-end mode (which seems to be the default), the alignment should involve all the characters in the read. But in my results I see what look like local alignments, such as the one below, which uses just 8 characters of the read.

Read1   16  Gene.1  19  1   8M  *   0   0   TAAAAAAA    IIIIIIII    AS:i:0  X

I ran Bowtie2 with these options:

bowtie2 -f -x RefGene -U merged.fasta -S output.txt -p 6 --no-hd --no-sq --no-unal

Also, I am sure that Read1 is longer than 8 bases; its length is 88. Do I need to add an option when running bowtie2 to force it to align end-to-end?

modified 23 months ago • written 23 months ago by m.koohi.m110

I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in the toolbar when you compose or edit a post.

modified 23 months ago • written 23 months ago by WouterDeCoster39k

Is there a specific reason to use bowtie2 with FASTA reads instead of something like BLAT? You should also include the exact command line you used for the bowtie2 alignment to provide full context.

modified 23 months ago • written 23 months ago by genomax67k

I always thought that bowtie2 aligned end to end by default and that you would need to pass extra parameters to make it work differently.

In your example note how you don't have clipping on the CIGAR string. This implies that your original sequence was just 8bp long. But then I don't think bowtie2 actually works for sequences that short. In addition the reported sequence cannot be found in the read that you wrote above. So ... lots of inconsistencies there...
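The point about clipping can be checked mechanically. Below is a minimal Python sketch, not part of bowtie2 itself, that reads a CIGAR string and reports how many read bases it accounts for and how many are clipped. The function names and the `40S8M40S` string are hypothetical, made up for illustration; only `8M` comes from the reported record.

```python
import re

# CIGAR operations that consume bases of the read, per the SAM specification.
# Hard clips (H) are not stored in SEQ, so they are deliberately excluded.
QUERY_CONSUMING = set("MIS=X")


def parse_cigar(cigar):
    """Split a CIGAR string such as '8M' into (length, operation) pairs."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


def seq_length_from_cigar(cigar):
    """Length the SEQ field must have for this CIGAR string."""
    return sum(n for n, op in parse_cigar(cigar) if op in QUERY_CONSUMING)


def clipped_bases(cigar):
    """Number of read bases marked as soft (S) or hard (H) clipped."""
    return sum(n for n, op in parse_cigar(cigar) if op in "SH")


# The CIGAR from the reported record: 8 read bases, nothing clipped.
print(seq_length_from_cigar("8M"), clipped_bases("8M"))  # 8 0

# A hypothetical CIGAR for an 88-base read aligned locally over 8 bases.
print(seq_length_from_cigar("40S8M40S"), clipped_bases("40S8M40S"))  # 88 80
```

If an 88-base read had been aligned locally over 8 bases, the remaining 80 bases would be expected to show up as clipping, which `8M` does not have.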

Show both the command that you are running and the actual line that gets reported.

modified 23 months ago • written 23 months ago by Istvan Albert ♦♦ 80k

@WouterDeCoster Thanks for the suggestion and the tutorial.

@genomax I am processing metagenomics files and found Bowtie2 to be very fast. I haven't tried BLAT. Do you think it is as fast as Bowtie2?

@Istvan Albert Thanks for your comment. Actually, I am sure that the length of the read is not 8. It is 88. I updated my question.

written 23 months ago by m.koohi.m110

Your alignment shows a sequence TAAAAAAA that is not present in the read that you show.

In addition when an alignment takes place aligners will indicate how much of the read is clipped with the S or H letters. It is strange that your SAM does not do that. In addition the alignment line that you report is incomplete, note how it ends with X and does not show an MD tag.

You should show the complete SAM record and show the complete input sequence. Right now it still looks like some sort of inconsistency regarding either the data or the alignment. Hence we cannot troubleshoot it.
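One way to rule out a data inconsistency is to list the length of every record in the input file and look for unexpectedly short reads or repeated names. The sketch below is a plain-Python illustration; `fasta_lengths` is a hypothetical helper, and the two records in the example are invented, not taken from merged.fasta.

```python
def fasta_lengths(lines):
    """Yield (name, length) for each FASTA record; sequences may span several lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        else:
            length += len(line)
    if name is not None:
        yield name, length


# Invented example records, for illustration only.
example = [">ReadA some description", "ACGT", "AC", ">ReadB", "TAAAAAAA"]
for name, length in fasta_lengths(example):
    print(name, length)
# ReadA 6
# ReadB 8

# On a real file (path taken from the command in the question):
# with open("merged.fasta") as handle:
#     short = [(n, l) for n, l in fasta_lengths(handle) if l < 20]
```

The threshold of 20 in the commented usage is arbitrary; any cutoff well below the expected read length would do.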

modified 23 months ago • written 23 months ago by Istvan Albert ♦♦ 80k

A value of 16 in the second field (FLAG) of the SAM record means the read aligned to the reverse strand. The reverse complement of "TAAAAAAA" is "TTTTTTTA", which is present in the read.
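This check can be reproduced with a few lines of Python. It is a minimal sketch: the helper names are hypothetical, the read is copied from the question, and `0x10` is the reverse-strand bit defined for the SAM FLAG field.

```python
# Read sequence as posted in the question.
READ = "TGCGGAATTTGATACACGTACATAAGTACGTGTTGGCTTATGCTTGCGTACGCTGAAACATGCTGACCTTTTTTTAAAACGCCCTTGTC"

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_reverse(flag):
    """True if the SAM FLAG has the 0x10 (reverse strand) bit set."""
    return bool(flag & 0x10)


flag, sam_seq = 16, "TAAAAAAA"  # FLAG and SEQ from the reported SAM line

# SEQ is stored relative to the reference forward strand, so undo that
# for reverse-strand alignments to recover the sequence as it was read.
original = reverse_complement(sam_seq) if is_reverse(flag) else sam_seq
print(original)          # TTTTTTTA
print(original in READ)  # whether the recovered sequence occurs in the read
```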

written 23 months ago by m.koohi.m110

Ah indeed, good point, the sequences are always reported on the forward strand. I missed that.

written 23 months ago by Istvan Albert ♦♦ 80k