Question: Why does Bowtie2 report reads as unmapped even when their sequences exactly match the reference?
6.5 years ago by
shl198410
United States
shl198410 wrote:

Hi all,

I aligned my RNA-seq reads against a reference genome using tophat, with its default aligner, bowtie2.

I also used the default parameters:

tophat -p 8 -G $annotation -o out $database L1_1.fq.gz L1_2.fq.gz

After getting the results, I found that some reads in the unmapped.bam file have exactly the same sequence as the reference. The following is one line from the unmapped.sam file:

DGZN8DQ1:360:H9RN8ADXX:1:1101:4791:1895	69	*	0	255	*	*	0	0	TTTTGCTTTCTGACTCTGTGCTTGTGCCTTCAAGACTTTCACAACGATTTTCTGCTCCTCAATAAGGAAAGCCCGAGATCGGAAGAGCACACGTCTGAAC	CCCFFFFFHHHHHJJJJJJJIJJJHIJJJJJIJJJIJJJJIJJJJJIJJJJJJJJJJJJIJIJJJJJIJJJJJJHHFFDEDDDDDDDDDDDDDDDDDCCD
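
For readers unfamiliar with the second column of a SAM record, here is a minimal Python sketch that decodes the FLAG value (69 in the line above). The bit meanings come from the SAM specification; the function name `decode_flag` and the short labels are made up for this illustration.

```python
# Bit values are defined by the SAM specification; the labels are
# informal names chosen for this sketch.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [label for bit, label in SAM_FLAG_BITS.items() if flag & bit]


if __name__ == "__main__":
    # 69 = 1 + 4 + 64
    print(decode_flag(69))
```

So the record is the first read of a pair and is flagged as unmapped, which is consistent with it ending up in the unmapped file.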

Does anyone know why bowtie2 doesn't treat those reads as mapped? Thanks.

rna-seq tophat quality bowtie2 • 2.5k views
modified 6.5 years ago by Devon Ryan98k • written 6.5 years ago by shl198410
6.5 years ago by
Devon Ryan98k
Freiburg, Germany
Devon Ryan98k wrote:

Dirty little secret: bowtie2 doesn't always find exact matches. If you change the order of reads in a file you'll sometimes get different alignment results for them. I've never bothered to find the reason, since this ends up affecting very few reads.


Hi Devon, thank you very much. I just tried mapping with bowtie2 directly instead of tophat, and the number of mapped reads increased a little. I also BLASTed the unmapped reads, and most of them matched mouse ribosomal RNA.

I didn't change the annotation file, and I made sure there are rRNA entries in the gff file. In this case, the reads should map to the reference, but they didn't.

So my guess is that tophat filters out rRNA reads automatically. Do you have any experience with this? Thank you very much.

modified 6.5 years ago • written 6.5 years ago by shl198410

Perhaps, but it's more likely that the reads map so many times that they're discarded. There are enough copies of rRNA in the genome that this could be the case. I should add that I don't use tophat anymore; it's just too painfully slow. Give STAR a try if you have enough RAM.

written 6.5 years ago by Devon Ryan98k
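
To make the multi-mapping explanation concrete, here is a toy Python sketch that counts how many times a read occurs exactly in a reference, on either strand, and compares that count with a cutoff. This is not how bowtie2 or tophat work internally (they use indexed, mismatch-tolerant search); the reference string, the function names and the `max_hits` cutoff are all hypothetical and exist only for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def count_occurrences(pattern, text):
    """Count occurrences of pattern in text, including overlapping ones."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def count_exact_hits(read, reference):
    """Count exact matches of the read on the forward and reverse strands."""
    hits = count_occurrences(read, reference)
    rc = reverse_complement(read)
    if rc != read:  # avoid counting palindromic reads twice
        hits += count_occurrences(rc, reference)
    return hits


if __name__ == "__main__":
    # Toy reference: two forward copies of GATTACA plus one reverse-strand copy.
    toy_reference = "GATTACACCCGATTACATTTTGTAATC"
    max_hits = 2  # hypothetical cutoff, not a real tophat or bowtie2 default
    hits = count_exact_hits("GATTACA", toy_reference)
    status = "kept" if hits <= max_hits else "discarded as multi-mapping"
    print(hits, status)
```

In a real run, the equivalent check would be to look at how many alignments the aligner reports for such a read, and at the multi-hit limit in the tophat manual for the version in use.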

Thank you very much. I will try STAR. 

written 6.5 years ago by shl198410