I downloaded several RNA-seq datasets from GenBank and used TopHat (with Bowtie2) to map the reads back to the genome (to create a GTF file). The data are paired-end.
Out of the 5 samples, three gave a strange alignment result: about 90% of the right reads aligned, BUT only 0.3% of the left reads did. I downloaded the data 3 times to make sure it was not an error on my end.
So something seems to be wrong with the left reads. Using Bowtie directly, I can align each file as single-end at 90% for both files, and at 60% as paired-end data. But TopHat in paired-end mode only aligns the right reads.
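One sanity check that might help narrow this down is to compare the two FASTQ files directly: read counts, mean read length, the range of quality characters (a rough hint at the Phred offset), and whether the read names in the left and right files line up record by record. Below is a minimal sketch, assuming uncompressed FASTQ with exactly four lines per record and mate suffixes of the form /1 and /2 (other naming schemes would need a different `base_id`). All function names here are hypothetical, not part of TopHat or Bowtie.

```python
import io
from itertools import zip_longest


def fastq_records(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def base_id(header):
    """Drop the leading '@', any description, and a trailing /1 or /2 (assumed naming)."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def summarize(handle):
    """Return read count, mean read length, and lowest/highest quality character."""
    reads, total_len = 0, 0
    min_q, max_q = None, None
    for _, seq, qual in fastq_records(handle):
        reads += 1
        total_len += len(seq)
        if qual:
            lo, hi = min(qual), max(qual)
            min_q = lo if min_q is None or lo < min_q else min_q
            max_q = hi if max_q is None or hi > max_q else max_q
    mean_len = total_len / reads if reads else 0.0
    return {"reads": reads, "mean_length": mean_len,
            "min_qual_char": min_q, "max_qual_char": max_q}


def count_mismatched_pairs(left, right):
    """Count record positions where left/right names differ or one file ends early."""
    mismatches = 0
    for l, r in zip_longest(fastq_records(left), fastq_records(right)):
        if l is None or r is None or base_id(l[0]) != base_id(r[0]):
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the left and right files.
    left = "@r1/1\nACGT\n+\nIIII\n@r2/1\nAC\n+\nII\n"
    right = "@r1/2\nTTTT\n+\n!!!!\n@r3/2\nGG\n+\n##\n"
    print(summarize(io.StringIO(left)))
    print(summarize(io.StringIO(right)))
    print(count_mismatched_pairs(io.StringIO(left), io.StringIO(right)))
```

With real data, the `io.StringIO` objects would be replaced by `open("left.fastq")` and `open("right.fastq")` (placeholder file names). A large mismatch count, or very different length or quality ranges between the two files, would be something concrete to report to the authors.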
I was wondering if you had any ideas about what could be causing the problem. I would like to contact the authors, but with a good explanation of what the issue could be.
Thanks for your help