Question: Tophat, low alignment rate for left reads
claire.morandin (Finland) wrote 3.5 years ago:

Hello,

I downloaded several RNA-seq datasets from GenBank and used Tophat (with Bowtie2) to map the reads back to the genome (to create a GTF file). The data are paired-end.

Three of the 5 samples gave a strange alignment result: about 90% of the right reads aligned, but only 0.3% of the left reads. I downloaded the data 3 times to make sure the error was not on my end.

So something seems to be wrong with the left reads. Using bowtie directly, I can align each file as single-end data with 90% of reads mapped for both files, and 60% when aligning them as paired-end data. But Tophat in paired-end mode only works for the right reads.

Do you have any idea what could be causing the problem? I would like to contact the authors, but with a good explanation of what the issue could be.

Thanks for your help

Tags: rna-seq, alignment, tophat
michael.ante (Austria/Vienna) wrote 3.5 years ago:

First, I'd check the raw data with FastQC to look for low-quality tails and adapter contamination.

Second, I'd check the inner distance on a subset of reads (aligning with bowtie2 and running RSeQC's inner_distance.py), and then adjust Tophat's --mate-inner-dist and --mate-std-dev parameters. Tophat tries to learn the inner distance distribution, but providing these values increases accuracy.
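To make the inner-distance idea concrete, here is a minimal Python sketch of the arithmetic. It assumes both mates have the same length and that you already have a list of fragment lengths (for example from a test alignment); the numbers and the function name are made up for illustration. It is not what inner_distance.py does internally: for RNA-seq reads aligned to a genome, fragment lengths can span introns, so a gene-model-aware tool is the safer choice.

```python
import statistics

def inner_distance_stats(fragment_lengths, read_length):
    """Return (mean, standard deviation) of the inner distance.

    Inner distance = fragment length - 2 * read length, assuming both
    mates have the same length (an assumption of this sketch).
    """
    inner = [frag - 2 * read_length for frag in fragment_lengths]
    return statistics.mean(inner), statistics.stdev(inner)

# Hypothetical fragment lengths, not real data
fragments = [250, 300, 350]
mean_inner, sd_inner = inner_distance_stats(fragments, read_length=100)
print(round(mean_inner), round(sd_inner))
```

The two rounded numbers are the kind of values that would be passed to --mate-inner-dist and --mate-std-dev.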


Thanks a lot. I checked the reads with FastQC at the start: they are of high quality, with no trace of adapters.

I am going to follow your second suggestion and see!

(reply by claire.morandin, 3.5 years ago)
Antonio R. Franco (Spain, Universidad de Córdoba) wrote 3.5 years ago:

Take a look at the library type used in the analysis, i.e. Tophat's --library-type setting.
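As a rough illustration, the sketch below assembles a Tophat command line in Python with an explicit --library-type and, optionally, the mate distance options mentioned in this thread. The index and file names are placeholders, and the helper function is hypothetical. The three library types listed are the ones Tophat documents; which one is correct depends on how the library was prepared, which this sketch cannot tell you.

```python
VALID_LIBRARY_TYPES = ("fr-unstranded", "fr-firststrand", "fr-secondstrand")

def build_tophat_command(index, left, right, library_type="fr-unstranded",
                         out_dir="tophat_out", mate_inner_dist=None,
                         mate_std_dev=None):
    """Build the argument list for a paired-end Tophat run (hypothetical helper)."""
    if library_type not in VALID_LIBRARY_TYPES:
        raise ValueError(f"unknown library type: {library_type}")
    cmd = ["tophat", "--library-type", library_type, "--output-dir", out_dir]
    if mate_inner_dist is not None:
        cmd += ["--mate-inner-dist", str(mate_inner_dist)]
    if mate_std_dev is not None:
        cmd += ["--mate-std-dev", str(mate_std_dev)]
    # Positional arguments: Bowtie index prefix, then left and right read files
    cmd += [index, left, right]
    return cmd

# Placeholder names; replace with your own index and FASTQ files
cmd = build_tophat_command("genome_index", "reads_1.fastq", "reads_2.fastq",
                           library_type="fr-firststrand",
                           mate_inner_dist=100, mate_std_dev=50)
print(" ".join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True) once the names point at real files.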


Thanks, I am rerunning Tophat with the --library-type argument. Since Tophat works for some samples but not for these, I am afraid the issue might be more complex!

(reply by claire.morandin, 3.5 years ago)

Have you trimmed your sequences?

Are you using two separate files, one for the left reads and one for the right reads?

If so, I recall that some aligners expect both files to contain the same number of reads, in the same order.

If trimming removes a read from one file but not its mate from the other, the orphan read shifts every pair after it, so all subsequent reads are paired with the wrong mate. In other words, your files probably need to be synchronized.

I don't know for sure whether bowtie is one of these aligners.
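A quick way to test for this is to compare read names record by record. The following is a minimal Python sketch, assuming plain 4-line FASTQ records with no blank lines, and read names that match once an optional /1 or /2 suffix and anything after the first space are dropped. The function names and the toy records are made up for illustration.

```python
from itertools import zip_longest

def read_names(lines):
    """Yield the read name from each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 0:  # header line of a record
            fields = line.split()
            name = fields[0] if fields else ""
            # Drop old-style mate suffixes so both mates share one name
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name

def first_unsynchronized_record(left_lines, right_lines):
    """Return the 0-based index of the first record whose names differ
    (or where one file ends early), or None if the files are in sync."""
    pairs = zip_longest(read_names(left_lines), read_names(right_lines))
    for index, (left, right) in enumerate(pairs):
        if left != right:
            return index
    return None

# Toy records: the second pair does not match
left = ["@r1/1", "ACGT", "+", "IIII", "@r2/1", "ACGT", "+", "IIII"]
right = ["@r1/2", "TTGA", "+", "IIII", "@r3/2", "TTGA", "+", "IIII"]
print(first_unsynchronized_record(left, right))  # 1
```

With real data, open file handles can be passed in place of the lists, since the functions only iterate over lines.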

(reply by Antonio R. Franco, 3.5 years ago)

That's a good point, I am going to check that out! Thanks a lot.

(reply by claire.morandin, 3.5 years ago)