Must paired-end reads be in the same order in the two files for input to TopHat?
7.0 years ago
gaelgarcia ▴ 230

Hi!

Will TopHat work if R1.fastq and R2.fastq have their reads in different order?

What if R1.fastq has some reads whose R2 mate did not make it past QC, and vice versa (R2.fastq has some reads whose R1 mate did not make it past QC)?

tophat rna-seq alignment sequencing • 3.5k views

Cross-posted.

Please don't post the same question on multiple forums, as it wastes people's time.

Noted. Thank you! I was not aware of this rule.
7.0 years ago
Rob 5.1k

Paired-end reads should always be in the same order in both files when passed to an aligner. Typically, if the aligner sees a read whose mate does not appear at the same position in the other file, it will, at best, ignore that read's mate and treat the read as single-end. However, many aligners will actually complain (and possibly quit) when they encounter such a situation.

Most quality-control pipelines can write reads whose mate failed QC to a separate file. The most common way to deal with such reads is to give the aligner the QC-ed read pairs (in exactly the same order in both input files) and then provide a separate file of unpaired reads (e.g. reads whose mate failed QC). Many aligners (including TopHat, I believe) allow you to specify a set of files for paired-end reads as well as a separate file for unpaired reads in the same run.
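
If your QC tool did not already split the reads this way, the separation described above can be sketched in a few lines of Python. This is only an illustration, not part of TopHat or of any QC pipeline: the function names `read_fastq` and `resync` are made up for this example, it assumes plain 4-line FASTQ records whose mates share a read name (optionally suffixed with `/1` and `/2`), and it holds all of R2 in memory, so it is not suited to very large files as written.

```python
import io


def read_fastq(handle):
    """Yield (read_name, record_text) for each 4-line FASTQ record."""
    while True:
        header = handle.readline()
        if not header.strip():
            break  # end of file (or a trailing blank line)
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        # Assumption: the read name is the first whitespace-delimited token
        # of the header, and mates differ only by a trailing /1 or /2.
        name = header[1:].split()[0]
        if name.endswith("/1") or name.endswith("/2"):
            name = name[:-2]
        yield name, header + seq + plus + qual


def resync(r1_handle, r2_handle):
    """Return (paired_r1, paired_r2, orphans) as lists of FASTQ records.

    paired_r1 and paired_r2 are in the same order (the order of R1);
    orphans holds every read whose mate is missing from the other file.
    """
    r2_records = dict(read_fastq(r2_handle))  # whole R2 file kept in memory
    paired_r1, paired_r2, orphans = [], [], []
    for name, record in read_fastq(r1_handle):
        mate = r2_records.pop(name, None)
        if mate is None:
            orphans.append(record)  # R1 read with no R2 mate
        else:
            paired_r1.append(record)
            paired_r2.append(mate)
    orphans.extend(r2_records.values())  # R2 reads with no R1 mate
    return paired_r1, paired_r2, orphans


# Tiny in-memory demonstration; real use would pass open file handles.
r1 = io.StringIO("@a/1\nAC\n+\nII\n@b/1\nGG\n+\nII\n@c/1\nTT\n+\nII\n")
r2 = io.StringIO("@c/2\nAA\n+\nII\n@a/2\nCC\n+\nII\n@d/2\nGT\n+\nII\n")
paired_r1, paired_r2, orphans = resync(r1, r2)
print(len(paired_r1), len(paired_r2), len(orphans))
```

The three lists would then be written out as the two synchronized paired files plus one file of unpaired reads, which is the layout described above.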
