Consistency problem of strand-specific information between TopHat/RSeQC/picard
10.4 years ago
pengchy ▴ 460

The strand-specific RNA-seq library was constructed with Illumina's TruSeq stranded sample prep kit, which is based on the dUTP method. According to Illumina's documentation (http://www.illumina.com/documents/products/technotes/RNASeqAnalysisTopHat.pdf), the library type should be "fr-firststrand". I mapped the data using TopHat with library type "fr-firststrand", and then checked the output bam file using picard "CollectRnaSeqMetrics.jar" and "RSeQC-2.6.1/scripts/infer_experiment.py", which gave the results listed below:

First, from picard: the first column was produced with STRAND=SECOND_READ_TRANSCRIPTION_STRAND, and the second column with STRAND=FIRST_READ_TRANSCRIPTION_STRAND. It surprised me that SECOND_READ_TRANSCRIPTION_STRAND gives more CORRECT_STRAND_READS, contrary to my expectation.

                                SECOND_READ_TRANSCRIPTION_STRAND    FIRST_READ_TRANSCRIPTION_STRAND
CORRECT_STRAND_READS            14040054                            138566
INCORRECT_STRAND_READS          138566                              14040054

Second from RSeQC:

This is PairEnd Data
Fraction of reads failed to determine: 0.0000
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0446
Fraction of reads explained by "1+-,1-+,2++,2--": 0.9554
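
For anyone reading this output later, here is a minimal Python sketch of how the two "explained by" fractions could be turned into a statement about read orientation. It assumes RSeQC's notation, where a pattern such as "1+-" means read 1 mapped to the + strand while the gene is on the - strand. The helper names and the 0.8 cut-off are hypothetical choices made for illustration only; they are not part of RSeQC.

```python
import re

# Text as printed by infer_experiment.py in this post.
RSEQC_OUTPUT = """\
This is PairEnd Data
Fraction of reads failed to determine: 0.0000
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0446
Fraction of reads explained by "1+-,1-+,2++,2--": 0.9554
"""


def parse_fractions(text):
    """Return {pattern: fraction} for each 'explained by' line."""
    pairs = re.findall(r'explained by "([^"]+)":\s*([0-9.]+)', text)
    return {pattern: float(value) for pattern, value in pairs}


def guess_read1_orientation(fractions, threshold=0.8):
    """Hypothetical helper; the 0.8 threshold is an arbitrary assumption."""
    same = fractions.get("1++,1--,2+-,2-+", 0.0)
    opposite = fractions.get("1+-,1-+,2++,2--", 0.0)
    if opposite >= threshold:
        return "read 1 aligns opposite to the gene strand"
    if same >= threshold:
        return "read 1 aligns to the gene strand"
    return "unstranded or ambiguous"


fractions = parse_fractions(RSEQC_OUTPUT)
print(guess_read1_orientation(fractions))
```

With the numbers above, the dominant pattern is the one where read 1 aligns opposite to the gene strand and read 2 aligns to it, which points the same way as the picard column with more CORRECT_STRAND_READS.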

It seems that picard uses "first" and "second" with a different meaning than TopHat and Cufflinks, doesn't it?

RNA-Seq strand-specific • 4.6k views
10.4 years ago

Yes, the strand being referred to is different in TopHat/Cufflinks than in most other tools. For TopHat/Cufflinks, it's the strand from cDNA construction that's being sequenced (i.e., either the first strand that's synthesized (fr-firststrand) or its reverse complement (fr-secondstrand)).

Almost everything else is talking about the DNA strand to which a read/pair aligns. I personally tend to think in terms of this rather than in the steps of library construction, but to each their own.
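
To make the difference in vocabulary concrete, here is a small Python sketch of a lookup between the three naming schemes that appear in this thread. The first row is taken directly from the numbers in the question. The second row is the mirror case, filled in by symmetry rather than from data shown here. `CONVENTIONS` and `translate` are hypothetical names used only for this illustration.

```python
# Each row describes one stranded paired-end protocol in three vocabularies.
CONVENTIONS = [
    {
        # Matches the dUTP data in the question above.
        "tophat_library_type": "fr-firststrand",
        "picard_strand": "SECOND_READ_TRANSCRIPTION_STRAND",
        "rseqc_pattern": "1+-,1-+,2++,2--",
    },
    {
        # Mirror case, assumed by symmetry (not shown in this thread).
        "tophat_library_type": "fr-secondstrand",
        "picard_strand": "FIRST_READ_TRANSCRIPTION_STRAND",
        "rseqc_pattern": "1++,1--,2+-,2-+",
    },
]


def translate(value):
    """Return the row whose TopHat, picard or RSeQC name equals `value`."""
    for row in CONVENTIONS:
        if value in row.values():
            return row
    raise KeyError(value)


row = translate("fr-firststrand")
print(row["picard_strand"], "|", row["rseqc_pattern"])
```

The point of the table is that "first" in fr-firststrand names the cDNA strand that was sequenced, while "SECOND_READ" in picard names which mate aligns to the transcription strand, so the two labels describe the same library without contradicting each other.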


Hi Devon Ryan, according to your explanation, picard's strand information refers to the DNA strand the reads align to, so the two conventions are opposite to each other. That makes sense. Thank you.
