The strand-specific RNA-seq library was constructed using Illumina's TruSeq Stranded sample prep kit, which is based on the dUTP method. According to Illumina's documentation (http://www.illumina.com/documents/products/technotes/RNASeqAnalysisTopHat.pdf), the library type should be "fr-firststrand". I mapped the data using TopHat with library type "fr-firststrand", and then checked the output BAM file using Picard "CollectRnaSeqMetrics.jar" and "RSeQC-2.6.1/scripts/infer_experiment.py", which gave the results listed below:
First, from Picard. The first column is from STRAND=SECOND_READ_TRANSCRIPTION_STRAND, and the second column is from STRAND=FIRST_READ_TRANSCRIPTION_STRAND. It surprised me that SECOND_READ_TRANSCRIPTION_STRAND gives more CORRECT_STRAND_READS, contrary to my expectation.
                        SECOND_READ_TRANSCRIPTION_STRAND  FIRST_READ_TRANSCRIPTION_STRAND
CORRECT_STRAND_READS    14040054                          138566
INCORRECT_STRAND_READS  138566                            14040054
Second, from RSeQC:
This is PairEnd Data
Fraction of reads failed to determine: 0.0000
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0446
Fraction of reads explained by "1+-,1-+,2++,2--": 0.9554
It seems that Picard uses "first" and "second" with a different meaning than TopHat and Cufflinks do. Is that right?
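To put the two reports on the same scale, the Picard counts can be turned into fractions and set next to the RSeQC fractions. The snippet below is only a rough sketch using the numbers quoted above; the helper name `strand_fraction` and the dictionary layout are made up for illustration and are not part of Picard or RSeQC. As far as I understand the RSeQC notation, "2++,2--" means read 2 maps to the same strand as the gene, so the comments treat that pattern as the counterpart of Picard's SECOND_READ_TRANSCRIPTION_STRAND; that reading is an assumption, not something either tool reports.

```python
def strand_fraction(correct, incorrect):
    """Fraction of strand-assigned reads counted as CORRECT_STRAND_READS."""
    total = correct + incorrect
    if total == 0:
        raise ValueError("no strand-assigned reads")
    return correct / total


# (CORRECT_STRAND_READS, INCORRECT_STRAND_READS) from the two Picard runs above.
picard_counts = {
    "SECOND_READ_TRANSCRIPTION_STRAND": (14040054, 138566),
    "FIRST_READ_TRANSCRIPTION_STRAND": (138566, 14040054),
}
picard_fractions = {
    setting: strand_fraction(correct, incorrect)
    for setting, (correct, incorrect) in picard_counts.items()
}

# Fractions copied from the infer_experiment.py output above.
rseqc_fractions = {
    "1++,1--,2+-,2-+": 0.0446,
    "1+-,1-+,2++,2--": 0.9554,  # assumed: read 2 on the same strand as the gene
}

# Pick the dominant setting / pattern in each report.
best_picard = max(picard_fractions, key=picard_fractions.get)
best_rseqc = max(rseqc_fractions, key=rseqc_fractions.get)

for setting, fraction in picard_fractions.items():
    print(f"Picard {setting}: {fraction:.4f}")
print(f"Dominant Picard setting: {best_picard}")
print(f"Dominant RSeQC pattern:  {best_rseqc}")
```

The two tools do not have to give identical fractions, since they may count and filter reads differently; the point of the comparison is only to see whether both favor the same orientation.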