I have RNA-seq samples (paired-end FASTQ) and I know the name of the sequencing kit, but I have no information about strandedness. I am not sure whether a strand-specific or a non-stranded protocol was used to capture the RNA.
I performed two different analyses:
1) Assuming "reverse" strandedness for HISAT2 and htseq-count (output: around 850 differentially expressed genes)
2) Assuming non-stranded for HISAT2 and htseq-count (output: around 1100 differentially expressed genes)
- 94% of the genes from the stranded approach also appear in the results of the non-stranded approach.
- 6% of the genes from the stranded approach do not appear in the results of the non-stranded approach.
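For clarity, the percentages are relative to the stranded gene list. Here is a minimal sketch of that kind of comparison; the function name `overlap_summary` and the gene IDs are made up for illustration and are not my real data.

```python
def overlap_summary(stranded_genes, unstranded_genes):
    """Compare two lists of differentially expressed gene IDs.

    Percentages are computed relative to the stranded list.
    """
    stranded = set(stranded_genes)
    unstranded = set(unstranded_genes)

    shared = stranded & unstranded          # found by both analyses
    only_stranded = stranded - unstranded   # found only by the stranded run

    pct_shared = 100.0 * len(shared) / len(stranded) if stranded else 0.0
    return {
        "shared": len(shared),
        "only_stranded": len(only_stranded),
        "pct_shared": pct_shared,
        "pct_only_stranded": 100.0 - pct_shared if stranded else 0.0,
    }


# Toy example with hypothetical gene IDs
stranded_de = ["geneA", "geneB", "geneC", "geneD"]
unstranded_de = ["geneA", "geneB", "geneC", "geneE", "geneF"]
print(overlap_summary(stranded_de, unstranded_de))
```

Note that this only compares gene IDs; it says nothing about whether fold changes or adjusted p-values agree between the two runs.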
Later, I came across the RSeQC tool for inferring strandedness.
Its output:
- Fraction of reads failed to determine : 0.0580
- Fraction of reads explained by "1++,1--,2+-,2-+" : 0.4724
- Fraction of reads explained by "1+-,1-+,2++,2--" : 0.4695
I concluded that this library is non-stranded. Am I correct?
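The rule of thumb I applied: if the two "explained by" fractions are roughly equal, the library is non-stranded; if one clearly dominates, it is stranded in that direction. Below is a minimal sketch of that reasoning. The function name and the `dominance` and `balance` cutoffs are my own assumptions for illustration, not values defined by RSeQC.

```python
def classify_strandedness(frac_forward, frac_reverse, dominance=0.8, balance=0.1):
    """Guess the library type from the two RSeQC 'explained by' fractions.

    frac_forward: fraction explained by "1++,1--,2+-,2-+"
    frac_reverse: fraction explained by "1+-,1-+,2++,2--"
    dominance, balance: hypothetical cutoffs chosen for this sketch only.
    """
    total = frac_forward + frac_reverse
    if total == 0:
        return "undetermined"

    # Share of the determinable reads that support the forward orientation
    share_forward = frac_forward / total

    if share_forward >= dominance:
        return "stranded-forward"
    if share_forward <= 1 - dominance:
        return "stranded-reverse"
    if abs(share_forward - 0.5) <= balance:
        return "non-stranded"
    return "ambiguous"


# Values from the RSeQC output above
print(classify_strandedness(0.4724, 0.4695))
```

With my numbers the forward share is about 0.50, so this sketch lands on non-stranded as well.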
If I proceed with the output of the stranded approach anyway, would that be a big blunder?