I have a question for which I have not yet found a clear answer. I have some RNA-seq data that I have aligned. Looking at the STAR alignment in IGV, I see things like this:
My question is: what is the correct direction of the transcript that produced these reads, given that the RNA-seq was done single-end?
I am a little confused because, if I look up the first 3 reads (at the top, in blue) in my RNA-seq FASTQ file, they are:
@SRR.139896188
TCTGCAGACACCCTTCTCCGGCCGGAGCTG
+
AAAAAFF7FAFFFFFFFFFFFFFFFFFFAF
@SRR.107479696
TCTGCAGACACCCTTCTCCGGCCGGAGCTG
+
AAAA<FFAFAFFFFFFFFFFAFFFFFFFFF
@SRR.40743323
TCTGCAGACACCCTTCTCCGGCCGGAGCTG
+
AAAAAFF)FAFFFFFFFFFFFFFFFFFFFF
But in my alignment file (SAM file) the same reads look like this:
SRR.139896188 16 chr1 136042 255 30M * 0 0 CAGCTCCGGCCGGAGAAGGGTGTCTGCAGA FAFFFFFFFFFFFFFFFFFFAF7FFAAAAA NH:i:1 HI:i:1 AS:i:27 nM:i:1
SRR.107479696 16 chr1 136042 255 30M * 0 0 CAGCTCCGGCCGGAGAAGGGTGTCTGCAGA FFFFFFFFFAFFFFFFFFFFAFAFF<AAAA NH:i:1 HI:i:1 AS:i:27 nM:i:1
SRR.40743323 16 chr1 136042 255 30M * 0 0 CAGCTCCGGCCGGAGAAGGGTGTCTGCAGA FFFFFFFFFFFFFFFFFFFFAF)FFAAAAA NH:i:1 HI:i:1 AS:i:27 nM:i:1
So the reverse complement of these reads is what aligns to the genome (FLAG 16). Does this mean that the transcript runs in the reverse direction relative to the genome?
For these blue reads, here is the library preparation protocol that was used:
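The comparison between the FASTQ and SAM records can be checked with a short script. This is only a minimal sketch: the helper names (`reverse_complement`, `is_reverse_strand`, `as_stored_in_sam`) are made up for illustration, and the read is the first blue read copied from the records above. It relies on two points from the SAM format: bit 0x10 of the FLAG marks a read aligned to the reverse strand, and for such reads SEQ is stored reverse-complemented and QUAL reversed.

```python
# Hypothetical helpers for illustration; not part of STAR, samtools, or IGV.

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.translate(table)[::-1]

def is_reverse_strand(flag):
    """True if SAM FLAG bit 0x10 (read aligned to the reverse strand) is set."""
    return bool(flag & 0x10)

def as_stored_in_sam(seq, qual, flag):
    """Orient a FASTQ read the way the SAM format stores it for this FLAG."""
    if is_reverse_strand(flag):
        return reverse_complement(seq), qual[::-1]
    return seq, qual

# First blue read, copied from the FASTQ and SAM records in the question.
fastq_seq = "TCTGCAGACACCCTTCTCCGGCCGGAGCTG"
fastq_qual = "AAAAAFF7FAFFFFFFFFFFFFFFFFFFAF"
sam_flag = 16
sam_seq = "CAGCTCCGGCCGGAGAAGGGTGTCTGCAGA"
sam_qual = "FAFFFFFFFFFFFFFFFFFFAF7FFAAAAA"

expected_seq, expected_qual = as_stored_in_sam(fastq_seq, fastq_qual, sam_flag)
print("aligned to reverse strand:", is_reverse_strand(sam_flag))
print("SEQ consistent with FASTQ:", expected_seq == sam_seq)
print("QUAL consistent with FASTQ:", expected_qual == sam_qual)
```

This only tells which genomic strand the read aligns to. Whether the read itself is in the same orientation as the original RNA depends on the library protocol, so the FLAG alone does not give the transcript direction.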
Fragment polyadenylation, linker ligation to do after reverse transcription, then circularization to deplete ribosomal RNA, then PCR amplification and sequencing (single-end layout) on a NextSeq 500.
If I look at the image again, the red reads are aligned in the same orientation as they appear in the FASTQ file (the RNA reads). So could I say that, for these reads, the transcript direction is the forward direction of the genome as shown in the image?
For these red reads, here is the library preparation protocol that was used:
cDNA synthesis and library generation were performed according to the TruSeq Stranded Total RNA Sample Preparation protocol (Illumina). The libraries were sequenced on a NextSeq 500 instrument (Illumina) to yield 75 bp single-end reads.
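One way to frame the problem is that the transcript strand depends on two things: the strand the read aligns to (the SAM FLAG) and whether the protocol produces reads in the same orientation as the original RNA or the opposite one. The sketch below encodes just that logic. The function `transcript_strand` and its parameter `read_matches_transcript` are hypothetical names, and the orientation passed for each library is an assumption that has to be checked against the protocol. As far as I understand, TruSeq Stranded kits use the dUTP method, so a single-end read is antisense to the RNA. For the first (blue) library the sense orientation is only assumed here, and FLAG 0 for the red reads is inferred from their forward alignment in the image.

```python
# Hypothetical helper for illustration; the protocol orientation is an input
# assumption, not something this code can determine.

def transcript_strand(sam_flag, read_matches_transcript):
    """Infer the transcript strand ('+' or '-') for a single-end read.

    sam_flag: SAM FLAG of the alignment (bit 0x10 = reverse strand).
    read_matches_transcript: True if the protocol yields reads in the same
        orientation as the original RNA, False if reads are antisense to it.
    """
    aligned_strand = "-" if sam_flag & 0x10 else "+"
    if read_matches_transcript:
        return aligned_strand
    return "+" if aligned_strand == "-" else "-"

# Blue reads: FLAG 16; sense orientation is ASSUMED, not confirmed.
print("blue:", transcript_strand(16, read_matches_transcript=True))

# Red reads: FLAG 0 is ASSUMED from the forward alignment; TruSeq Stranded
# (dUTP) reads are taken to be antisense to the RNA.
print("red:", transcript_strand(0, read_matches_transcript=False))
```

Under these assumptions both calls give the same strand, so the two libraries would not contradict each other even though their reads align in opposite orientations.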
I hope I have been clear enough to get a clear answer. :)
Thanks in advance.