Orientation of reads in .fastq files from RNA-Seq
2.9 years ago
newbio17 ▴ 340

I have R1.fastq and R2.fastq from an RNA-Seq experiment. I'm currently trying to perform differential expression analysis using Salmon and DESeq2.

Since the library was prepared with a stranded library kit, I assumed that R1.fastq contains forward-strand reads and R2.fastq contains reverse-strand reads.

So, I used the following command to run Salmon:

salmon quant -i <index> -l ISF -1 <R1.fastq> -2 <R2.fastq> -o <output> --gcBias


This resulted in a poor mapping rate (~3%), which confused me at first, since I saw no issues with the sequencing data (e.g. no rRNA contamination).

Instead, when I tried the command below, I reached a ~90% mapping rate:

salmon quant -i <index> -l ISR -1 <R1.fastq> -2 <R2.fastq> -o <output> --gcBias


According to the documentation, the F/R character of the -l option tells Salmon which strand each read of a pair comes from:

F: read1 comes from the forward strand and read2 comes from the reverse strand

R: read1 comes from the reverse strand and read2 comes from the forward strand
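
To make the structure of the -l string concrete, here is a minimal Python sketch of how a paired-end library type such as ISF or ISR is composed: a relative-orientation character (I, O or M), then S or U for stranded or unstranded, then F or R for the strand of read 1. The function name `salmon_libtype` and its arguments are hypothetical and only for illustration; they are not part of Salmon.

```python
def salmon_libtype(orientation, stranded, read1_strand=None):
    """Compose a paired-end Salmon library type string, e.g. 'ISR'.

    Hypothetical helper for illustration only; not part of Salmon.
    orientation:  'inward', 'outward' or 'matching'
    stranded:     True for a stranded protocol, False otherwise
    read1_strand: 'forward' or 'reverse' (required when stranded)
    """
    orientation_codes = {"inward": "I", "outward": "O", "matching": "M"}
    strand_codes = {"forward": "F", "reverse": "R"}

    if orientation not in orientation_codes:
        raise ValueError(f"unknown orientation: {orientation!r}")
    code = orientation_codes[orientation]

    # Unstranded libraries carry no F/R character.
    if not stranded:
        return code + "U"

    if read1_strand not in strand_codes:
        raise ValueError("a stranded library needs read1_strand")
    return code + "S" + strand_codes[read1_strand]


# Inward-facing pairs, stranded, read 1 from the reverse strand.
print(salmon_libtype("inward", True, "reverse"))  # ISR
```

With this encoding, the two commands above differ only in the last character, that is, in which strand read 1 is assumed to come from.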

So does this mean that in my R1.fastq, there are reverse strand reads?

RNA-Seq Salmon • 2.0k views

Thank you, genomax. However, the table is a little confusing to me.

From what I understand, if the Illumina TruSeq Stranded Total RNA library kit was used, then the first read comes from the reverse strand? If this is correct, is that why I have to use ISR instead of ISF?

2.9 years ago

Since the RNA-Seq was done using stranded library kit, it was under my assumption that R1.fastq contains forward reads and R2.fastq contains reverse reads.

That assumption was wrong, as you can see from your own data. In TruSeq stranded kits, read 1 is in the reverse orientation, i.e. antisense to the original transcript.
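
As a toy illustration of what "reverse orientation" means here, the following Python sketch builds a made-up transcript, derives a read pair the way a read-1-reverse stranded protocol would, and checks which strand each read matches. The sequences, the helper names `reverse_complement` and `read_strand`, and the exact-substring matching are all assumptions made for the example; a real aligner or Salmon's selective alignment is far more involved.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def read_strand(read, transcript):
    """Classify a read as 'forward', 'reverse' or 'unaligned'.

    Toy stand-in for alignment: exact substring matching only.
    """
    if read in transcript:
        return "forward"
    if reverse_complement(read) in transcript:
        return "reverse"
    return "unaligned"


# Hypothetical transcript (sense strand) and a fragment taken from it.
transcript = "ATGGCCTTAGGACCTGAAGTCTGA"
fragment = transcript[3:19]

# Read-1-reverse stranded protocol: read 1 is sequenced from the
# antisense strand of the fragment, read 2 from the sense strand.
read1 = reverse_complement(fragment)[:8]
read2 = fragment[:8]

print(read_strand(read1, transcript), read_strand(read2, transcript))
# reverse forward
```

Read 1 only matches the transcript after reverse-complementing, which is the situation the ISR library type describes.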