Question

Fr-unstranded vs Fr-first strand

7.8 years ago

clizama • 0

Just out of curiosity:

I know some library prep kits are strand-specific and others are not.

Take a strand-specific library, for example fr-firststrand. What happens if I analyze the same data with TopHat twice, once as fr-unstranded and once as fr-firststrand? I assume I would get different counts from HTSeq. In terms of differential expression, for example comparing control vs. treatment, would the number of significant genes change randomly in the fr-unstranded case?

Thanks

RNA-Seq • 4.0k views

updated 7.8 years ago by Kevin Blighe 89k • written 7.8 years ago by clizama • 0

score 0 · Answer 1 · 2017-09-13

In my experience, it makes little difference (if any) provided you are only determining raw counts over known transcripts. With HTSeq, which counts abundances from an aligned BAM over a GTF/GFF reference, you may observe differences at genes where transcription occurs on both strands, such as sense/antisense pairs (e.g. XIST and TSIX). For other programs, like kallisto, which quantifies abundances against a FASTA reference transcriptome, I believe that no or absolutely minimal differences in counts will be observed (and I have tested this).
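To make the sense/antisense point concrete, here is a minimal Python sketch of a toy counter. It is not HTSeq's code: the gene names, coordinates, and reads are made up, and the rule that a read hitting more than one gene is set aside as ambiguous is only loosely modeled on how htseq-count treats such reads in its default mode.

```python
from collections import Counter

# Hypothetical toy annotation: two overlapping genes on opposite strands.
GENES = [
    ("geneA", 100, 500, "+"),
    ("geneA_as", 300, 700, "-"),
]

def count_reads(reads, stranded):
    """Count reads per gene.

    reads: iterable of (position, transcript_strand) tuples, where
           transcript_strand is the strand the fragment is inferred to come from.
    stranded: if False, strand is ignored when matching reads to genes.
    """
    counts = Counter()
    for pos, strand in reads:
        hits = [
            name
            for name, start, end, gene_strand in GENES
            if start <= pos <= end and (not stranded or gene_strand == strand)
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif len(hits) > 1:
            counts["__ambiguous"] += 1   # overlaps more than one gene
        else:
            counts["__no_feature"] += 1  # overlaps no gene
    return counts

# Four made-up reads; two of them fall in the 300-500 overlap.
reads = [(150, "+"), (350, "+"), (400, "-"), (650, "-")]

print(count_reads(reads, stranded=True))
print(count_reads(reads, stranded=False))
```

In this toy case the stranded run assigns two reads to each gene, while the unstranded run can only assign one read to each and sets the two reads in the overlap aside. Genes that do not overlap anything on the opposite strand are unaffected. A counter that instead credits such reads to every overlapping gene would inflate counts rather than drop them.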

A good aligner will align all reads regardless of the library type and record strand information in the BAM file. Programs like Cufflinks can then use this information to determine strand-specific abundances, and HTSeq can check the strand of each alignment in the BAM file against that of the features in the reference GTF/GFF.
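As a rough illustration of what "strand information in the BAM file" means, the sketch below decodes the SAM FLAG field into the strand the fragment most likely came from, given a TopHat-style library type. The function name `transcript_strand` is hypothetical; the flag bits (0x10 for reverse-strand alignment, 0x80 for second mate) come from the SAM specification, and the mapping assumes the usual convention that mate 1 is antisense to the transcript in fr-firststrand libraries and sense in fr-secondstrand libraries.

```python
REVERSE = 0x10         # SAM FLAG bit: read is aligned to the reverse strand
SECOND_IN_PAIR = 0x80  # SAM FLAG bit: read is the second mate of a pair

def transcript_strand(flag, library_type):
    """Return '+', '-', or None for the strand the fragment originated from.

    flag: integer SAM FLAG of one alignment.
    library_type: 'fr-unstranded', 'fr-firststrand', or 'fr-secondstrand'.
    """
    if library_type == "fr-unstranded":
        return None  # the alignment strand says nothing about the transcript
    if library_type not in ("fr-firststrand", "fr-secondstrand"):
        raise ValueError(f"unknown library type: {library_type}")

    read_strand = "-" if flag & REVERSE else "+"
    is_mate2 = bool(flag & SECOND_IN_PAIR)

    # fr-secondstrand: mate 1 (or a single-end read) is sense, mate 2 antisense.
    # fr-firststrand: the reverse of that.
    flip = (library_type == "fr-firststrand") != is_mate2
    if flip:
        return "+" if read_strand == "-" else "-"
    return read_strand

# A properly paired fragment: mate 1 forward (flag 99), mate 2 reverse (flag 147).
for flag in (99, 147):
    print(flag, transcript_strand(flag, "fr-firststrand"))
```

Both mates of the same fragment resolve to the same transcript strand, which is why a strand-aware counter can treat the pair as one unit.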

I did this recently for a bacterial RNA-seq project where transcription occurred across virtually the entire circular genome on both strands. Had I selected unstranded in this case, I would have observed roughly double the count values and half the identified transcripts (all of them on a single strand). The bacterial genome is obviously different from a mammalian one, though.

I would encourage you to read this great thread: How To Determine If Paired–End Illumina Rnaseq Reads Are Strand–Specific
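The general idea behind checking strand-specificity can be sketched in a few lines: compare the strand of mate 1 of each read with the strand of the annotated gene it overlaps, and look at the fraction that agree. The function below is a hypothetical illustration, not the method from that thread or from any particular tool, and the 0.8 cutoff is an arbitrary assumption.

```python
def guess_library_type(pairs, threshold=0.8):
    """Guess a TopHat-style library type from strand agreement.

    pairs: list of (mate1_strand, gene_strand) tuples, each '+' or '-',
           for reads that overlap exactly one annotated gene.
    threshold: fraction of agreement (or disagreement) needed to call
               the library stranded; an arbitrary illustrative cutoff.
    """
    if not pairs:
        raise ValueError("need at least one (read strand, gene strand) pair")
    same = sum(1 for read_strand, gene_strand in pairs if read_strand == gene_strand)
    frac_same = same / len(pairs)
    if frac_same >= threshold:
        return "fr-secondstrand"  # mate 1 mostly sense to the gene
    if frac_same <= 1 - threshold:
        return "fr-firststrand"   # mate 1 mostly antisense to the gene
    return "fr-unstranded"        # roughly even split

# Made-up example: 95 of 100 reads are antisense to their gene.
example = [("+", "-")] * 95 + [("+", "+")] * 5
print(guess_library_type(example))
```

A split close to 50/50 suggests an unstranded library; a strong skew in either direction suggests one of the two stranded types.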

The issue of strandedness in RNA-seq analysis is one that causes a lot of headaches, so, I'm sure that others have more to add.