Question: DGE analysis using stranded and unstranded RNA-seq libraries.
Sentinel156 (Melbourne, Australia) wrote, 3.0 years ago:

Hi all,

I am working with Illumina HiSeq 2000 100 bp single-end RNA-seq data. Some of my samples originate from unstranded libraries and some from stranded libraries. I'm trying to understand the best way to do read summarisation for these libraries using featureCounts for eventual DGE analysis. To date I have treated all datasets as unstranded for mapping (TopHat) and counting (featureCounts).

However, I am concerned that read counts for my unstranded libraries will be biased for genes that have antisense transcripts, since reads originating from the antisense transcript will be merged into the counts for the gene on the sense strand wherever the two features overlap. So what is the recommended course of action here? I'm not interested in antisense transcripts, so should I continue to treat everything as unstranded for the featureCounts run? I have seen some other threads here that suggest incorporating strandedness into the DGE calculation as a multi-factorial design, but I was hoping for a more thorough explanation of why this is the better workaround for this problem.
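
To make the concern concrete, here is a toy sketch in Python (not featureCounts itself). The gene coordinates and reads are invented for illustration, and each read is labelled with the strand of the transcript it came from, which is information an unstranded library does not actually retain.

```python
# Toy model: one annotated gene on the + strand, overlapped by an
# unannotated antisense transcript on the - strand.
# All coordinates and reads below are made up for illustration.

def count_gene(reads, gene_start, gene_end, gene_strand, stranded):
    """Count reads overlapping the gene.

    reads: iterable of (start, end, strand_of_origin) tuples.
    stranded: if True, only reads from the gene's own strand are counted;
              if False, strand is ignored (unstranded counting).
    """
    n = 0
    for start, end, strand in reads:
        overlaps = start <= gene_end and gene_start <= end
        if overlaps and (not stranded or strand == gene_strand):
            n += 1
    return n

GENE = (100, 500, "+")  # (start, end, strand), hypothetical

reads = [
    (100, 199, "+"),  # sense
    (150, 249, "+"),  # sense
    (300, 399, "+"),  # sense
    (320, 419, "-"),  # antisense
    (350, 449, "-"),  # antisense
]

unstranded_count = count_gene(reads, *GENE, stranded=False)
stranded_count = count_gene(reads, *GENE, stranded=True)
print("unstranded:", unstranded_count, "stranded:", stranded_count)
```

In this toy case the unstranded count includes the two antisense reads, while the stranded count does not, so the same gene gets a different count depending only on the library type.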

Thank you in advance.

Devon Ryan (Freiburg, Germany) wrote, 3.0 years ago:

Do you really think mapping stranded libraries as if they're unstranded and then doing the counting in an unstranded fashion gets rid of all possible bias? I expect not. That's why you'll see everyone suggesting that you align each sample as appropriate (stranded or not, depending on the sample), count as appropriate (stranded or not, depending on the sample), and then add a batch effect to the model (with an interaction term if you're really concerned; have a look at a PCA plot).
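
As a rough sketch of the "count each sample as appropriate" step, the Python snippet below builds one featureCounts command per sample and picks the `-s` value from the library type (0 = unstranded, 1 = stranded, 2 = reversely stranded). The sample sheet, file names and `genes.gtf` are hypothetical, and treating the stranded libraries as "reverse" is an assumption that has to be checked against the actual library prep protocol.

```python
import shlex

# featureCounts strand-specificity values for the -s option.
STRAND_FLAG = {"unstranded": 0, "forward": 1, "reverse": 2}

# Hypothetical sample sheet: file names and library types are placeholders.
SAMPLES = [
    {"bam": "ctrl_1.bam", "library": "unstranded"},
    {"bam": "ctrl_2.bam", "library": "reverse"},
    {"bam": "treat_1.bam", "library": "unstranded"},
    {"bam": "treat_2.bam", "library": "reverse"},
]

def featurecounts_cmd(bam, library, gtf="genes.gtf", threads=4):
    """Return the featureCounts argument list for one BAM file."""
    out = bam.rsplit(".", 1)[0] + ".counts.txt"
    return [
        "featureCounts",
        "-s", str(STRAND_FLAG[library]),  # per-sample strandedness
        "-T", str(threads),
        "-a", gtf,
        "-o", out,
        bam,
    ]

commands = [featurecounts_cmd(s["bam"], s["library"]) for s in SAMPLES]

# Only print the commands here; run them with subprocess.run(cmd, check=True)
# once the paths and strand settings have been verified.
for cmd in commands:
    print(shlex.join(cmd))
```

The same per-sample choice applies at the alignment step; TopHat exposes it through its `--library-type` option.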


+1 for your answer. By the way, I think the interaction term [batch:condition] is really needed here, since antisense transcripts usually have expression dynamics opposite to those of their sense counterparts. That is, if a gene is overexpressed in a condition, there is a good chance that its antisense will be underexpressed. So the batch effect is expected to vary across conditions, especially for the genes you are interested in, i.e., those that are differentially expressed across conditions.
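
A minimal sketch of what the additive and interaction designs look like as design matrices, using NumPy. The sample layouts are hypothetical, and the column coding (intercept, batch, condition, batch:condition) mirrors what a formula such as `~ batch + condition + batch:condition` would typically produce in a DGE package; it is an illustration, not the output of any particular tool.

```python
import numpy as np

def design_matrix(samples, interaction):
    """Build a design matrix from (condition, library_type) pairs.

    Columns: intercept, batch (1 = stranded), condition (1 = treated),
    and optionally the batch:condition interaction.
    """
    rows = []
    for condition, library in samples:
        cond = 1.0 if condition == "treated" else 0.0
        batch = 1.0 if library == "stranded" else 0.0
        row = [1.0, batch, cond]
        if interaction:
            row.append(batch * cond)
        rows.append(row)
    return np.array(rows)

# Hypothetical balanced layout: both conditions present in both library types.
balanced = [
    ("control", "unstranded"), ("control", "unstranded"),
    ("control", "stranded"), ("control", "stranded"),
    ("treated", "unstranded"), ("treated", "unstranded"),
    ("treated", "stranded"), ("treated", "stranded"),
]

# Hypothetical confounded layout: library type coincides with condition.
confounded = [
    ("control", "unstranded"), ("control", "unstranded"),
    ("treated", "stranded"), ("treated", "stranded"),
]

additive = design_matrix(balanced, interaction=False)
full = design_matrix(balanced, interaction=True)
bad = design_matrix(confounded, interaction=False)

additive_rank = np.linalg.matrix_rank(additive)
full_rank = np.linalg.matrix_rank(full)
bad_rank = np.linalg.matrix_rank(bad)

print("additive:", additive.shape, "rank", additive_rank)
print("with interaction:", full.shape, "rank", full_rank)
print("confounded:", bad.shape, "rank", bad_rank)
```

The rank check shows the practical constraint: the batch and interaction terms are only estimable when each condition has samples in both library types. If library type coincides with condition, the batch and condition columns are identical and the two effects cannot be separated.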

Reply written 3.0 years ago by Carlo Yague (modified 2.5 years ago)

Hi, I do not think these statements are true: "since antisense transcripts usually have opposite expression dynamics than their sense counterparts" and "in a condition, if a gene is overexpressed, there is a good chance that its antisense will be underexpressed". If you are talking about natural antisense transcripts (NATs) or non-coding antisense transcripts, anti-correlated expression is not a general phenomenon, because the expression concordance between sense and antisense is context dependent (tissue, cell type, etc.).

Examples:

The landscape of antisense gene expression in human cancers

A cautionary tale of sense-antisense gene pairs: independent regulation despite inverse correlation of expression

Genome-wide Identification and Characterization of Natural Antisense Transcripts

Genome-wide analysis of expression modes and DNA methylation status at sense–antisense transcript loci in mouse

Sense-Antisense lncRNA Pair Encoded by Locus 6p22.3 Determines Neuroblastoma Susceptibility

Conserved expression of natural antisense transcripts in mammals.

This is not an answer to the main question; rather, it is a reply to the statement made in this post.

Reply written 9 months ago by EagleEye (modified 9 months ago)

Well, the situation is perhaps more complex in higher eukaryotes, but I think that in simpler systems the anti-correlation between sense and antisense transcription is rather well established. There is, for instance, this recent paper:

Native elongating transcript sequencing reveals global anti-correlation between sense and antisense nascent transcription in fission yeast.

Reply written 9 months ago by Carlo Yague

Hi,

My point was that there is evidence for both positive and negative correlation in good publications. So there is no general rule that sense and antisense are globally anti-correlated or positively correlated. Many factors contribute to that (sometimes it is species dependent too).

Sorry, one more reference: Antisense Transcription in the Mammalian Transcriptome

Reply written 9 months ago by EagleEye