Question: Analyzing 3' RNA seq library with DESeq2
roy.granit (Israel/LabWorm) wrote, 5 months ago:

I have a 3' RNA-seq library that I aligned with STAR and quantified with Salmon, and I am currently analyzing the data with DESeq2.

It appears that I am getting very high counts for some genes, and I assume this is because I used a 3' library, in which most reads are concentrated in a small region of the gene.

My question is: should I do something different in the analysis to account for this?

So far, the only thing I have done in this respect is tell Salmon to skip length correction.

Thanks!

Tags: rna-seq, deseq2

Probably not. If these genes are known to be highly expressed and the high counts are consistent across samples, you shouldn't be worried.

written 5 months ago by Asaf
kristoffer.vittingseerup (European Union) wrote, 5 months ago:

It is quite normal to see a huge span of expression values, especially in read counts, so you might want to switch to length-normalized values (just use the TPM from the Salmon quantification). You can also calculate what fraction of the total expression the top expressed genes (e.g. the top 5) account for.
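As a rough illustration of the "fraction held by the top genes" check, here is a minimal Python sketch. The gene names and counts are made up, and `top_fraction` is a hypothetical helper, not something Salmon or DESeq2 provides.

```python
def top_fraction(counts, n=5):
    """Return the share of total counts taken by the n highest-count genes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sorted(counts.values(), reverse=True)[:n]
    return sum(top) / total

# Hypothetical counts for one sample (made-up numbers for illustration only).
example_counts = {
    "geneA": 50000, "geneB": 30000, "geneC": 10000,
    "geneD": 5000, "geneE": 3000, "geneF": 1500, "geneG": 500,
}

frac = top_fraction(example_counts, n=5)
print(f"Top 5 genes hold {frac:.1%} of all counts")
```

Running this per sample and comparing the fractions is one way to see whether a few genes dominate the library in every sample or only in some.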


With 3' seq you normally don't need length normalisation. Simply put, you get one read per transcript copy. The reads may lie directly in the 3' UTR or span junctions further downstream. In this regard, it is more like tag counting than full-length RNA sequencing with abundance estimation.
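A toy calculation may make the "one read per transcript copy" point more concrete. The sketch below assumes an idealized model with made-up transcripts: in full-length sequencing the expected read share scales with copies times length, while in 3' sequencing it scales with copies only. Both function names are hypothetical.

```python
# Hypothetical transcripts: (copies, length in bases). Numbers are made up.
transcripts = {"short_gene": (100, 500), "long_gene": (100, 5000)}

def expected_share_full_length(tx):
    # Fragments come from the whole transcript, so reads scale with copies * length.
    weights = {g: copies * length for g, (copies, length) in tx.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

def expected_share_three_prime(tx):
    # Roughly one tag per transcript copy, so reads scale with copies only.
    total = sum(copies for copies, _ in tx.values())
    return {g: copies / total for g, (copies, _) in tx.items()}

full = expected_share_full_length(transcripts)
tag = expected_share_three_prime(transcripts)
for gene in transcripts:
    print(gene, f"full-length: {full[gene]:.3f}", f"3' tags: {tag[gene]:.3f}")
```

Under these assumptions the two equally abundant genes get equal shares in the 3' model, whereas the full-length model favours the longer gene, which is why length correction matters there and not here.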

If you have spike-in controls (like the ERCCs), I'd check these for the expected vs. observed expression.

You could also check BioGPS to see whether the highly expressed genes are known to be highly expressed in your tissue.
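For the spike-in check, one simple option (an assumption here, not a prescribed method) is to correlate expected and observed values on a log scale. The sketch below uses made-up numbers rather than real ERCC concentrations, and `log_pearson` is a hypothetical helper.

```python
import math

def log_pearson(expected, observed):
    """Pearson correlation between log2(expected) and log2(observed).

    Spike-ins with a zero expected or observed value are dropped before
    taking logs.
    """
    pairs = [(math.log2(e), math.log2(o))
             for e, o in zip(expected, observed) if e > 0 and o > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)

# Made-up spike-in data: expected input amounts and observed read counts.
expected_amount = [1, 4, 16, 64, 256]
observed_count = [10, 40, 160, 640, 2560]

r = log_pearson(expected_amount, observed_count)
print(f"log-log Pearson r = {r:.4f}")
```

A correlation close to 1 would suggest the counts track input abundance; a clear departure at the high end could point to saturation of the most abundant species.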

written 5 months ago by michael.ante

Thanks. I suppose DESeq2 will convert the counts to TPM while doing the statistics?

written 5 months ago by roy.granit

No. DESeq2 wants raw counts. It will not correct for transcript length, and it doesn't need to: each gene is compared with itself across samples, so its length cancels out.
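To show what happens to raw counts instead of a TPM conversion, here is a minimal Python sketch of median-of-ratios size factors, the idea behind DESeq2's default library-size normalization. It is a simplified re-implementation for illustration with made-up counts, not DESeq2's actual code, and `size_factors` is a hypothetical name.

```python
import math
from statistics import median

def size_factors(counts):
    """counts: dict of sample -> list of raw counts, same gene order everywhere."""
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Log geometric mean per gene; genes with a zero in any sample are skipped.
    log_geo_means = []
    for i in range(n_genes):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)
    factors = {}
    for s in samples:
        # Per-gene log ratio of this sample's count to the geometric mean.
        log_ratios = [math.log(counts[s][i]) - lg
                      for i, lg in enumerate(log_geo_means) if lg is not None]
        factors[s] = math.exp(median(log_ratios))
    return factors

# Made-up raw counts: sample2 was sequenced about twice as deeply as sample1.
raw = {"sample1": [100, 200, 50, 0], "sample2": [200, 400, 100, 10]}
sf = size_factors(raw)
normalized = {s: [c / sf[s] for c in raw[s]] for s in raw}
print(sf)
```

Note that the scaling is per sample only. Nothing here depends on gene length, which is consistent with feeding raw counts to the model.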

written 5 months ago by swbarnes2