Question: Normalization of RNAseq data and the use of de novo transcriptome
2.1 years ago, BioBing (Denmark) wrote:

Hi all,

I'm hoping some of the RNA-seq experts here have advice on normalization and on how to proceed with the following data analysis:

The study looks at how a stressor affects a non-model species (no reference genome/transcriptome available) in terms of differential gene expression. We did deep sequencing to build a reference transcriptome and sequenced the samples at a "lower depth":

  1. Reference transcriptome de novo assembled (Trinity) from reads with a sequencing depth of 300 M (PE 2x150 nt). The statistics (TrinityStats), E50N90, BUSCO analysis, Blast2Go, and Detonate (comparison of 3 assemblies; we chose the best one) look good. The reference was made from a non-stressed individual of the non-model organism.

  2. Triplicate samples of "negative stress control", "positive stress control" and the "treatment" with a sequencing depth of 25M (PE 2x75nt)

What is the best way to use the reference transcriptome to determine differential gene expression in the samples? Any tips/tricks on how to normalize?

Thank you!

Tags: transcriptome, denovo, rna-seq, dge

I'm no expert, but you could map the reads from your treatments to your reference transcriptome and obtain counts (with kallisto, or, I think, by taking the uniquely mapping reads as counts). However, you will miss any isoforms that are specific to a treatment, because the reference was assembled from a non-stressed individual. You can normalize using edgeR's TMM method (an explanation here), and I am pretty sure the route from there to differential expression is fairly standard (maybe look at edgeR's vignettes?).
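To give a feel for what TMM does, here is a minimal Python sketch of a TMM-like scaling factor. It is a simplification and not edgeR's implementation: edgeR's `calcNormFactors` (in R) also trims on average abundance, weights each transcript by its precision, and picks the reference sample automatically. The function name `tmm_like_factors`, the fixed reference sample, and the 30% trim are assumptions made for illustration.

```python
import math

def tmm_like_factors(counts, ref=0, trim_m=0.3):
    """Simplified TMM-style factors (illustrative, not edgeR's exact method).

    counts: list of samples, each a list of per-transcript counts
            in the same transcript order.
    ref:    index of the sample used as reference (assumed, not auto-chosen).
    """
    lib_sizes = [sum(sample) for sample in counts]
    factors = []
    for k, sample in enumerate(counts):
        # M-values: log2 ratio of library-size-scaled counts vs the reference,
        # skipping transcripts with a zero count in either sample
        m_values = sorted(
            math.log2((y / lib_sizes[k]) / (r / lib_sizes[ref]))
            for y, r in zip(sample, counts[ref])
            if y > 0 and r > 0
        )
        # Trim the most extreme log ratios at both ends, then average the rest
        cut = int(len(m_values) * trim_m)
        kept = m_values[cut:len(m_values) - cut] or m_values
        factors.append(2 ** (sum(kept) / len(kept)) if kept else 1.0)
    # Rescale so the factors multiply to 1
    geo_mean = math.exp(sum(math.log(f) for f in factors) / len(factors))
    return [f / geo_mean for f in factors]
```

A sample in which one transcript soaks up most of the reads gets a factor below 1, which shrinks its effective library size (library size times factor) so that the remaining transcripts are not called "down" just because of that one transcript.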

PS- Is it E50N90 or E90N50?

written 2.1 years ago by biofalconch

Oops, I meant E90N50 :-)

Thank you! I have considered kallisto as well

written 2.1 years ago by BioBing

The Trinity wiki provides a lot of guidance for exactly what you want to do: https://github.com/trinityrnaseq/trinityrnaseq/wiki/Post-Transcriptome-Assembly-Downstream-Analyses

Since you will be aligning to your transcriptome, you will want to rescue multi-mapped reads. The Trinity developers recommend kallisto, Salmon or RSEM.
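As a rough illustration of what "rescuing" multi-mapped reads means, here is a toy expectation-maximization loop in Python that splits each ambiguous read between the transcripts it maps to, in proportion to their current abundance estimates. It is only a sketch: the real tools also model effective transcript length, fragment length and biases, and the function name `em_abundance` and its input format are made up for this example. It assumes every read maps to at least one transcript.

```python
def em_abundance(read_compat, n_transcripts, n_iter=200):
    """Toy EM for multi-mapped reads (ignores transcript length and bias).

    read_compat: list of tuples; each tuple holds the indices of the
                 transcripts one read is compatible with (at least one).
    Returns the expected read count per transcript.
    """
    # Start from equal relative abundances
    theta = [1.0 / n_transcripts] * n_transcripts
    expected = [0.0] * n_transcripts
    for _ in range(n_iter):
        # E-step: share each read among its transcripts by current abundance
        expected = [0.0] * n_transcripts
        for hits in read_compat:
            total = sum(theta[t] for t in hits)
            for t in hits:
                expected[t] += theta[t] / total
        # M-step: new abundances are the normalized expected counts
        theta = [c / len(read_compat) for c in expected]
    return expected

# 8 reads unique to transcript 0, 2 unique to transcript 1, 10 ambiguous
reads = [(0,)] * 8 + [(1,)] * 2 + [(0, 1)] * 10
est = em_abundance(reads, n_transcripts=2)
```

In this toy case the unique reads suggest a 4:1 ratio, so the ambiguous reads end up split 8:2 instead of being discarded or divided evenly.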

As a personal note, my workflow is to map to the assembly using bowtie, estimate abundance using RSEM, and then normalize with edgeR's TMM method and test for differential expression in edgeR.
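Between the abundance-estimation step and edgeR you need one transcripts-by-samples count matrix. Trinity ships a helper script for this (see the wiki page linked above); the Python sketch below just shows the idea. It assumes per-sample tab-delimited tables with `transcript_id` and `expected_count` columns, as in RSEM's isoforms output; for kallisto's `abundance.tsv` the columns would be `target_id` and `est_counts`. The sample names, transcript IDs and values are invented, and only the two columns used are shown.

```python
import csv
import io

def read_counts(handle, id_col="transcript_id", count_col="expected_count"):
    """Read one per-sample abundance table into {transcript_id: count}."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_col]: float(row[count_col]) for row in reader}

def merge_counts(per_sample):
    """Combine {sample: {transcript_id: count}} into a matrix.

    Transcripts missing from a sample's table are given a count of 0.
    Returns (transcript ids, sample names, rows of counts).
    """
    ids = sorted(set().union(*per_sample.values()))
    names = list(per_sample)
    matrix = [[per_sample[n].get(t, 0.0) for n in names] for t in ids]
    return ids, names, matrix

# Made-up tables standing in for per-sample output files
demo = {
    "neg_ctrl_1": "transcript_id\texpected_count\nTR1_i1\t10.00\nTR1_i2\t5.50\n",
    "treat_1": "transcript_id\texpected_count\nTR1_i1\t3.00\nTR2_i1\t7.00\n",
}
per_sample = {name: read_counts(io.StringIO(text)) for name, text in demo.items()}
ids, names, matrix = merge_counts(per_sample)
```

With real data you would pass open file handles instead of `io.StringIO`, one per sample, and write the matrix out for edgeR to read.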

written 2.0 years ago by Jake Warner

2.0 years ago, theobroma22 wrote:

You can use Rsubread, which ties into limma/edgeR, so you can normalize using voom or limma-trend, depending on the library sizes. Then test for differential expression. I am also no expert, but I was able to use this pipeline successfully to do exactly what you are trying to do. Hope this helps.
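On "depending on the library sizes": the limma documentation suggests limma-trend when library sizes are fairly similar (roughly within 3-fold of each other) and voom when they vary more. A small Python sketch of that check is below. The function name and the example library sizes are made up, and the 3-fold cutoff is a rule of thumb rather than a hard limit.

```python
def suggest_limma_method(lib_sizes, max_ratio=3.0):
    """Pick voom or limma-trend from the spread of library sizes.

    lib_sizes: total read counts per sample.
    max_ratio: largest/smallest ratio still considered "similar"
               (rule-of-thumb value, adjust as you see fit).
    Returns (suggested method, observed ratio).
    """
    ratio = max(lib_sizes) / min(lib_sizes)
    method = "limma-trend" if ratio <= max_ratio else "voom"
    return method, ratio

# Nine hypothetical libraries at about 25M reads each
method, ratio = suggest_limma_method(
    [24e6, 25e6, 26e6, 23e6, 27e6, 25e6, 24e6, 26e6, 25e6]
)
```

With triplicates all sequenced to about 25M reads, the ratio will probably be close to 1, so either approach should be reasonable.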
