Question: How to normalize GTEx gene counts with DESeq2?
6 months ago, kakukeshi wrote:


I want to normalize the gene counts from GTEx using the variance stabilizing transformation (VST), but I'm confused about which variables I should include in the "design" when creating the DESeqDataSet. For the moment I'm doing the following:

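library(DESeq2)

# Build the DESeq2 data set from the raw counts and the sample annotations
dds <- DESeqDataSetFromMatrix(countData = gtex,
                              colData = sampledata,
                              design = ~ tissue)

# Remove genes with a total count of 0 or 1 across all samples
dds <- dds[rowSums(counts(dds)) > 1, ]

# VST; blind = FALSE lets the dispersion estimation use the design (tissue)
vsd <- vst(dds, blind = FALSE)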

However, this only takes the different tissues into account during the normalization. My question is: should I do it like this and include all the tissues, or should I do it separately for each tissue and use something like ~ 1? Should I also include other variables, like the experimental batch or the post-mortem interval (PMI)?

Many thanks


When you set blind = TRUE, I'm pretty sure you are not considering the different tissues.

written 6 months ago by swbarnes2

Oops! It's corrected now.

written 6 months ago by kakukeshi
6 months ago, Kevin Blighe wrote:

What you use as the design formula will depend, in part, on your end goals: are you aiming to perform differential expression analysis (DEA) across the GTEx tissues or do you just want to normalise and transform the data for other downstream tools? For DEA, obviously you have to include your condition of interest in the formula.

In the past, I input GTEx raw count data for just a single cancer type to DESeq2 but was not interested in any DEA. I therefore just used the intercept-only formula: ~ 1.
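As a rough illustration of the single-tissue, intercept-only route, here is a minimal sketch. It runs on simulated counts from DESeq2's makeExampleDESeqDataSet() rather than real GTEx data, and the tissue labels and object names (dds_liver, vsd_liver) are made up for the example.

```r
library(DESeq2)

# Simulated counts stand in for a GTEx count matrix (assumption: not real data)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Hypothetical tissue labels; in practice they come from the sample annotations
dds$tissue <- factor(rep(c("liver", "lung", "muscle"), each = 4))

# Intercept-only design: no sample variable is used when estimating dispersions
design(dds) <- ~ 1

# Keep the samples of a single tissue
dds_liver <- dds[, dds$tissue == "liver"]

# Remove genes with a total count of 0 or 1 across the remaining samples
dds_liver <- dds_liver[rowSums(counts(dds_liver)) > 1, ]

# With ~ 1 as the design, blind = TRUE and blind = FALSE do the same thing
vsd_liver <- vst(dds_liver, blind = TRUE)

# Matrix of transformed values (roughly log2 scale) for downstream tools
mat <- assay(vsd_liver)
dim(mat)
```

Note that vst() estimates the dispersion trend from a subset of genes (1000 by default), so the filtered object needs to keep well over a thousand expressed genes; for very small matrices varianceStabilizingTransformation() is the function to call instead.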

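You could include tissue, if you wish, and also other factors that you believe may bias the counts, such as batch. With blind = FALSE for rlog() or vst(), as swbarnes2 implies, the transformation will then 'see' the design formula: the dispersions are estimated using the design, so differences explained by the design variables are not treated as unwanted noise when the mean-dispersion trend is fitted. The design variables themselves are not regressed out of the transformed values.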
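For the all-tissues case with extra covariates, something along these lines should work. This is only a sketch on simulated counts: the tissue and batch columns are hypothetical stand-ins for the real sample metadata, and which covariates belong in the design is a judgment call for your own data.

```r
library(DESeq2)

# Simulated counts stand in for a GTEx count matrix (assumption: not real data)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Hypothetical annotations: 3 tissues x 2 batches, balanced so the design is full rank
dds$tissue <- factor(rep(c("liver", "lung", "muscle"), each = 4))
dds$batch  <- factor(rep(c("b1", "b2"), times = 6))

# Nuisance variable first, variable of interest last (the usual DESeq2 convention)
design(dds) <- ~ batch + tissue

# Remove genes with a total count of 0 or 1 across all samples
dds <- dds[rowSums(counts(dds)) > 1, ]

# blind = FALSE: dispersions are estimated with the design above
vsd <- vst(dds, blind = FALSE)

# blind = TRUE: the design is ignored (treated as ~ 1) for the transformation
vsd_blind <- vst(dds, blind = TRUE)

# Same dimensions either way; only the fitted dispersion trend can differ
dim(assay(vsd))
dim(assay(vsd_blind))
```

Either way, assay(vsd) still contains the tissue and batch differences; the design only changes how the dispersion trend behind the transformation is estimated.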
