DESeq sample organization
9.4 years ago

I am studying differential transcript expression between cancer and adjacent tissues. My samples are organized as follows:

            sample  type
Barcode_01  01      NT
Barcode_03  02      GC
Barcode_04  02      AD
Barcode_05  03      GC
Barcode_06  03      AD
Barcode_07  04      AD
Barcode_08  04      GC
Barcode_09  05      AD
Barcode_10  05      GC

where GC is gastric cancer tissue and AD is the adjacent tissue. I also have one non-cancerous sample (NT) that I wish to compare. Thus I need to compare:

AD x GC (where I need to account for within-sample variation)

NT x AD

NT x GC

However, when I load my data into DESeq, it returns the following error:

raw <- DESeqDataSetFromMatrix(count, sample.data, ~ type + sample)

Error in DESeqDataSet(se, design = design, ignoreRank) :
  the model matrix is not full rank, so the model cannot be fit as specified.
  one or more variables or interaction terms in the design formula
  are linear combinations of the others and must be removed

Is there some way to organize my data so that my comparison accounts for within-sample variation?

software-error DESeq RNA-Seq
9.4 years ago
Michael Love ★ 2.6k

Devon is right: this analysis is complicated by the fact that sample 1 and NT are confounded, so there's no way to model both effects.

There is a way to hack the column data to fit a model which controls for the sample differences in the GC and AD samples. Make sure that NT is the base level of type (see vignette). Add a column to the column data: sample.nested = factor(c(1,1,1, 2,2, 3,3, 4,4)). Then use a design of ~ sample.nested + type, and run DESeq(dds, modelMatrixType="standard"). The AD vs GC results table is straightforward (use 'contrast'); the ones involving NT are a bit more complicated. If you were to ask for a simple contrast, results(dds, contrast=c("type","AD","NT")), this would only give the comparison within samples 1 and 2. You have to add 1/4 of each of the sample.nested effects listed in resultsNames(dds). So the numeric contrast for the AD vs NT comparison, for example, should be results(dds, contrast=c(0,1/4,1/4,1/4,1,0)).
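
To make the nested-column trick concrete, here is a minimal base-R sketch that rebuilds the sample table from the question, adds sample.nested, and inspects the resulting model matrix. It only uses model.matrix() and qr(), so it runs without DESeq2. The object names ad.vs.nt and gc.vs.nt are mine, and the reading of the numeric contrast as "average AD row minus the NT row" is my own arithmetic on the design matrix, not something taken from the DESeq2 documentation. The GC vs NT vector is written by analogy with the AD vs NT one.

```r
# Sample table from the question (Barcode_01 is the only NT sample).
sample.data <- data.frame(
  sample = factor(c("01", "02", "02", "03", "03", "04", "04", "05", "05")),
  type   = factor(c("NT", "GC", "AD", "GC", "AD", "AD", "GC", "AD", "GC")),
  row.names = sprintf("Barcode_%02d", c(1, 3:10))
)

# NT must be the base level of type.
sample.data$type <- relevel(sample.data$type, ref = "NT")

# Nested sample column: the NT sample is grouped with sample 02.
sample.data$sample.nested <- factor(c(1, 1, 1, 2, 2, 3, 3, 4, 4))

# Standard model matrix for the suggested design.
mm <- model.matrix(~ sample.nested + type, data = sample.data)
print(colnames(mm))

# This design is full rank: as many independent columns as coefficients.
full.rank <- qr(mm)$rank == ncol(mm)

# Numeric contrasts, in the column order printed above.
ad.vs.nt <- c(0, 1/4, 1/4, 1/4, 1, 0)
gc.vs.nt <- c(0, 1/4, 1/4, 1/4, 0, 1)  # by analogy (assumption)

# The AD vs NT vector is the mean AD design row minus the NT design row.
ad.rows <- mm[sample.data$type == "AD", ]
nt.row  <- mm[sample.data$type == "NT", ]
derived <- unname(colMeans(ad.rows) - nt.row)

# With DESeq2 loaded and a count matrix at hand, the calls would be:
# dds <- DESeqDataSetFromMatrix(count, sample.data, ~ sample.nested + type)
# dds <- DESeq(dds, modelMatrixType = "standard")
# res <- results(dds, contrast = ad.vs.nt)
```

The coefficient names that resultsNames(dds) reports may be formatted differently from the model.matrix() column names, so check that the order matches before passing a numeric contrast.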


I've tried to perform this analysis in two steps.

First I loaded the data normally using the design ~ type and performed the AD x NT and GC x NT comparisons.

Then I loaded the data excluding the first sample, using the design ~ type + sample, and compared AD x GC. Is this wrong, or should I use your trick?

9.4 years ago

You need to remove sample 1. A model with both it and the NT type can't be fit, since you can't discriminate between the sample 1 effect and the NT effect.
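
The confounding can be checked directly on the design matrix. Below is a minimal base-R sketch (no DESeq2 needed) that rebuilds the sample table from the question and compares the number of coefficients with the matrix rank, first for the original design and then with sample 1 removed. The names mm.full, paired and mm.paired are mine.

```r
# Sample table from the question (Barcode_01 is the only NT sample).
sample.data <- data.frame(
  sample = factor(c("01", "02", "02", "03", "03", "04", "04", "05", "05")),
  type   = factor(c("NT", "GC", "AD", "GC", "AD", "AD", "GC", "AD", "GC")),
  row.names = sprintf("Barcode_%02d", c(1, 3:10))
)

# Original design: 7 coefficients but only 6 independent columns,
# because the NT indicator is identical to the sample 01 indicator.
mm.full <- model.matrix(~ type + sample, data = sample.data)
n.coef  <- ncol(mm.full)
mm.rank <- qr(mm.full)$rank
print(c(coefficients = n.coef, rank = mm.rank))

# Remove sample 1 and drop the now-empty factor levels; otherwise an
# all-zero typeNT column would stay in the model matrix.
paired <- droplevels(sample.data[sample.data$sample != "01", ])
mm.paired <- model.matrix(~ sample + type, data = paired)
print(c(coefficients = ncol(mm.paired), rank = qr(mm.paired)$rank))
```

With sample 1 removed, the paired design is full rank, so the AD x GC comparison can be fit while controlling for sample.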
