Question: What logFC or adjusted p-value cutoff should be chosen when the number of samples in one of the conditions is very small?
22 months ago, nazaninhoseinkhan (400), Iran, Islamic Republic Of, wrote:

Dear all,

I ran DESeq2 on 52 tumor vs. only 3 normal samples.

Applying cutoffs of adjusted p-value <= 0.05 and |logFC| >= 1 resulted in 1450 up-regulated vs. 440 down-regulated genes.

Now my question is: has this large number of deregulated genes been caused by the very small number of normal samples?

How can I tackle the problem of the small number of normal samples? Is it reasonable to apply more stringent cutoffs on logFC or adjusted p-values?

Is it acceptable to work with this large number of deregulated genes and report them?

I am looking forward to your comments.


modified 22 months ago by Kevin Blighe (60k) • written 22 months ago by nazaninhoseinkhan (400)
22 months ago, Kevin Blighe (60k) wrote:

For a tumour versus normal comparison, I think that one should expect a large proportion of the transcriptome to be differentially expressed. Your sample numbers are hugely imbalanced, though, which is a limitation of your study.

You can reasonably adjust your thresholds for statistical significance. In fact, I would recommend using |log2FC| >= 2 and adjusted P <= 0.01. Basically, you are the analyst here, and you should adjust the thresholds to suit the downstream analyses that you (or your collaborators) intend to perform.

It would be interesting to see how you normalised the data and conducted the differential expression analysis. Note that, in recent versions of DESeq2, it is recommended to run lfcShrink as a separate step and not to use betaPrior:

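# Fit the model without a beta prior on the fold changes
dds <- DESeq(dds, betaPrior=FALSE)

# Extract tumour vs. normal results (parallel=TRUE assumes a BiocParallel backend is registered)
res <- results(dds, contrast=c("Tissue", "Tumour", "Normal"), independentFiltering=TRUE, alpha=0.01, pAdjustMethod="BH", parallel=TRUE)

# Shrink the log2 fold changes in a separate step
# (newer DESeq2 releases default to type="apeglm", which takes coef rather than contrast; with contrast, type="normal" or type="ashr" is needed)
res <- lfcShrink(dds, contrast=c("Tissue", "Tumour", "Normal"), res=res)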
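
Once a results table exists, the suggested thresholds could be applied roughly as follows. This is only a minimal sketch in base R: the data frame is a made-up stand-in for a DESeq2 results table (only the column names log2FoldChange and padj match DESeq2 output), and classify_de is a hypothetical helper, not a DESeq2 function.

```r
# Toy stand-in for a DESeq2 results table; gene names and values are invented.
res_df <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  log2FoldChange = c(3.1, -2.4, 0.8, -2.9, 2.2),
  padj = c(0.001, 0.004, 0.0005, 0.2, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label each gene as "up", "down", or "ns" (not significant).
classify_de <- function(df, lfc_cutoff = 2, padj_cutoff = 0.01) {
  # padj can be NA (e.g. genes removed by independent filtering);
  # such genes are treated as not significant.
  sig <- !is.na(df$padj) &
    df$padj <= padj_cutoff &
    abs(df$log2FoldChange) >= lfc_cutoff
  ifelse(sig & df$log2FoldChange > 0, "up",
         ifelse(sig & df$log2FoldChange < 0, "down", "ns"))
}

res_df$status <- classify_de(res_df)

# Count genes per category, keeping empty categories visible.
status_counts <- table(factor(res_df$status, levels = c("up", "down", "ns")))
print(status_counts)
```

Keeping the cutoffs as function arguments makes it easy to re-run the count under several threshold pairs and see how sensitive the up/down split is to the choice.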


written 22 months ago by Kevin Blighe (60k)

Hi Kevin,

Yes, I have run the lfcShrink function. However, when I checked the results, strangely, no down-regulated genes were detected.

So I preferred to use the results of dds <- DESeq(dds) instead. As you suggested, I used |log2FC| >= 2 and adjusted P <= 0.01; however, no down-regulated genes were detected.

I have another question. I am running DESeq2 on different races. The numbers of tumor and normal samples differ greatly between races. Should I use the same cutoffs (adjusted p-value and logFC) for all races? I want to compare the results between the races.

Thank you so much


written 22 months ago by nazaninhoseinkhan (400)

You only have 3 normals, though? How many races are in your dataset?

Note that I published recently on this topic: Racial differences in endometrial cancer molecular portraits in The Cancer Genome Atlas.

written 22 months ago by Kevin Blighe (60k)

I am trying to analyze 3 races: Asian (50T, 3N), White (330T, 50N) and Black/African-American (28T, 4N).

I have also analyzed the "not reported" group, but I am not sure whether I should include it in the analysis.

And thank you for the paper. I will read it as soon as possible.

written 22 months ago by nazaninhoseinkhan (400)

If you want to explore differences between the different races, then you could normalise all samples together and include race + tissue in your design formula.
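
To make the design formula concrete, here is a minimal base R sketch of what race + tissue expands to. The sample sheet is invented for illustration (six placeholder samples, with White and Normal chosen arbitrarily as reference levels). In DESeq2, the same formula would be passed as the design argument when building the dataset, e.g. with DESeqDataSetFromMatrix.

```r
# Made-up sample sheet; the labels and reference levels are assumptions.
col_data <- data.frame(
  race = factor(c("Asian", "Asian", "White", "White", "Black", "Black"),
                levels = c("White", "Asian", "Black")),
  tissue = factor(c("Normal", "Tumour", "Normal", "Tumour", "Normal", "Tumour"),
                  levels = c("Normal", "Tumour"))  # Normal is the reference level
)

# Additive design: race is adjusted for, tissue is the effect of interest.
design_formula <- ~ race + tissue

# Expand the formula into the model matrix that a linear model would use.
design_matrix <- model.matrix(design_formula, data = col_data)
print(colnames(design_matrix))
```

With this additive design, a single tumour vs. normal effect is estimated while accounting for race. Testing whether the tumour effect itself differs between races would need an interaction term (race:tissue) in addition.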

modified 22 months ago • written 22 months ago by Kevin Blighe (60k)

As you suggested, I initially wanted to normalize all samples together; however, because I had to run the analysis on my laptop, I ran it separately for each race. I will try repeating the analysis with all samples normalized together and compare the results.

written 22 months ago by nazaninhoseinkhan (400)