Question: Differential expression with very imbalanced groups
Lesdormis0 wrote:

Hi guys, I have a question similar to this post: https://stackoverflow.com/questions/56840541/differential-expression-with-very-imbalanced-groups

I am trying to perform differential gene expression analysis on public RNA-seq data with 775 samples in total. I would like to compare 15 samples of interest against the rest of the samples, so the two groups are very imbalanced. My boss suggested using a method similar to the one in this paper: https://www.nature.com/articles/nature25171.pdf?origin=ppub To sum up, their method, called "ee-MWW", splits the bigger group into multiple small subsets with the same sample size as the smaller group, performs the Mann-Whitney-Wilcoxon test on each of them, and produces a value that can be ranked to select the significant genes. I also tried DESeq2, but neither set of results makes biological sense to us: both contain a lot of microRNA genes and pseudogenes.
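To make the subsampling idea concrete, here is a minimal sketch for a single gene. It is not the paper's ee-MWW implementation; it only follows the description above. It assumes random draws of size-matched subsets from the larger group, and it summarises the draws by averaging the Mann-Whitney U statistic scaled to the range 0 to 1. The function names, `n_draws`, and `seed` are hypothetical.

```python
import random
from statistics import mean


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def mww_auc(x, y):
    """Mann-Whitney U of x versus y, scaled to [0, 1] (0.5 means no shift)."""
    ranks = rank_with_ties(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u = rank_sum_x - len(x) * (len(x) + 1) / 2
    return u / (len(x) * len(y))


def subsampled_mww(small, large, n_draws=200, seed=0):
    """Average the scaled U over random size-matched subsets of `large`.

    Assumption: subsets are drawn at random and the mean is used as the
    per-gene score; the paper may combine the subsets differently.
    """
    rng = random.Random(seed)
    return mean(
        mww_auc(small, rng.sample(large, len(small))) for _ in range(n_draws)
    )


# Toy expression values for one gene (made up for illustration).
small_group = [9.1, 8.7, 9.4]
large_group = [5.0, 5.2, 4.8, 5.5, 5.1, 4.9, 5.3]
score = subsampled_mww(small_group, large_group)
```

Scores near 1 or 0 indicate a consistent shift up or down in the small group, and scores near 0.5 indicate no shift, so genes could be ranked by their distance from 0.5. This ranking does not by itself give a calibrated p-value.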

So does anyone know a more appropriate way to do differential gene expression analysis with very unbalanced sample groups? Any suggestions and ideas would be much appreciated.

rna-seq R • 223 views
written 8 months ago by Lesdormis

Do the results make sense if you filter for protein-coding genes? Do you know of genes you expect to be differentially expressed, and if so, how do they behave?
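As a rough illustration of the protein-coding filter, the sketch below assumes the results are a list of per-gene records and that a gene-to-biotype lookup has already been built from an annotation file (for example the `gene_biotype` attribute in an Ensembl GTF or `gene_type` in GENCODE). The function name, the record layout, and the scores are made up for the example.

```python
def keep_protein_coding(results, biotype_by_gene):
    """Keep result rows whose gene is annotated as protein coding.

    Genes missing from the lookup are dropped as well.
    """
    return [
        row for row in results
        if biotype_by_gene.get(row["gene"]) == "protein_coding"
    ]


# Hypothetical lookup; in practice this would come from the annotation file.
biotype_by_gene = {
    "TP53": "protein_coding",
    "MIR21": "miRNA",
    "PTENP1": "processed_pseudogene",
}

# Toy result table with made-up scores.
results = [
    {"gene": "MIR21", "score": 0.98},
    {"gene": "TP53", "score": 0.91},
    {"gene": "PTENP1", "score": 0.88},
]

filtered = keep_protein_coding(results, biotype_by_gene)
```

Filtering after testing only changes which genes are shown. If the filter is applied before testing instead, the multiple-testing correction is computed over fewer genes.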

When you have so many samples, technical artefacts can be a huge problem. Are you controlling for confounding effects (batch, gender, age, etc.)?

Could you post a PCA plot of the samples coloured by your groups?

written 8 months ago by kristoffer.vittingseerup