differential gene expression analysis using log2(counts+1)
2.5 years ago
elb ▴ 200

Hi guys, I know that DESeq and edgeR perform differential gene expression analysis of RNA-seq experiments using counts. I have normalised my data frame using an independent method (qsmooth, which is not part of the pipeline of either package). Both the input and the output of the normalisation are log2(counts). Now I would like to perform differential gene expression analysis between the conditions of my dataset using the quasi-likelihood negative binomial method. I know that the voom function of limma transforms counts to log2-counts per million for subsequent linear modelling. However, it is not clear to me whether this applies to my situation. Can anyone please explain or suggest how to proceed with differential expression analysis using my log2-normalised counts?
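
For reference, the two log2 scales mentioned here can be written out explicitly. This is only a minimal sketch in Python (the same arithmetic carries over to R): `log2_plus_one` is the transform named in the title, and `log2_cpm` uses the offsets that, as far as I know, voom applies by default (0.5 added to each count, 1 added to the library size). The function names and the toy counts are made up for illustration, and the sketch reproduces neither qsmooth nor voom's precision weights.

```python
import math

def log2_plus_one(counts):
    """log2(count + 1) for each gene in one sample."""
    return [math.log2(c + 1) for c in counts]

def undo_log2_plus_one(values):
    """Inverse of log2_plus_one: 2^x - 1."""
    return [2 ** v - 1 for v in values]

def log2_cpm(counts, count_offset=0.5, lib_offset=1.0):
    """log2 counts per million for one sample.

    The offsets are an assumption about voom's defaults, not taken
    from this thread.
    """
    lib_size = sum(counts)
    return [
        math.log2((c + count_offset) / (lib_size + lib_offset) * 1e6)
        for c in counts
    ]

# Hypothetical raw counts for four genes in a single sample.
sample = [0, 3, 15, 982]

logged = log2_plus_one(sample)
restored = undo_log2_plus_one(logged)
cpm = log2_cpm(sample)
```

The back-transform only returns integer counts if the values were not altered in between. After a quantile-based normalisation the result is generally non-integer, which matters for tools that expect raw counts.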

Thank you a lot!

rna-seq R • 1.6k views

I changed the post to Question and also removed the Job tag, as this is no job offer.

Oh! Sorry! Thank you very much!

No problem :)

Ciao, can you plot the distribution of your data via the hist() function? How does it appear?
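
If plotting is inconvenient, the same check can be approximated by binning the values and counting them. A minimal sketch in Python follows, assuming the log2 values for one sample are available as a plain list; `text_histogram`, the bin width, and the toy values are all hypothetical and not part of qsmooth or any package mentioned here.

```python
import math
from collections import Counter

def text_histogram(values, bin_width=1.0):
    """Count values per bin and render one text row per bin."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    lines = []
    for b in range(min(bins), max(bins) + 1):
        lower = b * bin_width
        # Counter returns 0 for bins that received no values.
        lines.append(f"[{lower:5.1f}, {lower + bin_width:5.1f}) {'#' * bins[b]}")
    return bins, lines

# Made-up log2 expression values for a handful of genes.
values = [0.0, 0.2, 0.9, 1.5, 3.1]

bins, lines = text_histogram(values)
for line in lines:
    print(line)
```

A tall first bin in such an output would correspond to a large number of genes with very low counts.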

qsmooth is relatively new. Do the authors not indicate how best to perform differential expression comparisons with the data?

Ciao! Of course! I will edit my question!!

It looks like there are many variables (genes) with low counts in the data. Did you do any pre-filtering for low counts prior to normalisation?
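
To make the idea of pre-filtering concrete, here is a minimal sketch in Python, assuming the raw (un-normalised) counts are held as a mapping from gene name to per-sample counts. The rule used (keep a gene if it reaches `min_count` in at least `min_samples` samples) is one common convention, and both thresholds are arbitrary illustrative values, not a recommendation for this dataset.

```python
def filter_low_counts(count_table, min_count=10, min_samples=2):
    """Keep genes with >= min_count reads in >= min_samples samples."""
    return {
        gene: counts
        for gene, counts in count_table.items()
        if sum(c >= min_count for c in counts) >= min_samples
    }

# Hypothetical raw counts: three genes across four samples.
toy_counts = {
    "geneA": [0, 1, 0, 2],      # low everywhere
    "geneB": [25, 31, 4, 18],   # expressed in three samples
    "geneC": [12, 0, 0, 0],     # passes the threshold in one sample only
}

kept = filter_low_counts(toy_counts)
print(sorted(kept))
```

How strict the thresholds should be depends on the experiment, so they would need to be chosen with the genes of interest in mind.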

I am not yet familiar with qsmooth. I am just wondering why you would not directly use DESeq2, limma/voom, or edgeR, each of which can both normalise your data and perform differential expression analysis. I looked at the manuscript for qsmooth, published this year, and the authors do not mention which test is most suitable for differential expression analysis of qsmooth-normalised data.

Dear Dr. Blighe, I did not use that package to complicate my life, but simply because the distribution of expression values is group-dependent. Moreover, the genes we are interested in are poorly expressed, so I think the canonical normalisation methods could flatten the differences because of the mean "smoothing" they perform.

So, you are saying that the low-count genes in the histogram are the main genes of interest? Can you provide some background about the experiment?

See "How to add images to a Biostars post" to add your images properly. You need the direct link to the image, not the link to the webpage that has the image embedded (which is what you have used here).