Help interpreting p-value distribution
3 months ago

Hello everyone!

I am using Transcriptome Analysis Console (TAC) for differential expression (DE) testing of gene expression data, and R for downstream analyses and visualization. I generated volcano plots for three separate datasets (let's call them A, B, and C), and the distribution of p-values does not look uniform: there are some gaps, as can be seen in the images below. Even in dataset B, whose p-value distribution looks good, there are no genes with -log10(p) around 4.

I have never seen this pattern before and I am not sure what to make of it. What do you think it indicates? Could it be poor data quality?

[Image: volcano plots for datasets A, B, and C]

microarray gene expression differential • 164 views
3 months ago

That looks like a systematic error introduced by the processing itself. Perhaps a filtering step went wrong, or some normalization step kicked in.

The main point is that the pattern is systematic rather than the result of a natural error process.

It is hard to see how merely poor-quality, noisy data would, on its own, produce missing p-values in just a narrow band.
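
One way to make "missing p-values in a narrow band" concrete is to bin -log10(p) and list the empty bins that sit between occupied ones. The following is only a rough sketch in Python (the same idea ports directly to R); the function name `find_empty_bands`, the bin width, and the simulated p-values are assumptions for illustration, not TAC output or data from this thread. Keep in mind that empty bins in the sparse upper tail can occur by chance, so a gap is only suspicious when the bins on both sides are well populated.

```python
import numpy as np

def find_empty_bands(p_values, bin_width=0.1):
    """Return (low, high) ranges of -log10(p) that hold no values
    but lie between bins that do hold values."""
    p = np.asarray(p_values, dtype=float)
    # Keep only valid p-values; zeros would give infinite -log10(p).
    p = p[np.isfinite(p) & (p > 0) & (p <= 1)]
    if p.size == 0:
        return []
    neg_log_p = -np.log10(p)

    # Fixed-width bins from 0 up to just past the largest value.
    n_bins = int(np.ceil(neg_log_p.max() / bin_width)) + 1
    edges = np.arange(n_bins + 1) * bin_width
    counts, edges = np.histogram(neg_log_p, bins=edges)

    occupied = np.flatnonzero(counts > 0)
    gaps = []
    start = None
    # Scan only between the first and last occupied bins.
    for i in range(occupied[0], occupied[-1] + 1):
        if counts[i] == 0:
            if start is None:
                start = i
        elif start is not None:
            gaps.append((float(edges[start]), float(edges[i])))
            start = None
    return gaps

# Simulated example (hypothetical data): uniform p-values with one band
# of -log10(p) removed on purpose to mimic a processing artifact.
rng = np.random.default_rng(0)
p_sim = rng.uniform(size=20000)
y = -np.log10(p_sim)
p_with_gap = p_sim[~((y >= 1.0) & (y < 1.6))]

gaps = find_empty_bands(p_with_gap, bin_width=0.1)
for low, high in gaps:
    print(f"no values with -log10(p) in [{low:.1f}, {high:.1f})")
```

A gap reported at low -log10(p), where thousands of genes would normally fall, points toward a filtering or processing step; a gap reported only near the top of the range may simply reflect how few genes reach that level.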

