How to get p-value from differential analysis involving normal and tumor samples?
5.3 years ago

Hello Everyone,

I have samples of several types of biological data (mRNA, miRNA, and DNA methylation), each measured under two conditions: normal and tumor.

I would like to run a differential analysis between the two conditions across all samples to obtain a p-value for each gene.

I thought of using DESeq, but I cannot because my data are already normalized. For example, the mRNA data have been normalized and log2-transformed, as you can see below:

Gene N1 T1 N2 T2
ARHGEF10L 3.3151314 3.2328449 3.2583983 3.4465871
HIF3A 3.0830942 1.9722883 3.2255372 1.5074648
RNF17 -0.7374466 -1.6201573 -1.3785693 -4.2487934
RNF10 3.5662794 3.5837116 3.5824115 NA

From what I read in the DESeq documentation, it needs a table of raw read counts, which I do not have. Is there another statistical test that compares the two conditions (normal and tumor) across all samples and gives a p-value for each gene?

The final result that I would like to obtain is like this:

Gene p-value
ARHGEF10L 0.2342
HIF3A 0.676
RNF17 0.892
RNF10 0.1243

Can anyone suggest how to do this in Python?

Thank you very much for all the attention!

differential analysis p-value python • 1.3k views

Python, Perl, R, whatever: you can do a t-test for each gene to get the p-value, and then you need to adjust the p-values for multiple comparisons with a method such as FDR (Benjamini-Hochberg).
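
Since the question asks for Python, here is a minimal sketch of the per-gene test, assuming SciPy is installed. The `data` dictionary and the `welch_p_value` helper are hypothetical names used only for illustration; the values are the four example rows from the question. Welch's unequal-variance test is used because that is also the default of R's `t.test`. Genes with fewer than two non-missing values in either group get NaN instead of a p-value.

```python
import math
from scipy import stats

# Example rows from the question: gene -> (normal values, tumor values).
data = {
    "ARHGEF10L": ([3.3151314, 3.2583983], [3.2328449, 3.4465871]),
    "HIF3A": ([3.0830942, 3.2255372], [1.9722883, 1.5074648]),
    "RNF17": ([-0.7374466, -1.3785693], [-1.6201573, -4.2487934]),
    "RNF10": ([3.5662794, 3.5824115], [3.5837116, float("nan")]),
}

def welch_p_value(normal, tumor):
    """Two-sided Welch t-test p-value; NaN if a group has < 2 usable values."""
    normal = [v for v in normal if not math.isnan(v)]
    tumor = [v for v in tumor if not math.isnan(v)]
    if len(normal) < 2 or len(tumor) < 2:
        return float("nan")
    return float(stats.ttest_ind(normal, tumor, equal_var=False).pvalue)

# One p-value per gene, in the "Gene p-value" layout the question asks for.
p_values = {gene: welch_p_value(n, t) for gene, (n, t) in data.items()}
for gene, p in p_values.items():
    print(f"{gene}\t{p:.4g}")
```

With only two samples per group, as in the example table, such a test has very little power, so real data with more samples per condition would be needed for meaningful p-values.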

In R, it's straightforward and simple:

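# Assumes the columns are: Gene, N1, T1, N2, T2 (as in the question's table)
d <- read.delim("mydatafile.txt")

# Welch t-test per gene; returns NA when the test cannot be computed
pvals <- apply(d, 1, function(x) {
  rst <- try(
    t.test(as.numeric(c(x[2], x[4])),   # normal samples: N1, N2
           as.numeric(c(x[3], x[5]))),  # tumor samples: T1, T2
    silent = TRUE
  )
  if (inherits(rst, "try-error")) NA else rst$p.value
})

# Benjamini-Hochberg (FDR) adjustment for multiple testing
FDRs <- p.adjust(pvals, method = "fdr")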
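
For a Python version of the adjustment step, the Benjamini-Hochberg procedure (which R's `p.adjust` calls "fdr" or "BH") can be sketched in plain Python. `benjamini_hochberg` is a hypothetical helper name, and leaving NaN p-values out of the number of tests is an assumption of this sketch, not something stated in the thread.

```python
import math

def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values; NaN inputs are skipped and stay NaN."""
    indexed = [(p, i) for i, p in enumerate(p_values) if not math.isnan(p)]
    m = len(indexed)  # number of tests actually performed
    adjusted = [float("nan")] * len(p_values)
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1),
    # scaling each by m / rank and keeping the adjusted values monotone.
    for rank, (p, i) in zip(range(m, 0, -1), sorted(indexed, reverse=True)):
        running_min = min(running_min, p * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, float("nan"), 0.5]
print(benjamini_hochberg(raw))
```

Genes whose adjusted value falls below the chosen FDR threshold (0.05 is a common choice) are then reported as differentially expressed.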


Hope this helps.


Once you have the p-values for your data, you may want to look at what Open Targets has in its Platform, e.g. differential RNA expression as one of the pieces of evidence for HIF3A in cancer.

5.3 years ago

limma expects normalized, log2-transformed data, so you can use your data there directly. We've had some luck using it for methylation data as well, though note that you should do the statistics on logit-transformed data.
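
As a rough illustration of the logit transform mentioned above, here is a small Python sketch. It assumes the methylation data are beta values (proportions between 0 and 1) and uses a base-2 logit, which is what is commonly called an M-value; `beta_to_m_value` and the clipping constant `eps` are hypothetical choices made for this example.

```python
import math

def beta_to_m_value(beta, eps=1e-6):
    """Base-2 logit of a methylation beta value.

    The beta value is clipped to [eps, 1 - eps] so that exact 0 or 1
    does not produce an infinite result.
    """
    beta = min(max(beta, eps), 1.0 - eps)
    return math.log2(beta / (1.0 - beta))

for beta in (0.1, 0.5, 0.8):
    print(beta, beta_to_m_value(beta))
```

The transformed values are unbounded and roughly symmetric around zero, which suits the linear models that limma fits better than raw proportions do.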

5.3 years ago

Hi Devon!

Thank you very much for your ideas! I implemented the t-test and it worked very well! :)