Discrepancy in differentially expressed genes p-adj
5.0 years ago

Hi, I recently analysed RNA-seq data using UCSC knownGene as the annotation file and found several differentially expressed genes (DEGs). I then confirmed the DEGs (p-adj <= 0.05) by real-time PCR and found the RNA-seq and real-time PCR data to be consistent. But when I analysed the same data with the same pipeline using the GENCODE annotation files instead, the genes that were down with the UCSC knownGene annotation were still down (with some exceptions), but their p-adj was > 0.05 (for one gene it is 0.22). I want to understand why there is a discrepancy between the two analyses, and which p-adj values I should use.

Any suggestion would be appreciated.

RNA-Seq rna-seq p-adj cutoff • 1.5k views

Could you please state how many features were tested in each of the two RNA-seq analyses?


~76000 in UCSC knownGene and 206694 in GENCODE, if that is what you asked.


More features mean more tests to account for during multiple testing correction, which tends to raise the adjusted p-value a gene gets for the same raw p-value.
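To make that concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. It is not the code of any particular pipeline, and it assumes the p-adj values in question are BH-adjusted. The ten "signal" p-values and the feature counts (300 vs 816, chosen only to mimic the roughly 2.7-fold ratio between the two annotations above) are made up for illustration, and giving every extra feature a raw p-value of 1.0 is a deliberate simplification.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def padj_of_first_gene(n_features, signal_pvalues):
    """Pad the signal genes with uninformative features (p = 1.0, a toy
    assumption) up to n_features and return the first gene's BH p-adj."""
    padding = [1.0] * (n_features - len(signal_pvalues))
    return bh_adjust(list(signal_pvalues) + padding)[0]


# Hypothetical raw p-values for ten genes: 0.0001, 0.0002, ..., 0.001.
signal = [0.0001 * k for k in range(1, 11)]

padj_small = padj_of_first_gene(300, signal)  # smaller annotation
padj_large = padj_of_first_gene(816, signal)  # larger annotation

print(f"p-adj with 300 features: {padj_small:.4f}")  # 0.0300
print(f"p-adj with 816 features: {padj_large:.4f}")  # 0.0816
```

The raw p-value of the gene is identical in both runs; only the number of tested features changes, and that alone moves it across the 0.05 cutoff in this toy case.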


Thank you for the responses. Even if multiple testing is the source of the discrepancy, the analysis in the latter case is underestimating the true positive results. In such a case, what would be the appropriate approach to detect all the true positives? I mean, had I done the analysis with the GENCODE annotation only, I would be missing out on several targets.


Thank you for the response.


If you did qRT-PCR, I'd probably just use the p-values from that, given that it's a more direct method. The discrepancy could come from many things. A big one, if you're doing gene-level analysis, is isoforms that one annotation set contains and the other doesn't. Plus, GENCODE has many more lncRNAs and similar features annotated, which affects the normalization and the multiple testing adjustment in a not insignificant way. Generally, it's a good idea to stick with one set of annotations throughout a project; otherwise you will run into issues such as this.



In order to detect all true positives, you will have to let go of the p-value cutoff, fully embracing the presence of false positives.
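As a rough illustration of that trade-off, the sketch below counts true and false positives at a few p-adj cutoffs. The gene list is entirely hypothetical, and the `True`/`False` labels stand in for a ground truth that is never actually known in a real experiment.

```python
# Hypothetical (p-adj, truly_changed) pairs; the truth labels are invented
# for illustration and would not be available in a real analysis.
genes = [
    (0.001, True), (0.01, True), (0.04, True), (0.03, False),
    (0.08, True), (0.22, True), (0.15, False), (0.30, False),
    (0.60, True), (0.45, False), (0.90, False), (0.75, False),
]


def count_calls(genes, cutoff):
    """Return (true positives, false positives) among genes with p-adj <= cutoff."""
    tp = sum(1 for padj, truth in genes if padj <= cutoff and truth)
    fp = sum(1 for padj, truth in genes if padj <= cutoff and not truth)
    return tp, fp


for cutoff in (0.05, 0.25, 1.0):
    tp, fp = count_calls(genes, cutoff)
    print(f"cutoff {cutoff}: {tp} true positives, {fp} false positives")
```

In this made-up list, only the cutoff of 1.0 (that is, no cutoff at all) recovers every truly changed gene, and it does so by also calling every unchanged gene.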

There is no binary answer here -- if you had performed the analysis with GENCODE, you'd probably just have picked other candidates for follow-up tests via qPCR. For a good primer on how the choice of annotation influences the RNA-seq analysis outcome, you could read Zhao & Zhang (2015).
