Question: Discrepancy in differentially expressed genes p-adj
15 months ago by
saddamhusain770 wrote:

Hi, I recently analysed RNA-seq data using UCSC knowngene as the annotation file and found several differentially expressed genes (DEGs). I then confirmed the DEGs (p-adj <= 0.05) by real-time PCR and found the RNA-seq data and the real-time PCR data to be consistent. But when I analysed the same data with the same analysis pipeline using the GENCODE annotation files, the genes that were down with the UCSC knowngene annotation were still down (except for some), but their p-adj was > 0.05 (for one of the genes it is 0.22). I want to understand why there is a discrepancy between the two analyses, and which p-adj values I should consider.

Any suggestions would be appreciated.

rna-seq p-adj cutoff • 413 views
modified 15 months ago • written 15 months ago by saddamhusain770

Could you state how many features were tested in each of the two RNA-seq analyses, please?

written 15 months ago by russhh5.4k

~76000 in UCSC knowngene and 206694 in GENCODE, if that is what you asked.

written 15 months ago by saddamhusain770

More features lead to more elements to consider during multiple testing correction.
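To make this concrete, here is a minimal sketch of Benjamini-Hochberg adjustment showing how the same raw p-value can end up with a different adjusted p-value when more features are tested. The raw p-value of 5e-7 and the evenly spaced "null" p-values are made-up assumptions for illustration; only the two feature counts are taken from this thread. A real pipeline (e.g. DESeq2) also applies filtering steps that this sketch ignores.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def padj_for_gene(raw_p, n_features):
    """Adjusted p-value of one gene tested alongside n_features - 1 others.

    Assumption: all other features are unchanged, modelled here by
    evenly spaced p-values in (0, 1].
    """
    n_null = n_features - 1
    nulls = [(k + 1) / n_null for k in range(n_null)]
    return bh_adjust([raw_p] + nulls)[0]


RAW_P = 5e-7  # hypothetical raw p-value for one gene of interest

small_padj = padj_for_gene(RAW_P, 76_000)    # ~ UCSC knowngene feature count
large_padj = padj_for_gene(RAW_P, 206_694)   # GENCODE feature count

print(f"p-adj with  76,000 features: {small_padj:.4f}")
print(f"p-adj with 206,694 features: {large_padj:.4f}")
```

Under these assumptions the gene passes a 0.05 cutoff with the smaller annotation (p-adj of about 0.038) but not with the larger one (about 0.103), even though its raw p-value is identical in both runs.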

written 15 months ago by ATpoint36k

Thank you for the responses. Even if multiple testing is the source of the discrepancy, the analysis in the latter case is underestimating the true positive results. In such a case, what would be the appropriate approach to detect all the true positives? I mean, had I done the analysis with the GENCODE annotation only, I would have missed several targets.

written 15 months ago by saddamhusain770

Thank you for the response.

written 15 months ago by saddamhusain770

If you did qRT-PCR, I'd probably just use the p-values from that, given that it's a more direct method. The discrepancy could come from many things. A big one, if you're doing gene-level analysis, is isoforms that one annotation set contains and the other doesn't. Plus, GENCODE has many more lncRNAs and such annotated, which affects the normalization and the multiple-testing adjustment in a not insignificant way. Generally, it's a good idea to stick with a given set of annotations throughout a project, otherwise you will run into issues such as this.

modified 15 months ago • written 15 months ago by jared.andrews076.2k

Thank you for the responses. Even if multiple testing is the source of the discrepancy, the analysis in the latter case is underestimating the true positive results. In such a case, what would be the appropriate approach to detect all the true positives? I mean, had I done the analysis with the GENCODE annotation only, I would have missed several targets.

modified 15 months ago • written 15 months ago by saddamhusain770

In order to detect all true positives, you will have to let go of the p-value, fully embracing the presence of false positives.

There is no binary answer here -- if you had performed the analysis with GENCODE, you'd probably have just picked other candidates for follow-up tests via qPCR. For a good primer on how the choice of annotation will influence the RNA-seq analysis outcome, you could read Zhao & Zhang, 2015.
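The trade-off between catching every true positive and admitting false positives can be illustrated with a small simulation. This is only a sketch: the numbers of changed and unchanged genes and the p-value distributions are assumptions chosen for illustration, not estimates from any real dataset.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_TRUE, N_NULL = 200, 5000  # hypothetical counts of changed / unchanged genes

# Assumed p-value distributions: uniform for unchanged genes,
# skewed towards 0 for truly changed genes (u**4 with u uniform).
true_p = [random.random() ** 4 for _ in range(N_TRUE)]
null_p = [random.random() for _ in range(N_NULL)]


def calls_at(cutoff):
    """Count true and false positives called at a raw p-value cutoff."""
    tp = sum(p <= cutoff for p in true_p)
    fp = sum(p <= cutoff for p in null_p)
    return tp, fp


cutoffs = [0.001, 0.01, 0.05, 0.2, 1.0]
results = {c: calls_at(c) for c in cutoffs}

for c, (tp, fp) in results.items():
    print(f"cutoff {c:<5}: {tp:3d}/{N_TRUE} true positives, "
          f"{fp:4d}/{N_NULL} false positives")
```

Only the loosest cutoff is guaranteed to recover every truly changed gene, and at that point every unchanged gene is called as well. Any stricter cutoff trades some true positives for fewer false ones.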

written 15 months ago by Friederike5.8k