How to check whether specific mutations are enriched in an RNA-seq data set?
6.8 years ago
Lila M ★ 1.2k

Hi everybody, I would like to know whether there is a way to test if a specific mutation (let's say p53) is enriched in a data set. This is the first time I have tried to answer this kind of question, and I don't know exactly how to approach it. I have been reading about GSAR (Gene Set Analysis in R), but I am not sure whether it covers this.

Any idea?

Thanks!

RNA-Seq enriched mutation • 1.5k views

Shouldn't you do variant calling to identify mutations? I'm not following the logic in this thread.

GSAR is for gene set networking, PPI, and other analyses relating to differentially expressed or candidate genes. Can you clarify whether you're looking for something at the gene level, or at point mutations within genes?

Hi, I am looking for something at the gene level. In more detail: I have two groups, and I would like to know whether, for example, the p53 mutation is enriched in either of the groups.

6.8 years ago

If you know which single gene you are testing for then you are in luck as you can apply a much simpler statistical test (no multiple-test correction is needed).

You could apply a "regular" differential expression test but then use the p-value column rather than the adjusted p-value for this gene. Or you could directly apply a binomial test on the observed counts.
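
As a rough illustration of the binomial test mentioned above, here is a minimal sketch in plain Python. It assumes you have the raw read count for the gene in each of the two conditions plus the total library size of each condition; the function names and the library-size normalisation are my own assumptions, not something taken from a particular package. Note that a binomial test ignores variability between biological replicates, so treat the result as a quick check rather than a replacement for a proper differential expression tool.

```python
import math

def binom_log_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials (via lgamma for stability)."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def binomial_test_two_sided(count_a, count_b, libsize_a, libsize_b):
    """Two-sided exact binomial test for one gene across two conditions.

    Null hypothesis (an assumption of this sketch): each read for the gene
    falls in condition A with probability libsize_a / (libsize_a + libsize_b).
    """
    n = count_a + count_b
    p_null = libsize_a / (libsize_a + libsize_b)
    log_observed = binom_log_pmf(count_a, n, p_null)
    tolerance = 1e-7  # guards against floating-point ties
    total = 0.0
    # Sum the probabilities of all outcomes that are no more likely than the observed one.
    for k in range(n + 1):
        log_p = binom_log_pmf(k, n, p_null)
        if log_p <= log_observed + tolerance:
            total += math.exp(log_p)
    return min(1.0, total)

# Hypothetical counts: 10 reads in condition A, 0 in condition B, equal library sizes.
p_value = binomial_test_two_sided(10, 0, 1_000_000, 1_000_000)
print(p_value)
```

With several replicates per condition you would have to sum the counts per condition first, which is exactly where the between-replicate variability is lost.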

Thank you very much, Istvan. What I have is a gene set derived from differential expression analysis (HTSeq) and DEU for two different treatments. I would like to know whether p53 mutations are enriched in these two groups. Could you please give me an example (or point me to some literature) of how to do this kind of analysis? Thank you very much.

HTSeq produces the counts. I don't know what format your differential expression results are in, but typically the table has columns with the average expression for each condition, a p-value, and an adjusted p-value. In this case, since you are only looking at one gene that you selected a priori, you can use the p-value column. That p-value is the probability of observing a change at least as large as the reported one by chance alone, that is, if there were no real difference between the conditions.
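
To make that concrete, here is a small sketch of pulling the raw p-value for one pre-selected gene out of a tab-separated results table. The table contents are made up for illustration, and the column names (`gene`, `pvalue`, `padj`) are assumptions; your tool's output may use different names.

```python
import csv
import io

# Made-up results table for illustration only; the column names are assumptions.
toy_results = """gene\tmean_A\tmean_B\tpvalue\tpadj
TP53\t120.5\t310.2\t0.012\t0.21
GENE_A\t50.0\t48.7\t0.80\t0.95
"""

def lookup_gene(table_text, gene, gene_col="gene"):
    """Return the row (as a dict) for the given gene, or None if it is absent."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    for row in reader:
        if row[gene_col] == gene:
            return row
    return None

row = lookup_gene(toy_results, "TP53")
raw_p = float(row["pvalue"])  # the unadjusted p-value for the pre-selected gene
print(raw_p)
```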

There is no literature that you would need to consult to be allowed to make this choice. The question is whether the method selected the "interesting" gene out of all possible options, or whether you selected the gene beforehand. If a method produces the "interesting" gene by looking at all possible options, then you have to correct for multiple comparisons (use the adjusted p-value). Otherwise, you don't.
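
Here is a toy sketch of the difference between the two situations. It assumes the adjustment is Benjamini-Hochberg, which is a common choice but may not be what your tool uses; the gene names other than TP53 and all p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping the adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values for illustration only.
toy_pvalues = {"TP53": 0.01, "GENE_A": 0.04, "GENE_B": 0.03, "GENE_C": 0.50}
genes = list(toy_pvalues)
adjusted = dict(zip(genes, benjamini_hochberg([toy_pvalues[g] for g in genes])))

def pvalue_to_report(gene, chosen_in_advance):
    """Raw p-value for a gene picked beforehand, adjusted p-value for a gene found by screening."""
    return toy_pvalues[gene] if chosen_in_advance else adjusted[gene]

print(pvalue_to_report("TP53", chosen_in_advance=True))
print(pvalue_to_report("TP53", chosen_in_advance=False))
```

The same gene gets a larger p-value when it was found by screening all genes, and the gap grows with the number of genes tested.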

Thank you for the explanation! I will do it!
