I am having a hard time preparing somatic mutations for downstream analysis.
I called some somatic mutations and grouped them for my analysis. I then tried to determine whether they are cancer-related using several sequencing filters and public databases. In the process, I ran into some questions about dbSNP.
As I understand it, dbSNP lists variants, with population frequencies, that have been reported only in healthy people. Suppose I have a mutation at chr1:1000 that I think is dominant and cancer-related, because the same mutation was found at the same position in other samples and it passed all the filters I used. (All the samples have the same cancer type, so I regard mutations detected across samples as recurrent mutations, not artifacts.) I then searched dbSNP for my detected mutations and found the same mutations there, at the same positions and with the same base changes.
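To make the lookup concrete, here is a minimal Python sketch of the kind of matching I mean. The records and the names (`annotate`, `dbsnp_af`, `cosmic`) are made up for illustration; in practice the entries would be loaded from the dbSNP and COSMIC files rather than typed in by hand. The key point is that a match requires the same chromosome, position, reference base, and alternate base, not just the same position.

```python
from typing import Dict, Optional, Set, Tuple

# A variant is identified by (chromosome, position, ref base, alt base).
Variant = Tuple[str, int, str, str]


def annotate(
    variant: Variant,
    dbsnp_af: Dict[Variant, Optional[float]],
    cosmic: Set[Variant],
) -> dict:
    """Report whether a variant is present in each (hypothetical) lookup table."""
    return {
        "in_dbsnp": variant in dbsnp_af,
        # Allele frequency is None if the variant is absent
        # or if the record carries no frequency.
        "dbsnp_af": dbsnp_af.get(variant),
        "in_cosmic": variant in cosmic,
    }


# Made-up example data, NOT real database content.
dbsnp_af = {("chr1", 1000, "A", "G"): 0.0002}
cosmic = {("chr1", 1000, "A", "G")}

calls = [
    ("chr1", 1000, "A", "G"),  # same position and same base change
    ("chr1", 1000, "A", "T"),  # same position, different base change
    ("chr2", 500, "C", "T"),   # not in either table
]

results = {v: annotate(v, dbsnp_af, cosmic) for v in calls}
for v, info in results.items():
    print(v, info)
```

Only the first call is flagged as present in both tables; the second shares the position but not the base change, so it is not treated as a match.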
In this situation, my questions are as follows:
1) Can I ignore these mutations because they already exist in dbSNP?
2) Is there any other evidence I should check in dbSNP, such as allele frequencies or population-level data, before removing them?
3) What if my detected mutations are in both COSMIC and dbSNP? How should I treat them in my downstream analysis?
(My assumption is that COSMIC reports cancer mutations, while dbSNP contains variants found in healthy people.)