This may well be a layman's question.
I have a MAF file with sequencing data from lymphoma specimens. I have no information on the tumor purity of the samples, and there are no matched normal samples. Germline variants have been filtered out by matching against the 1000 Genomes Project data. Each remaining mutation is mapped to COSMIC and tagged as pathogenic, likely pathogenic, or unknown.
I have read about the various tools for estimating sample purity from sequencing data (e.g., CNVkit, THetA2, FACETS).
However, I was wondering whether there is an approach that exploits the fact that COSMIC-mapped somatic mutations should be present only in the tumor cells, so that their variant allele frequencies (VAFs) could be used to estimate tumor purity and to normalize the allele frequencies accordingly.
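
To make the idea concrete, here is a minimal sketch of the kind of calculation I have in mind. It rests on strong assumptions that may not hold for these samples: each mutation used is clonal (present in every tumor cell), heterozygous, and located in a copy-neutral diploid region, in which case the expected VAF is purity / 2. The function names and the `min_vaf` cutoff are hypothetical and only for illustration.

```python
from statistics import median


def estimate_purity(vafs, min_vaf=0.05):
    """Rough purity estimate from somatic VAFs.

    Assumes clonal, heterozygous mutations in diploid regions,
    so that expected VAF = purity / 2.
    """
    # Drop very low VAFs (likely noise or subclonal) and VAFs above 0.5
    # (likely LOH or copy-number gain, which violate the assumption).
    kept = [v for v in vafs if min_vaf <= v <= 0.5]
    if not kept:
        raise ValueError("no VAFs left after filtering")
    # The median is less sensitive to a few remaining outliers than the mean.
    return min(1.0, 2.0 * median(kept))


def cancer_cell_fraction(vaf, purity):
    """Normalize a VAF to the fraction of tumor cells carrying the mutation.

    Same assumptions as above: one mutant copy out of two, diploid locus.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    return min(1.0, 2.0 * vaf / purity)


if __name__ == "__main__":
    # Made-up VAFs for pathogenic / likely pathogenic COSMIC hits in one sample.
    sample_vafs = [0.30, 0.32, 0.28, 0.31, 0.02, 0.70]
    purity = estimate_purity(sample_vafs)
    print(round(purity, 2))
    print(round(cancer_cell_fraction(0.15, purity), 2))
```

With these made-up values the estimate comes out at roughly 0.61. Without copy-number information there is no way to check the diploid assumption, which is part of why I am asking whether an established approach exists.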