Question: Identify Germline Mutations
yliuboston20 wrote:

Hi everyone,

I am trying to identify germline mutations from tumor and matched normal samples in TCGA, and I also need to know exactly which samples each germline mutation occurs in. I know that software such as VarScan can do this, but those tools all work on .bam files and I only have .vcf files. Is there any software or protocol that can do the following? 1. Identify germline mutations in a group of tumor and matched normal samples. 2. For each germline mutation identified, report which samples it was called in.

Thanks in advance!

1) VarScan actually takes pileup input (.pileup or mpileup output generated from the .bam by SAMtools); it doesn't read .bam files directly.

2) If you already have a list of variants in .vcf files, I don't think you can call somatic (or germline) mutations by doing anything beyond comparing the overlap between the two files, which is something I would do with a custom Perl script: variants shared by the tumor and its matched normal are germline candidates, and tumor-only variants are somatic candidates. If at all possible, I would try to find a rawer form of the data and use standard tools like VarScan, MuTect, SomaticSniper, etc.
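
As a rough illustration of the overlap comparison described above, here is a minimal sketch, written in Python rather than Perl. It assumes one VCF per sample (tumor and normal called separately against the same reference), that variants are represented identically in both files, and that everything fits in memory. All function names are hypothetical, and this is not a substitute for a proper joint caller.

```python
from collections import defaultdict

def variant_keys(vcf_lines):
    """Return the set of (chrom, pos, ref, alt) keys found in VCF text lines."""
    keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):  # one key per alternate allele
            keys.add((chrom, pos, ref, alt))
    return keys

def germline_candidates(tumor_lines, normal_lines):
    """Variants present in both tumor and matched normal (naive germline candidates)."""
    return variant_keys(tumor_lines) & variant_keys(normal_lines)

def samples_per_variant(candidates_by_sample):
    """Invert {sample: set of keys} into {key: sorted list of samples carrying it}."""
    index = defaultdict(list)
    for sample, keys in candidates_by_sample.items():
        for key in keys:
            index[key].append(sample)
    return {key: sorted(samples) for key, samples in index.items()}
```

Running `germline_candidates` once per tumor/normal pair and passing the results to `samples_per_variant` gives, for each candidate variant, the list of patients it was found in. The exact-match key is deliberately strict: indels written differently in the two files will not match unless the VCFs are normalized first.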

It smells like a homework problem. He has two VCF files and just needs to subtract one from the other.

Thanks a lot for your explanation. I think I have a .vcf file that already compares one tumor with its matched normal, so I guess the "subtraction" job has been done. But I am not sure which of the called germline mutations I can confidently use in the downstream analysis: should I keep those with FILTER=PASS, or those with DP above a threshold? If I do not have access to a rawer form of the data, is there a standard way to do this? Thanks in advance!
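
For reference, the two criteria mentioned above (FILTER=PASS and a depth cutoff) could be applied to VCF text along the following lines. This is only a sketch and not a standard protocol: the function names are hypothetical, the default `min_dp=10` is an arbitrary placeholder rather than a recommended value, and it assumes depth is reported as `DP=` in the INFO column. Some callers report depth only in the per-sample FORMAT columns, in which case this would need adapting.

```python
def info_depth(info):
    """Return the integer DP value from a VCF INFO string, or None if absent."""
    for entry in info.split(";"):
        if entry.startswith("DP="):
            try:
                return int(entry[3:])
            except ValueError:
                return None  # malformed depth value
    return None

def passes(line, min_dp=10):
    """True if a VCF data line has FILTER == PASS and INFO DP >= min_dp."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False  # not a complete VCF record
    if fields[6] != "PASS":
        return False
    depth = info_depth(fields[7])
    return depth is not None and depth >= min_dp

def filter_vcf(lines, min_dp=10):
    """Keep header lines unchanged and only the data lines that pass both checks."""
    return [ln for ln in lines if ln.startswith("#") or passes(ln, min_dp)]
```

Records with no DP annotation are dropped here rather than kept, which is the conservative choice; whether that is appropriate depends on how the VCF was produced.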

