I am now analyzing TCGA data and have been approved for data access. I am interested in germline mutations from whole genome sequencing (WGS) data and was able to download VCF files for individual samples.
But there were only several hundred variants in each per-sample VCF file, far fewer than I expected. Is this the usual number of variants per sample from WGS?
My second question: since these are VCF files, how can I know the genotype at a certain marker? If a person carries a variant allele at that marker, there should be a line for that marker in their VCF file. But if they carry only the reference allele at that marker, their VCF file would not contain it. So how can I tell whether "no marker" means "there is no variant" or "that marker was not called due to low quality"?
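To make the second question concrete, here is a minimal sketch of the lookup I have in mind. It assumes a plain-text, single-sample VCF with a FORMAT column and one sample column; the function name `lookup_genotype` and the example records are made up for illustration and are not real TCGA data.

```python
def lookup_genotype(vcf_lines, chrom, pos):
    """Return the GT string at (chrom, pos), or None if the VCF has no record there."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns, so no genotype to read
        if fields[0] == chrom and int(fields[1]) == pos:
            # Pair FORMAT keys (e.g. GT:DP) with the sample's values (e.g. 0/1:30)
            sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
            return sample.get("GT")
    return None


# Hypothetical records, for illustration only
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1",
    "chr1\t12345\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:30",
    "chr2\t67890\t.\tC\tT\t99\tPASS\t.\tGT:DP\t1/1:42",
]

print(lookup_genotype(example_vcf, "chr1", 12345))  # 0/1
print(lookup_genotype(example_vcf, "chr1", 99999))  # None
```

The `None` in the second call is exactly the ambiguity I am asking about: from the VCF alone I cannot tell whether that position is homozygous reference or was simply not callable.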
Any comment would help. Thank you!