Question

Inferring zygosity information from MAF

9.8 years ago

newDNASeqer ▴ 760

I've heard that heterozygosity or homozygosity cannot be inferred for the mutations reported in MAF files, but columns 11-13 of a MAF file seem to carry information that could be used for this. For example:

Reference_Allele     Tumor_Seq_Allele1     Tumor_Seq_Allele2
G                    G                     C
G                    G                     C
A                    A                     C
G                    G                     A
C                    C                     T
C                    C                     A

Comparing the reference and tumor alleles, can we infer that all of these mutations are heterozygous, because one tumor allele is the same as the reference allele? Is this understanding correct?

Ram · Answer 1 · 2014-08-04

9.7 years ago

Cyriac Kandoth 6.0k

In an ideal world, yes. These fields are meant to represent zygosity. But for TCGA MAFs, as explained in this comment: It's often assumed that somatic point mutations or small indels in cancer are infrequent enough that they almost always result in a homozygous site becoming heterozygous. Combine that imperfect assumption with someone's good old-fashioned indifference, and you get MAF columns 12 and 13.

A more reliable way to infer zygosity would be to look at REF/ALT allele counts or fractions. There is no standard column that stores these numbers in TCGA MAFs, but these are some of the various column headers I have seen used: t_alt_count, t_ref_count, NVarCov, NTotCov, tumors_var_reads, tumor_ref_reads.
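As a rough sketch of the allele-count approach, the snippet below computes the variant allele fraction (VAF) from `t_ref_count` and `t_alt_count` and turns it into a tentative zygosity label. The function names, the 0.85 cutoff, the minimum depth of 10, and the example rows are all illustrative assumptions, not part of any MAF specification. Tumor purity, copy-number changes, and subclonality all shift the VAF, so treat the labels as hints rather than calls.

```python
import csv
import io


def variant_allele_fraction(ref_count, alt_count):
    """Return alt / (ref + alt), or None when there is no coverage."""
    total = ref_count + alt_count
    if total == 0:
        return None
    return alt_count / total


def guess_zygosity(ref_count, alt_count, hom_min_vaf=0.85, min_depth=10):
    """Tentative label from read counts.

    hom_min_vaf and min_depth are hypothetical defaults chosen for
    illustration; tune them for your own data.
    """
    vaf = variant_allele_fraction(ref_count, alt_count)
    if vaf is None or ref_count + alt_count < min_depth:
        return "unknown"
    if vaf >= hom_min_vaf:
        return "likely_homozygous"
    return "likely_heterozygous"


# Made-up example rows; the count column names vary between MAFs.
example_maf = (
    "Reference_Allele\tTumor_Seq_Allele2\tt_ref_count\tt_alt_count\n"
    "G\tC\t50\t48\n"
    "A\tC\t2\t60\n"
    "C\tT\t3\t2\n"
)

for row in csv.DictReader(io.StringIO(example_maf), delimiter="\t"):
    label = guess_zygosity(int(row["t_ref_count"]), int(row["t_alt_count"]))
    print(row["Reference_Allele"], row["Tumor_Seq_Allele2"], label)
```

If your MAF uses different headers (for example `NVarCov` and `NTotCov`), derive the reference count as total minus variant coverage before calling the function.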

I am still very confused. So what is the actual meaning of Tumor_Seq_Allele1 and Tumor_Seq_Allele2?

4.4 years ago by 1378917721

They are the two alleles observed in the tumor sequencing data; the MAF format assumes the tumor is diploid at that site.
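To make the diploid reading concrete, here is a minimal sketch of what the three allele columns nominally encode. The function name and labels are hypothetical. As the answer above points out, many TCGA MAFs fill these columns by assumption, so this only reports what the file claims, not what the reads support.

```python
def nominal_zygosity(ref, allele1, allele2):
    """Interpret the MAF allele columns under the diploid assumption."""
    if allele1 == allele2:
        # Both tumor alleles agree: either unchanged or homozygous variant.
        return "reference" if allele1 == ref else "homozygous_variant"
    # The two tumor alleles differ from each other.
    return "heterozygous"


print(nominal_zygosity("G", "G", "C"))
print(nominal_zygosity("G", "C", "C"))
```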

4.3 years ago by Cyriac Kandoth 6.0k