Genotype representation for male chromosomes X and Y
4.6 years ago

Hi friends,

My question may seem trivial to others. I am working with VCF files, and I recently got confused about how genotypes are represented on the male X and Y chromosomes.

Humans are diploid, so for all autosomal chromosomes we have the following genotypes:

1) 0/0 - Both alleles are the reference base (one allele on each chromosome of the pair)

2) 0/1 - The first allele is the reference base and the second is an alternate base (one chromosome carries ref, its pair carries alt)

3) 1/1 - Both alleles are the same alternate base (both chromosomes of the pair carry alt)

4) 1/2, 1/3, 1/4, and so on - The two alleles are different alternate bases (sites with more than one ALT allele)
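To make the encoding in the list above concrete, here is a minimal sketch that splits a GT string into allele indices and labels it. The function names `parse_gt` and `describe_gt` are made up for illustration and are not part of any VCF library; the only facts relied on are that alleles are separated by `/` (unphased) or `|` (phased) and that `.` marks a missing allele.

```python
import re

def parse_gt(gt):
    """Split a VCF GT string into allele indices; '.' becomes None."""
    return [None if a == "." else int(a) for a in re.split(r"[/|]", gt)]

def describe_gt(gt):
    """Return a short label for a GT string (hypothetical helper)."""
    alleles = parse_gt(gt)
    if None in alleles:
        return "missing"
    if len(alleles) == 1:
        # A single allele index means a haploid call, e.g. "0" or "1".
        return "haploid_ref" if alleles[0] == 0 else "haploid_alt"
    if len(set(alleles)) > 1:
        # Different indices, e.g. 0/1 or 1/2, mean a heterozygous call.
        return "het"
    return "hom_ref" if alleles[0] == 0 else "hom_alt"

for gt in ["0/0", "0/1", "1/1", "1/2", "1", "./."]:
    print(gt, describe_gt(gt))
```

The number of allele indices in the string is the ploidy of the call, which is why a haploid genotype is written as a single index with no separator.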

For males, chrX and chrY should have haploid calls, so the genotype should be:

• GT : 0 instead of 0/0

• GT : 1 instead of 1/1

• GT : 2 instead of 1/2

So why do the VCFs show 0/1, 1/1, 1/2, etc., just as on the autosomal chromosomes?

snps dnaseq vcf genotype • 3.3k views
4.6 years ago

Variant callers don't know that you're sequencing an organism with haploid X/Y chromosomes in some but not all of your samples, so they treat everything as diploid. I've heard of many people skipping at least chromosome Y for this reason.

That's not true of all variant callers. The RTG variant caller is sex-aware and will produce haploid GT values where appropriate, according to the sex specified for each individual (including diploid calls for males within the pseudoautosomal regions, PAR).

RTG also includes a chrstats command which will help identify the sex for those samples where the sex is unknown.
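If your caller is not sex-aware, one possible workaround is to post-process its diploid calls. The sketch below is not how RTG does it; it is a hypothetical helper (`haploidize_gt`) that collapses homozygous diploid calls to haploid for male samples outside the PAR and marks heterozygous calls there as missing, since they are not biologically expected. The PAR coordinates are assumed GRCh38 chrX values and should be checked against your own reference build; PAR on chrY is ignored here for simplicity.

```python
import re

# ASSUMPTION: GRCh38 PAR1/PAR2 on chrX, 1-based inclusive.
# Verify these against the reference build you actually use.
PAR_X_GRCH38 = [(10_001, 2_781_479), (155_701_383, 156_030_895)]

X_NAMES = {"chrX", "X"}
Y_NAMES = {"chrY", "Y"}

def in_par(pos, par_regions=PAR_X_GRCH38):
    """True if a chrX position falls inside an assumed PAR interval."""
    return any(start <= pos <= end for start, end in par_regions)

def haploidize_gt(gt, chrom, pos, sex):
    """Convert a diploid GT to haploid for males on non-PAR chrX/chrY."""
    if sex != "male" or chrom not in (X_NAMES | Y_NAMES):
        return gt  # autosomes and female samples are left unchanged
    if chrom in X_NAMES and in_par(pos):
        return gt  # PAR on chrX behaves like an autosome, keep diploid
    alleles = re.split(r"[/|]", gt)
    if len(alleles) == 1:
        return gt  # already haploid
    if len(set(alleles)) == 1:
        return alleles[0]  # 0/0 -> 0, 1/1 -> 1, ./. -> .
    return "."  # heterozygous call on a haploid region: treat as missing

print(haploidize_gt("1/1", "chrX", 5_000_000, "male"))  # outside PAR
print(haploidize_gt("0/1", "chrX", 1_000_000, "male"))  # inside assumed PAR1
```

Setting unexpected heterozygous calls to missing is a conservative choice; another option would be to keep the allele with more supporting reads, which needs the AD field and is not shown here.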

Good to know, thanks for the info!

Thanks, Devon. I have some samples for which I don't have the sex information.

I read somewhere that males should have many variants on chrY and that the majority of their variants on chrX should be homozygous alternate. Why can't they be heterozygous?

Similarly, females don't have a Y chromosome, so they shouldn't have any variants on chrY, and their variants on chrX can be either heterozygous or homozygous alternate.

Sorry, I am from a computer science background. Can you explain it?

You should be able to tell just from chromosome X. The samples with higher numbers of heterozygous variants are female. As for why, think about how many copies of X and Y males and females each have.
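The heuristic above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the input is an uncompressed, tab-separated multi-sample VCF read as lines, the function name `chrx_het_fraction` is made up, and no cutoff is suggested because a sensible threshold depends on your data. PAR sites are not excluded here, so males will still show a small number of heterozygous calls.

```python
import re
from collections import defaultdict

def chrx_het_fraction(vcf_lines, x_names=("chrX", "X")):
    """Per-sample fraction of non-reference chrX genotypes that are het."""
    samples = []
    het = defaultdict(int)
    variant_calls = defaultdict(int)
    for line in vcf_lines:
        if line.startswith("##"):
            continue  # meta-information lines
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names start at column 10
            continue
        if fields[0] not in x_names:
            continue
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue
        gt_index = keys.index("GT")
        for name, column in zip(samples, fields[9:]):
            alleles = re.split(r"[/|]", column.split(":")[gt_index])
            if len(alleles) != 2 or "." in alleles:
                continue  # skip haploid or missing calls
            if alleles[0] == "0" and alleles[1] == "0":
                continue  # hom-ref carries no information here
            variant_calls[name] += 1
            if alleles[0] != alleles[1]:
                het[name] += 1
    return {s: het[s] / variant_calls[s] for s in samples if variant_calls[s]}

example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tA\tB",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t0/1",
    "chrX\t5000000\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1",
    "chrX\t6000000\t.\tC\tT\t50\tPASS\t.\tGT\t0/1\t1/1",
    "chrX\t7000000\t.\tG\tA\t50\tPASS\t.\tGT\t1/1\t0/0",
]
print(chrx_het_fraction(example))
```

In the toy input, sample A is heterozygous at two of its three variant chrX sites while sample B has none, which is the pattern you would expect from a female and a male sample respectively when the caller treats everything as diploid.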

Since OP is from a computer science background, I would like to point to the pseudoautosomal regions on the X/Y chromosomes: https://en.wikipedia.org/wiki/Pseudoautosomal_region