Question: Missing variants in hg38 lift-over of 1000-genomes data
9 months ago by
gokberk wrote:

Hi everyone,

I've been looking at the hg38-mapped version of chromosome 12 from the 1000 Genomes (phase 3) data. Curiously, some stretches of this lift-over version (longer than several hundred kb) contain no SNPs at all. The region I'm interested in is chromosome 12: 7,500,000-8,000,000. When I run the following command, I get only the header and no SNPs:

tabix -h ALL.chr12_GRCh38.genotypes.20170504.vcf.gz 12:7500000-8000000

By contrast, when I use the same command for most other parts of the VCF, I get the SNP list as expected.
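To see where such empty stretches begin and end, one option is to count records per fixed-size window. The following is only a rough sketch: `count_per_window` is a hypothetical helper name, the 100 kb window size is an arbitrary choice, and it assumes VCF records arrive on standard input (for example from `tabix` run without `-h`, so that no header lines are printed). Windows with no records are simply absent from the output, so a gap shows up as a jump in the window start positions.

```shell
# count_per_window (hypothetical helper): read VCF records on stdin and
# print "window_start record_count" for each window that has records.
# Argument 1 is the window size in bp (defaults to 100000, an arbitrary choice).
count_per_window() {
  awk -v size="${1:-100000}" '
    !/^#/ {                              # skip any header lines
      bin = int($2 / size) * size        # column 2 is POS in a VCF record
      n[bin]++
    }
    END { for (b in n) print b, n[b] }
  ' | sort -n
}

# Example usage (assumes the file and its index are in the current directory):
# tabix ALL.chr12_GRCh38.genotypes.20170504.vcf.gz 12:7000000-8500000 | count_per_window 100000
```

Widening the queried region a little beyond the interval of interest makes it easier to tell whether the variants stop abruptly at particular coordinates.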

So I was wondering whether this is a known issue, or whether I'm doing something wrong as usual.

Any help is much appreciated.

Cheers, Gökberk

1000genomes vcf
9 months ago by
genomax wrote:

Direct hg38 calls are available here. Have you checked into those?


Oh, I didn't know that direct hg38 calls were available. They do indeed seem to contain all the SNPs that were missing from the lift-over. Thanks a lot, genomax!

written 9 months ago by gokberk60
