Why is sample NA18498 missing from the 1000 Genomes GRCh38 VCF file?
12 months ago
octpus616 ▴ 100

Hi, I am trying to compare the 1000 Genomes phase 3 GRCh37 (hg19) VCF file with the GRCh38 (low coverage) VCF file.

I noticed that one sample, NA18498, is present in the GRCh37 file but not in the GRCh38 file. I could not find any documentation or other source that confirms this. Does anyone know why this sample is missing from the GRCh38 VCF file, and where I can find more details about it?

I have also noticed that the GRCh38 file contains 45 samples that are not in the GRCh37 file, such as HG00270, HG03398, and HG03393. I am not sure why they were introduced.
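For anyone who wants to reproduce this kind of comparison, one approach is to read the `#CHROM` header line of each VCF and take the set difference of the sample IDs. The following is only a minimal sketch: the function names and the toy two-line headers are hypothetical illustrations, not the contents of the actual 1000 Genomes files.

```python
import gzip
from typing import Iterable, List, Set, Tuple


def sample_ids(lines: Iterable[str]) -> List[str]:
    """Return the sample IDs listed on the #CHROM header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            # A VCF has 8 fixed columns plus FORMAT; samples start at column 10.
            return line.rstrip("\n").split("\t")[9:]
        if not line.startswith("#"):
            break  # reached the records without seeing a #CHROM line
    raise ValueError("no #CHROM header line found")


def sample_ids_from_file(path: str) -> List[str]:
    """Read sample IDs from a gzip-compressed VCF (hypothetical local path)."""
    with gzip.open(path, "rt") as handle:
        return sample_ids(handle)


def compare(a: Iterable[str], b: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (IDs only in a, IDs only in b)."""
    set_a, set_b = set(a), set(b)
    return set_a - set_b, set_b - set_a


# Toy headers for illustration only; real files list thousands of samples.
fixed = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT"
toy_grch37 = ["##fileformat=VCFv4.1\n", fixed + "\tHG00096\tNA18498\n"]
toy_grch38 = ["##fileformat=VCFv4.2\n", fixed + "\tHG00096\tHG00270\tHG03398\n"]

only_37, only_38 = compare(sample_ids(toy_grch37), sample_ids(toy_grch38))
print(sorted(only_37))  # ['NA18498']
print(sorted(only_38))  # ['HG00270', 'HG03398']
```

With real data, `sample_ids_from_file` would be called once per downloaded `.vcf.gz` file; since every per-chromosome file in a release should carry the same sample columns, checking one file per release is probably enough.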

Thank you for your help.

Regards

Zhang

Resources:

GRCh37 1000 Genomes: http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/

GRCh38 1000 Genomes (low coverage): http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/release/20190312_biallelic_SNV_and_INDEL/

NGS vcf 1000 1kg genome • 547 views
12 months ago
DBScan ▴ 300

If I am not mistaken, the GRCh38 files you refer to contain only unrelated samples, which is why NA18498 is not included.

Yes, but the GRCh37 release also seems to contain only unrelated samples, and NA18498 is included both in the GRCh37 release and in the GRCh38 1000 Genomes (high coverage) unrelated panel.
