Dante Labs VCF analysis: missing CHROM column (UPD: solved!)
5 weeks ago
k.alincha • 0

Hi! Has anyone worked with the VCF files provided by Dante Labs? I'm looking at the .raw.vcf and .filtered.vcf files. The only reference sequences I can see are two contigs, GL000225.1 and GL000192.1, and every variant belongs to one of them.

There's no conventional CHROM column.

I don't know how those coordinates map to chromosomes in the GRCh37 or GRCh38 assemblies.

GL000192 corresponds to chr1_gl000192_random, and GL000225 corresponds to chrUn_gl000225.
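For anyone comparing these names against UCSC-style tracks, here is a minimal sketch of a lookup for the two contigs mentioned here. The dictionary holds only those two correspondences; `GL_TO_UCSC` and `to_ucsc_name` are hypothetical names made up for illustration, not part of any library.

```python
# Only the two correspondences mentioned in this post; extend as needed.
GL_TO_UCSC = {
    "GL000192.1": "chr1_gl000192_random",
    "GL000225.1": "chrUn_gl000225",
}


def to_ucsc_name(contig: str) -> str:
    """Return the UCSC-style name for a known GL contig, else the input unchanged."""
    return GL_TO_UCSC.get(contig, contig)


for name in ("GL000192.1", "GL000225.1", "1"):
    print(name, "->", to_ucsc_name(name))
```

Unknown names are passed through unchanged, so ordinary chromosome names are not altered.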

Moreover, there's no VCF header with meta-information. All I can see is a single top row of colon-separated numbers that are meaningless to me (nothing like the conventional VCF header).

  1. What's the use of those VCFs?
  2. What variants do they contain?
  3. What version of VCF is it, and where can I get the standard meta-information?
  4. Is there a map of those coordinates to any core assembly chromosomes?
  5. Am I right that they contain only a small part of the genome rather than all genomic variants? (GL000225.1 is 211,173 bp and GL000192.1 is 547,496 bp, which obviously doesn't add up to the whole genome.)
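To put the sizes from question 5 in perspective, here is a quick back-of-the-envelope check. The contig lengths are the ones quoted above; the genome size of about 3.1 billion bp is a rough assumption, not an exact assembly length.

```python
# Contig lengths as quoted in the question.
contig_sizes = {"GL000225.1": 211_173, "GL000192.1": 547_496}

# Assumption: rough size of the human genome, for scale only.
APPROX_GENOME_BP = 3_100_000_000

total = sum(contig_sizes.values())
fraction = total / APPROX_GENOME_BP

print(f"Total covered by the two contigs: {total:,} bp")
print(f"Share of the genome (approximate): {fraction:.4%}")
```

The two contigs together cover 758,669 bp, a tiny fraction of a percent of the genome, so a file containing only these contigs cannot be a complete call set.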

UPD: I found my mistake, sorry. The file was incomplete: only its tail had been uploaded to JupyterHub. In my local Jupyter, the file loaded completely, and it does have the header and the CHROM column.
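In case it helps someone else catch a truncated upload early, here is a minimal sketch of the sanity check I should have run first. It uses plain Python only; `summarize_vcf` is a hypothetical helper name, and it assumes an uncompressed, tab-separated VCF read line by line.

```python
def summarize_vcf(lines):
    """Report whether the lines contain VCF meta-information, a #CHROM header, and which contigs occur."""
    fileformat = None
    has_chrom_header = False
    contigs = {}  # dict used as an ordered set of contig names

    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            # The first meta line of a well-formed VCF declares the version.
            if line.startswith("##fileformat="):
                fileformat = line.split("=", 1)[1]
        elif line.startswith("#CHROM"):
            has_chrom_header = True
        else:
            # Data line: the first tab-separated field is the CHROM value.
            contigs.setdefault(line.split("\t", 1)[0], None)

    return {
        "fileformat": fileformat,
        "has_chrom_header": has_chrom_header,
        "contigs": list(contigs),
    }


# Tiny in-memory example that mimics a file with its head cut off.
truncated = ["GL000225.1\t100\t.\tA\tG\t50\tPASS\t.\n"]
print(summarize_vcf(truncated))
```

For a real file, pass an open handle, e.g. `with open(path) as fh: summarize_vcf(fh)`. If `fileformat` is `None` and `has_chrom_header` is `False`, the head of the file is probably missing; comparing the file size on both machines is another quick check.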

vcf coordinate dante assembly chromosome • 180 views