Question: Represent Precise Deletion in VCF
Tomáš Beluský (Brno) wrote, 7.3 years ago:

Hi, for my master's thesis I have to implement a tool for detecting genome variations. I am now learning how to represent structural variations in a VCF file, but after reading the example at http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41 I am a little confused.

Example says:

#CHROM  POS ID REF ALT QUAL FILTER INFO FORMAT NA00001
1 2827693 . CCGTGGATGCGGGGACCCGCATCCCCTCTCCCTTCACAGCTGAGTGACCCACATCCCCTCTCCCCTCGCA C . PASS SVTYPE=DEL;END=2827680;BKPTID=Pindel_LCS_D1099159;HOMLEN=1;HOMSEQ=C;SVLEN=-66 GT:GQ 1/1:13.9

The reference sequence is 70 bp long (69 bp if we drop the first base, which is not part of the deletion), but SVLEN here is -66, and subtracting POS from END gives -13. Shouldn't SVLEN be -69 and END be 2827762? Or maybe I'm missing something here.
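
To make the arithmetic concrete, here is a minimal Python sketch using the values from the record above. It assumes that END should be the last reference base covered by REF (POS + length(REF) - 1) and that SVLEN is length(ALT) - length(REF); the variable names are only for illustration.

```python
# Values copied from the example record in the spec.
pos = 2827693
ref = "CCGTGGATGCGGGGACCCGCATCCCCTCTCCCTTCACAGCTGAGTGACCCACATCCCCTCTCCCCTCGCA"
alt = "C"

# Assumption: END is the last reference base spanned by REF.
expected_end = pos + len(ref) - 1

# Assumption: SVLEN = length(ALT) - length(REF), so it is negative for a deletion.
expected_svlen = len(alt) - len(ref)

print(len(ref), expected_end, expected_svlen)  # prints: 70 2827762 -69
```

Neither value matches what the record reports (END=2827680, SVLEN=-66).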

Thanks for clarifying this.

vcf 1000genomes • 3.0k views

Congratulations on having the same master thesis project as me! :-)

(Reply written 7.3 years ago by PoGibas)

bhandsaker wrote, 7.3 years ago:

Yes, I think this is definitely buggy. I believe this record is supposed to represent rs2376870 (1000 Genomes ID P1M0615101909), which on hg18 should look like:

1 2827694 P1_M_061510_1_909 CGTGGATGCGGGGACCCGCATCCCCTCTCCCTTCACAGCTGAGTGACCCACATCCCCTCTCCCCTCGCA C . . BKPTID=Pindel_LCS_D1099159;END=2827762;HOMLEN=1;HOMSEQ=G;SVLEN=-68;SVTYPE=DEL

I amended the spec, versions 4.1 and 4.2 (draft). I also fixed the typo on the next line where SVLEN should have been -205, not -105.
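
As a sanity check, here is a rough Python sketch that compares the END and SVLEN values in a record against POS, REF and ALT. The function name `check_deletion` is made up for this example, and it assumes whitespace-separated columns as shown in this thread (real VCF files are tab-delimited), END = POS + length(REF) - 1, and SVLEN = length(ALT) - length(REF).

```python
def check_deletion(line):
    """Return a list of INFO values that disagree with POS/REF/ALT.

    Hypothetical helper for illustration only; handles just the
    END and SVLEN keys of a precise deletion record.
    """
    fields = line.split()
    pos, ref, alt, info_text = int(fields[1]), fields[3], fields[4], fields[7]
    # Parse "KEY=VALUE;KEY=VALUE" pairs, skipping flag-style entries.
    info = dict(item.split("=", 1) for item in info_text.split(";") if "=" in item)

    expected_end = pos + len(ref) - 1
    expected_svlen = len(alt) - len(ref)

    problems = []
    if int(info["END"]) != expected_end:
        problems.append(f"END={info['END']}, expected {expected_end}")
    if int(info["SVLEN"]) != expected_svlen:
        problems.append(f"SVLEN={info['SVLEN']}, expected {expected_svlen}")
    return problems


original = ("1 2827693 . CCGTGGATGCGGGGACCCGCATCCCCTCTCCCTTCACAGCTGAGTGACCCACATCCCCTCTCCCCTCGCA C . PASS "
            "SVTYPE=DEL;END=2827680;BKPTID=Pindel_LCS_D1099159;HOMLEN=1;HOMSEQ=C;SVLEN=-66 GT:GQ 1/1:13.9")
amended = ("1 2827694 P1_M_061510_1_909 CGTGGATGCGGGGACCCGCATCCCCTCTCCCTTCACAGCTGAGTGACCCACATCCCCTCTCCCCTCGCA C . . "
           "BKPTID=Pindel_LCS_D1099159;END=2827762;HOMLEN=1;HOMSEQ=G;SVLEN=-68;SVTYPE=DEL")

print(check_deletion(original))  # reports both END and SVLEN
print(check_deletion(amended))   # prints: []
```

Under those assumptions the amended record is internally consistent, while the original example fails on both fields.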


Can we move the docs to a public wiki?

(Reply written 7.3 years ago by Dan)

zam.iqbal.genome (United Kingdom) wrote, 7.3 years ago:

SVLEN is defined as length(alt) - length(ref), so it looks to me like it should be -69 for the record as written, and the -66 is a typo.

Adam (United States) wrote, 7.3 years ago:

It is probably best to ask on the VCF specification mailing list (vcftools-spec@lists.sourceforge.net), but I agree that it's probably a typo.
