Question: Comparing variants in vcf files that have been mapped to GRCh37 and GRCh38
1
gravatar for cl10101
2.3 years ago by
cl1010180
cl1010180 wrote:

I have a VCF format file, which contains variants found for file mapped to hg38 (reference from GATK hg38 bundle) and I would like to compare this variants with VCF file from 1000 genome project, which is mapped to GRCh37. By comparing this files I mean finding variants shared by both files. Are the coordinates in this files the same or should I somehow convert them?

hg19 hg38 vcf • 1.6k views
ADD COMMENTlink modified 2.3 years ago by Pierre Lindenbaum121k • written 2.3 years ago by cl1010180
0
gravatar for Pierre Lindenbaum
2.3 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum121k wrote:

convert the file with picard LiftoverVcf : https://broadinstitute.github.io/picard/command-line-overview.html#LiftoverVcf then compare How to compare 2 VCF files

ADD COMMENTlink written 2.3 years ago by Pierre Lindenbaum121k

Thank you. Unfortunately I can't find liftover chain file for g1k_v37 to hg38. I was trying to use hg19ToHg38.over.chain from http://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/ but output file is empty and all records were rejected. I found out that g1k_v37 is GRCh37 with slight differences.

ADD REPLYlink written 2.3 years ago by cl1010180

hg19=v37 .

you're looking for http://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/hg19ToHg38.over.chain.gz if needed convert the chromosomes names by removing the chr prefix .

ADD REPLYlink written 2.3 years ago by Pierre Lindenbaum121k

Thank you for response. I removed char prefix in hg19ToHg38.over.chain file as you suggested and now I have error:

"Exception in thread "main" htsjdk.tribble.TribbleException: Badly formed variant context at location chr1:789016; getEnd() was 789016 but this VariantContext contains an END key with value 724396"

I would be grateful for any suggestion how to solve this problem.

ADD REPLYlink written 2.3 years ago by cl1010180

show me

"grep -Fw 789016 your.vcf"
ADD REPLYlink written 2.3 years ago by Pierre Lindenbaum121k

there is no output for 789016

ADD REPLYlink written 2.3 years ago by cl1010180

try "grep -Fw 724396 your.vcf"

I got the same problem:

Exception in thread "main" htsjdk.tribble.TribbleException: Badly formed variant context at location 13:32707645; getEnd() was 32707645 but this VariantContext contains an END key with value 33281782

and the corresponding record is:

13  33281782    esv3631727  C   <INS:MT>    100 PASS    AC=154;AF=0.0307508;AN=5008;CIEND=-1,0;CIPOS=0,1;CS=NUMT_umich;END=33281782;NS=2504;SVTYPE=INS;IMPRECISE;DP=15932;EAS_AF=0.0129;AMR_AF=0.0058;AFR_AF=0.0908;EUR_AF=0.001;SAS_AF=0.0164;VT=SV    GT  0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
ADD REPLYlink written 11 months ago by Minstein100
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1603 users visited in the last hour