VCF from GRCh37 to GRCh38: how to manage reference differences
13 months ago
jpuntomarcos ▴ 40

Imagine you have a VCF with variants annotated against the GRCh37 assembly, and you want to convert these variants to GRCh38. New coordinates can be obtained with liftover, and REF/ALT stay the same wherever the GRCh37 and GRCh38 sequences are identical. But what if the sequence differs between the assemblies?

For example, how would you convert the variant 16:23625463 A>T (GRCh37) to GRCh38? At the lifted-over position, 16:23614142, the reference is T. Obviously, something like 16:23614142 T>T does not make sense.

  • Would you convert it to 16:23614142 T>A (if the variant was heterozygous)? If so, every GRCh37-GRCh38 sequence difference at which the VCF reports no variant is actually a variant with respect to GRCh38 as well.

My suspicion is that variants cannot be converted between genome assemblies at positions where the sequences differ.
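To make the cases in the question concrete, here is a minimal Python sketch that classifies a lifted-over SNV by comparing the GRCh38 reference base with the original REF and ALT. The function names `classify_lifted_variant` and `swap_genotype` are hypothetical helpers, not part of any tool, and the sketch assumes a biallelic SNV whose liftover stays on the same strand (no reverse complement).

```python
def classify_lifted_variant(old_ref, alt, new_ref):
    """Classify a biallelic SNV after liftover.

    old_ref: REF base on the source assembly (e.g. GRCh37)
    alt:     ALT base on the source assembly
    new_ref: reference base at the lifted-over position (e.g. GRCh38)
    """
    old_ref, alt, new_ref = old_ref.upper(), alt.upper(), new_ref.upper()
    if new_ref == old_ref:
        # Same reference base: REF/ALT carry over unchanged.
        return "unchanged"
    if new_ref == alt:
        # The new reference carries the old ALT allele: REF and ALT swap,
        # and genotypes must be inverted as well.
        return "ref_alt_swapped"
    # The new reference matches neither allele: no simple conversion.
    return "ref_mismatch"


def swap_genotype(gt):
    """Invert alleles 0 and 1 in a biallelic GT string such as '0/1' or '1|1'."""
    sep = "|" if "|" in gt else "/"
    flip = {"0": "1", "1": "0"}
    # Missing alleles ('.') are left untouched.
    return sep.join(flip.get(allele, allele) for allele in gt.split(sep))


# The example from the question: 16:23625463 A>T on GRCh37, reference T on GRCh38.
print(classify_lifted_variant("A", "T", "T"))
print(swap_genotype("0/1"), swap_genotype("1/1"))
```

Under these assumptions a heterozygous call stays heterozygous with REF and ALT exchanged (T>A), while a homozygous-alternate call becomes homozygous reference on GRCh38, so it is no longer a variant at all.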

grch37 variants liftover grch38 • 947 views
13 months ago

Of course, new coordinates can be obtained using liftover

Use GATK (the LiftoverVcf tool): in addition to the lifted-over VCF, it produces a second VCF with the variants that cannot be lifted over.
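As a rough sketch of such a run, the Python snippet below only assembles the command line; it does not execute anything. All file names are placeholders, `build_liftover_cmd` is a hypothetical helper, and the option names are given as I recall them from the GATK4 LiftoverVcf documentation, so please confirm them with `gatk LiftoverVcf --help` for your version.

```python
import shlex


def build_liftover_cmd(input_vcf, output_vcf, reject_vcf, chain, target_fasta,
                       recover_swapped=False):
    """Build (but do not run) a GATK4 LiftoverVcf command as an argument list."""
    cmd = [
        "gatk", "LiftoverVcf",
        "-I", input_vcf,          # VCF on the source assembly (GRCh37)
        "-O", output_vcf,         # lifted-over VCF (GRCh38)
        "--CHAIN", chain,         # chain file describing GRCh37 -> GRCh38
        "--REJECT", reject_vcf,   # variants that could not be lifted over
        "-R", target_fasta,       # FASTA of the target assembly (GRCh38)
    ]
    if recover_swapped:
        # Optional: try to rescue SNVs whose REF and ALT are swapped
        # on the target assembly instead of rejecting them.
        cmd += ["--RECOVER_SWAPPED_REF_ALT", "true"]
    return cmd


# Placeholder file names, for illustration only.
cmd = build_liftover_cmd(
    "sample.grch37.vcf", "sample.grch38.vcf", "rejected.vcf",
    "grch37_to_grch38.chain", "grch38.fa",
)
print(shlex.join(cmd))
```

Without the swap-recovery option, a site like the one in the question would be expected to end up in the reject file; with it, the tool attempts the REF/ALT exchange itself, so it is worth reading the tool's notes on that option before relying on the result.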

