How to choose a LiftOver chain file
6 months ago
ttom ▴ 220

I am trying to lift over an hg38 whole-genome sequencing VCF to hg19, and I plan to use GATK/Picard for this. However, I am not sure which liftover chain file to use from this path:

hg38tohg19 picard LiftOver • 940 views
6 months ago
ATpoint 82k

hg38ToHg19.over.chain.gz, which, as the name indicates, lifts hg38 to hg19.

The file names reflect the assembly conversion data contained within, in the format <db1>To<Db2>.over.chain.gz. For example, a file named hg15ToHg16.over.chain.gz contains the liftOver data needed to convert hg15 (Human Build 33) coordinates to hg16 (Human Build 34).
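To make the naming rule concrete, here is a minimal sketch that builds a file name from two assembly names following the <db1>To<Db2>.over.chain.gz pattern quoted above. The function name chain_file_name is hypothetical and only for illustration; the sketch does not check that such a file actually exists on the UCSC server.

```shell
#!/usr/bin/env bash
# Build a UCSC-style chain file name: <db1>To<Db2>.over.chain.gz
# chain_file_name is a hypothetical helper, not part of any UCSC tool.
chain_file_name() {
    local from="$1" to="$2"
    local first rest
    # Upper-case only the first character of the target assembly (hg19 -> Hg19).
    first="$(printf '%s' "${to:0:1}" | tr '[:lower:]' '[:upper:]')"
    rest="${to:1}"
    printf '%sTo%s%s.over.chain.gz\n' "$from" "$first" "$rest"
}

chain_file_name hg38 hg19   # the file asked about in this thread
chain_file_name hg15 hg16   # the example from the quoted description
```

The first argument is the assembly you are converting from and the second is the assembly you are converting to, so for an hg38 VCF going to hg19 the order is hg38 then hg19.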


You can use the UCSC liftover chain file:

wget http://hgdownload.cse.ucsc.edu/goldenpath/hg38/liftOver/hg38ToHg19.over.chain.gz

Or the Ensembl chain file:

wget http://ftp.ensembl.org/pub/assembly_mapping/homo_sapiens/GRCh38_to_GRCh37.chain.gz

But note that they are not equivalent: the UCSC chain file usually covers more bases, while the Ensembl chain file includes chains for contigs both with and without the chr prefix.

Also note that you cannot use either chain file with Picard/LiftoverVcf to lift over to GRCh37 contigs that lack the chr prefix; for that you would need something like hg38ToB37.over.chain.gz instead. If you want to avoid worrying about contig names, you can use BCFtools/liftover available here (binaries available here), which will work seamlessly with any chain file regardless of the contig naming format.
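If you are unsure which contig naming a given chain file uses, one way to check is to list the contig names in its header lines. This is a rough sketch, assuming the standard chain format in which each header line starts with "chain" and carries the source contig in field 3 and the target contig in field 8; the function name chain_contigs is hypothetical.

```shell
#!/usr/bin/env bash
# List the distinct (source, target) contig name pairs in a chain file read from stdin.
# Assumed header layout:
#   chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id
# chain_contigs is a hypothetical helper for illustration only.
chain_contigs() {
    awk '$1 == "chain" { print $3 "\t" $8 }' | sort -u
}

# Usage, assuming the chain file is already in the working directory:
# gzip -dc hg38ToHg19.over.chain.gz | chain_contigs | head
```

If the second column shows names such as chr1 while your target reference uses 1, the contig names do not match and Picard/LiftoverVcf will not be able to place the lifted records on that reference.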


Good edit @zx8754, thanks!
