Deleted: Losing 15% of SNPs during liftover with "Picard LiftoverVcf" or "CrossMap"
21 months ago
arctic ▴ 40

I am new to the field and am trying to lift over my GWAS genotyping data from hg38 to hg19. When I run Picard LiftoverVcf or CrossMap on my SNPs prior to QC, I lose ~15% of my total SNPs during liftover, mainly due to "mismatching reference alleles". The loss rate goes higher after basic QC of the SNPs (listed below). Any advice on whether the observed liftover rate is acceptable (and, if not, where would be a good place to start troubleshooting) is very much appreciated. Further details are below; if more information is needed, please let me know. Many thanks in advance for your time and advice.


Genotyping Platform:

Infinium Global Screening Array to obtain around 600K human variants.

% Variants lost during liftover

When I run Picard LiftoverVcf on the VCF of our data without QC, I lose ~16% of my SNPs:
- 13% are "variants lifted over but had mismatching reference alleles after lift over."
- 3% are "variants failed to liftover"

Liftover rate after basic QC:

If I apply liftover after some basic QC (listed below), the failure rate remains high, again predominantly due to "mismatching reference":
1. After a missingness filter of 0.02 for SNPs and samples: 15% of ~600K variants lost
2. After a MAF (0.05) and autosomal SNP filter: 27% of ~250K variants lost

The options used for Picard:


java -jar ./picard.jar CreateSequenceDictionary REFERENCE=./hg19.fa OUTPUT=./hg19.dict


java -jar ./picard.jar LiftoverVcf -I ./Myhg38.vcf -O ./Myhg19.vcf -CHAIN ./hg38ToHg19.over.chain.gz -REJECT ./rejected.vcf -R ./hg19.fa
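
To see how the rejected variants split by reason, something like the following might be a starting point. It is only a minimal sketch, assuming the file written by -REJECT is a plain VCF whose FILTER column (column 7) holds the rejection reason. The function names `count_reject_reasons` and `summarize` are hypothetical helpers, not part of Picard or CrossMap, and no particular reason labels are assumed.

```python
import gzip
from collections import Counter


def count_reject_reasons(lines):
    """Count FILTER values over VCF data lines; header lines are skipped."""
    reasons = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # blank line or VCF header
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # malformed record without a FILTER column
        # FILTER is the 7th column; several filters are ';'-separated
        for reason in fields[6].split(";"):
            reasons[reason] += 1
    return reasons


def summarize(path):
    """Print each reason with its count and share of all filter tags."""
    # Assumption: the file is either plain text or gzip-compressed (.gz)
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        counts = count_reject_reasons(handle)
    total = sum(counts.values())
    for reason, n in counts.most_common():
        print(f"{reason}\t{n}\t{100 * n / total:.1f}%")
```

Calling `summarize("./rejected.vcf")` would then print one line per rejection reason, which should show whether the mismatching-reference category really accounts for most of the loss.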
picard liftover CrossMap crossmap gwas • 993 views
This thread is not open. No new answers may be added