Question: LiftOver a VCF file
18 months ago, Susmita Mandal (Bangalore) wrote:

Hey all,

I have been trying to lift over a particular VCF file from GRCm38 to NCBIM37. I have used the UCSC LiftOver tool, the Ensembl API, CrossMap and Picard. None of them lifts the file over completely: either they do not work at all or they reject variants. In Picard LiftoverVcf in particular, the rejected variants are those that have NoTarget in them, and I have no idea why. The reference FASTA file I am using is Mus_musculus.NCBIM37.61.dna.toplevel.fa, and the liftover chain file is GRCm38_to_NCBIM37.chain.gz.

The vcf file is from:

ftp://ftp-mouse.sanger.ac.uk/current_snps/strain_specific_vcfs/129S1_SvImJ.mgp.v5.snps.dbSNP142.vcf

Any leads will be helpful. Thanks in advance.

Best,

Susmita

Tags: picard, crossmap, liftovervcf, vcf
modified 18 months ago by RamRS • written 18 months ago by Susmita Mandal

"Especially in Picard LiftoverVcf, the rejected variants are those that have NoTarget in them"

What is 'NoTarget'?

written 18 months ago by Pierre Lindenbaum

I have no idea. It's something like this:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 129S1_SvImJ
1 3000023 . C A 109 NoTarget CSQ=A||||intergenic_variant||||||||;DP=6;DP4=0,0,6,0 GT:GQ:DP:MQ0F:GP:PL:AN:MQ:DV:DP4:SP:SGB:PV4:FI 1/1:22:6:0.166667:152,22,0:137,18,0:2:36:6:0,0,6,0:0:-0.616816:.:1

modified 18 months ago • written 18 months ago by Susmita Mandal

It's the FILTER column, and it should be defined in the VCF header.
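
To see which FILTER values a reject file actually uses, and whether each one is declared in the header, a few lines of Python are enough. This is only a minimal sketch and not part of Picard: `summarize_filters` is a hypothetical helper, the header `Description` text is a placeholder rather than Picard's actual wording, and the record is the one quoted above with its INFO and genotype fields shortened.

```python
from collections import Counter

def summarize_filters(vcf_lines):
    """Return (FILTER IDs declared in the header, Counter of FILTER values used)."""
    declared, used = set(), Counter()
    prefix = "##FILTER=<ID="
    for line in vcf_lines:
        if line.startswith(prefix):
            # The ID is everything between the prefix and the first comma.
            declared.add(line[len(prefix):].split(",", 1)[0].rstrip(">\n"))
        elif not line.startswith("#") and line.strip():
            # FILTER is the 7th tab-separated column; multiple values are ';'-joined.
            used.update(line.rstrip("\n").split("\t")[6].split(";"))
    return declared, used

# Toy input: placeholder header text plus the shortened record from this thread.
toy_vcf = [
    "##fileformat=VCFv4.2",
    '##FILTER=<ID=NoTarget,Description="placeholder description">',
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "129S1_SvImJ"]),
    "\t".join(["1", "3000023", ".", "C", "A", "109", "NoTarget",
               "DP=6", "GT", "1/1"]),
]

declared, used = summarize_filters(toy_vcf)
undeclared = set(used) - declared - {"PASS", "."}
print(declared, dict(used), undeclared)
```

For a real reject file you would pass an open file handle instead of `toy_vcf` (for a gzipped file, `gzip.open(path, "rt")`).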

written 18 months ago by Pierre Lindenbaum

I just want the VCF to be lifted over!

written 18 months ago by Susmita Mandal

Searching online would help (keywords in Google: picard liftover notarget): https://github.com/broadinstitute/picard/blob/master/src/main/java/picard/vcf/LiftoverVcf.java

    /**
     * Filter name to use when a target cannot be lifted over.
     */
    public static final String FILTER_NO_TARGET = "NoTarget";

modified 18 months ago • written 18 months ago by cpad0112

NoTarget is not the main issue. The issue is why the VCF is not getting lifted over completely. Is there any tool that can do it?

written 18 months ago by Susmita Mandal

The VCF from the link posted in the OP is huge; even the gzipped VCF is ~200 MB (on http://crispor.tefor.net/genomes/mm10/orig/). It would help if you could post, with headers, example records that are not lifted between the builds. In general, there are always discrepancies between builds: some records get merged and some get dropped. However, this percentage is small between consecutive builds.
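
If it helps, here is a rough Python sketch for pulling a postable excerpt (full header plus a handful of records) out of a reject file, and for putting a number on the fraction that was dropped. `excerpt` and `percent_rejected` are hypothetical helper names, and the toy lines and counts below are made up for illustration.

```python
def excerpt(vcf_lines, n_records=5):
    """Return every header line plus the first n_records data lines."""
    header, records = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            header.append(line)
            continue
        if len(records) >= n_records:
            break  # in a VCF all header lines come before the records
        if line:
            records.append(line)
    return header + records

def percent_rejected(n_lifted, n_rejected):
    """Share of input records that ended up in the reject file, in percent."""
    total = n_lifted + n_rejected
    return 100.0 * n_rejected / total if total else 0.0

# Made-up toy reject file: two header lines and three records.
toy_rejects = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\t.\tC\tA\t50\tNoTarget\t.\n",
    "1\t2000\t.\tG\tT\t50\tNoTarget\t.\n",
    "1\t3000\t.\tT\tC\t50\tNoTarget\t.\n",
]

sample = excerpt(toy_rejects, n_records=2)
print("\n".join(sample))
print(percent_rejected(n_lifted=990, n_rejected=10))
```

The record counts would come from counting the non-header lines of the lifted and rejected output files.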

modified 18 months ago • written 18 months ago by cpad0112

I don't think I can copy that many lines here.

written 18 months ago by Susmita Mandal
18 months ago, Emily_Ensembl (EMBL-EBI) wrote:

Genome assemblies do not 100% map to one another. Newer assemblies will have novel regions that were not found in the older assemblies, and older assemblies will have incorrectly assembled regions that cannot easily be mapped across to the correctly assembled regions. If the variants were called on loci in GRCm38 that did not have coverage in NCBIM37, then there will be no mapping.
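
To make the "no mapping" case concrete, here is a simplified Python sketch that reads UCSC-style chain lines and checks whether a position on the source assembly falls inside any ungapped aligned block. It is an illustration only: the function names are hypothetical, the toy chain is made up, and real liftover tools apply further rules (for example minimum-match thresholds and strand handling on the destination side) that are ignored here.

```python
from collections import defaultdict

def aligned_source_blocks(chain_lines):
    """Map source chromosome -> list of 0-based half-open aligned blocks."""
    blocks = defaultdict(list)
    chrom, cursor = None, 0
    for line in chain_lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "chain":
            # Header: chain score tName tSize tStrand tStart tEnd qName ...
            chrom, cursor = fields[2], int(fields[5])
        elif len(fields) == 3:
            # Data line: block size, gap on source (dt), gap on destination (dq).
            size, dt = int(fields[0]), int(fields[1])
            blocks[chrom].append((cursor, cursor + size))
            cursor += size + dt
        elif len(fields) == 1:
            # Last line of a chain: final block size only.
            blocks[chrom].append((cursor, cursor + int(fields[0])))
    return blocks

def has_mapping(blocks, chrom, pos):
    """True if the 1-based VCF position lies inside an aligned block."""
    zero_based = pos - 1
    return any(start <= zero_based < end for start, end in blocks.get(chrom, []))

# Made-up toy chain: source 1:100-400 aligned in two blocks with a 50 bp source gap.
TOY_CHAIN = [
    "chain 1000 1 5000 + 100 400 1 4800 + 50 300 1",
    "100 50 0",
    "150",
]

blocks = aligned_source_blocks(TOY_CHAIN)
print(has_mapping(blocks, "1", 150))   # inside the first block
print(has_mapping(blocks, "1", 225))   # inside the gap between blocks
print(has_mapping(blocks, "1", 4500))  # outside the chain altogether
```

A variant whose position falls in a gap, or outside every chain, has nowhere to go on the other assembly, which is the situation described above.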

Why do you want to map your VCFs back to an old assembly? Would it not be better to map your other data forward onto the newer assembly?

written 18 months ago by Emily_Ensembl

Or use VCFs relevant to that build/assembly.

written 18 months ago by cpad0112

I took your advice. Earlier I couldn't find VCFs of 129S1/Sv mapped onto mm9. I did find some files last night, though: a tab-delimited file with the columns #CHROM, POS, REF and 129S1/Sv, plus a .tbi file. I have no idea how to get a VCF file from these two. Any ideas?
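
One possible approach, sketched in Python under several assumptions: the .tbi file is just a tabix index for the table, so only the table itself is needed; the fourth column is assumed to hold a single-base allele for the strain; rows where that allele equals REF or is not A/C/G/T are skipped; and every kept site is written as homozygous alternate (1/1), on the assumption that the strain is inbred. None of this has been checked against the actual file, `table_to_vcf` is a hypothetical helper, and the toy rows are made up.

```python
BASES = {"A", "C", "G", "T"}

def table_to_vcf(table_lines, sample="129S1_Sv"):
    """Convert '#CHROM POS REF ALLELE' rows into minimal VCF lines."""
    out = [
        "##fileformat=VCFv4.2",
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
                   "INFO", "FORMAT", sample]),
    ]
    for line in table_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, ref, allele = line.rstrip("\n").split("\t")[:4]
        ref, allele = ref.upper(), allele.upper()
        # Assumption: keep only simple SNPs that differ from the reference.
        if ref not in BASES or allele not in BASES or allele == ref:
            continue
        # Assumption: the strain is homozygous for its allele.
        out.append("\t".join([chrom, pos, ".", ref, allele, ".", ".", ".",
                              "GT", "1/1"]))
    return out

# Made-up toy rows in the column layout described above.
toy_table = [
    "#CHROM\tPOS\tREF\t129S1/Sv",
    "1\t1000\tC\tA",   # differs from REF: kept
    "1\t2000\tG\tG",   # same as REF: skipped
    "1\t3000\tT\tN",   # not a plain base: skipped
]

vcf = table_to_vcf(toy_table)
print("\n".join(vcf))
```

If the fourth column turns out to use ambiguity codes or other symbols, the filtering step would need to be adapted before trusting the output.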

modified 18 months ago • written 18 months ago by Susmita Mandal

I am creating custom in silico parental genomes, and for that I need VCFs from both parents mapped onto the same reference genome.

written 18 months ago by Susmita Mandal