Phasing a VCF file with Beagle
3.1 years ago
sixthirty • 0

Hello all,

I tried to phase my VCF files with Beagle. All the genotypes seem to be phased, but the INFO field containing the annotations of interest disappeared, and I want to keep it. How can I do that? If Beagle cannot keep the INFO field, is there a way to merge my new phased VCF file with the old one in order to restore the INFO field?
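One possible way to do the merge is sketched below in Python. It assumes both files are uncompressed, tab-delimited VCFs and that a record in the phased file matches a record in the original when CHROM, POS, REF and ALT are identical. The function name `restore_info` is made up for this illustration. The sketch overwrites whatever INFO value Beagle wrote and copies over `##INFO` header lines that are missing from the phased file (compared as exact strings, which is a simplification).

```python
def restore_info(original_lines, phased_lines):
    """Return the phased VCF lines with INFO copied from the original VCF.

    Both arguments are iterables of text lines (for example open file handles).
    Records are matched on (CHROM, POS, REF, ALT); this key is an assumption.
    """
    info_by_site = {}
    info_headers = []
    for line in original_lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO="):
            info_headers.append(line)
        elif line and not line.startswith("#"):
            fields = line.split("\t")
            # VCF columns: 0=CHROM 1=POS 2=ID 3=REF 4=ALT 5=QUAL 6=FILTER 7=INFO
            info_by_site[(fields[0], fields[1], fields[3], fields[4])] = fields[7]

    merged = []
    seen_headers = set()
    for line in phased_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            seen_headers.add(line)
            merged.append(line)
        elif line.startswith("#CHROM"):
            # Add the original INFO definitions just before the column header.
            merged.extend(h for h in info_headers if h not in seen_headers)
            merged.append(line)
        elif line:
            fields = line.split("\t")
            key = (fields[0], fields[1], fields[3], fields[4])
            if key in info_by_site:
                fields[7] = info_by_site[key]  # phased genotypes are untouched
            merged.append("\t".join(fields))
    return merged


# Example usage (file names are placeholders):
# with open("out.vcf") as orig, open("phased_beagle.vcf") as phased:
#     with open("phased_with_info.vcf", "w") as out:
#         out.write("\n".join(restore_info(orig, phased)) + "\n")
```

Records that exist only in the phased file keep the INFO value Beagle gave them. For large or compressed files, `bcftools annotate` with `-a` pointing at the original file and `-c INFO` is the more usual tool for this kind of transfer.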

Here's the command line I used: java -Xmx2g -jar /home/Softs/beagle/beagle.27Apr20.b81.jar gt=out.vcf out=phased_beagle

Also, I didn't specify a reference panel or a genetic map. Are they necessary, and why? I don't find their manuals/instructions clear enough.

Thank you for your help!

vcf phasing beagle haplotype bash • 5.2k views

You should always specify a genetic map if you are working with human data, and you should usually use a reference panel as well. What's your sample size / species / number of SNPs?
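As a rough illustration of that advice, the sketch below assembles a Beagle command line that adds the `ref=` and `map=` arguments to the `gt=` and `out=` arguments from the original post. The helper name `beagle_command` and every file name in the usage example are placeholders, not files that ship with Beagle; the reference panel and map have to be downloaded separately and must match the genome build of the input.

```python
import shlex


def beagle_command(jar, gt, out, ref=None, genetic_map=None, memory="2g"):
    """Build the argument list for a Beagle run.

    `ref` and `genetic_map` are optional paths; when given they are passed
    to Beagle as ref=... and map=... respectively.
    """
    cmd = ["java", f"-Xmx{memory}", "-jar", jar, f"gt={gt}", f"out={out}"]
    if ref:
        cmd.append(f"ref={ref}")
    if genetic_map:
        cmd.append(f"map={genetic_map}")
    return cmd


# Example usage with placeholder file names for a single chromosome:
cmd = beagle_command(
    jar="/home/Softs/beagle/beagle.27Apr20.b81.jar",
    gt="out.chr20.vcf",
    out="phased_beagle.chr20",
    ref="reference_panel.chr20.vcf.gz",   # hypothetical reference panel file
    genetic_map="genetic_map.chr20.map",  # hypothetical PLINK-format map file
)
print(shlex.join(cmd))
# The list can be run with subprocess.run(cmd, check=True) once the files exist.
```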


Thank you for your answer. I am working on human data. My VCF file covers a family of 4 individuals, is based on the GRCh37 genome, and contains around 431,000 variants.


OK, it seems like you want to do trio phasing then (i.e. using the information from the mother and father to phase the offspring). Is that right? If so, Beagle doesn't do trio phasing; you would need to use something else, like WhatsHap, for that.


I tried to phase with WhatsHap, but only part of my heterozygous genotypes are phased. Also, I don't have the BAM files for all the families, so I am looking for a way to phase my VCF files without having to use BAM files.

Are you sure that trio phasing isn't possible with Beagle? This documentation for Beagle 3.3.2 seems to imply that it is (http://faculty.washington.edu/browning/beagle/beagle_3.3.2_31Oct11.pdf): "Beagle can perform haplotype phase inference and missing data imputation using data from unrelated individuals, parent-offspring trios, parent-offspring pairs, and phase-known haplotypes."


Yes, trio phasing will only work on some heterozygous positions, not all. You will need to do statistical phasing (i.e. using a reference panel) if you want to phase all variants. That will involve downloading a reference dataset like the 1000 Genomes.

Why are you trying to phase them? Is there something in particular you are aiming to do?
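To see why only some heterozygous positions can be resolved from the parents, here is a minimal Python sketch of phasing by Mendelian inheritance at a single site. It is an illustration of the principle, not how WhatsHap or any other tool is implemented. The function name `phase_child` and the "maternal|paternal" output order are conventions chosen for this example only.

```python
def phase_child(child, mother, father):
    """Phase one child genotype from the parents' genotypes at the same site.

    Genotypes are pairs of allele indices, e.g. (0, 1) for a 0/1 call.
    Returns "maternal|paternal" when exactly one assignment is consistent
    with Mendelian inheritance, otherwise None.
    """
    consistent = set()
    # Try both ways of assigning the child's two alleles to the two parents.
    for from_mother, from_father in ((child[0], child[1]), (child[1], child[0])):
        if from_mother in mother and from_father in father:
            consistent.add((from_mother, from_father))
    if len(consistent) == 1:
        from_mother, from_father = consistent.pop()
        return f"{from_mother}|{from_father}"
    return None  # ambiguous, or inconsistent with the parents


print(phase_child((0, 1), (0, 0), (0, 1)))  # mother can only give 0
print(phase_child((0, 1), (0, 1), (0, 1)))  # everyone heterozygous
print(phase_child((0, 1), (0, 0), (0, 0)))  # allele absent from both parents
```

The second call returns None because either parent could have contributed either allele, which is the case that read-based or statistical phasing has to resolve. The third call also returns None: an allele seen in neither parent, as with a de novo variant, cannot be assigned to a parental haplotype by this rule alone.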


Hi, I'm working on the same project. What we're trying to achieve is to link de novo variants with inherited ones. For that we need haplotypes, so that we know which parental haplotype carries each de novo variant; as we understand it, phasing is necessary. Because of some network issues, we're having a hard time getting the BAM files for the VCFs we have, but every phasing method we have found requires those files. Can you think of any other way? Thank you


Hi, I'm doing the same thing as you: calling de novo mutations and finding their parental origin by phasing. Have you found a good way to do it?
