VCF file from PHG for loading into BreedBase
8 months ago
clb343 • 0

How do I generate a VCF file from the Practical Haplotype Graph that includes the haplotypes? Isaak has a VCF file for Cassava that was generated by the Buckler Lab, but he doesn't have the code that created the file.

PracticalHaplotypeGraph PHG • 463 views
8 months ago
lcj34 ▴ 420

I think you're looking for the PathsToVCFHaplotypesPlugin. It exports diploid or haploid paths to a VCF file whose allele values are haplotypes (not SNPs).

The VCF file is created by first calling HaplotypeGraphBuilderPlugin to create a graph that includes haplotypes based on the user-specified methods. This graph is passed, along with a PATH method name and an optional list of taxa, to the ImportDiploidPathPlugin, which returns the graph along with a map of haplotype paths. Finally, the output of the ImportDiploidPathPlugin is sent as input to the PathsToVCFHaplotypesPlugin.

Note that running the ImportDiploidPathPlugin is optional. If this step isn't run, the paths are created from the haplotypes in the graph.

When running the PathsToVCFPlugin or PathsToVCFHaplotypesPlugin, we recommend using a positions list to limit the number of entries in the output VCF file to something manageable. The positions list can be specified as a genotype file (e.g. VCF or Hapmap), a bed file, or a json file containing the requested positions.
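As an illustration of the bed-file option, here is a minimal sketch of writing a small positions file by hand. The file name positions.bed, the chromosome names, and the coordinates are all made up for the example. It assumes the standard BED convention of tab-separated chromosome, 0-based start, and exclusive end, so check how your PHG version interprets the intervals before relying on it.

```shell
# Hypothetical positions file: the name, chromosomes, and coordinates are placeholders.
# Standard BED columns: chrom, 0-based start, exclusive end (tab-separated).
# printf reuses the format string for each group of three arguments.
printf '%s\t%s\t%s\n' \
    chr1 999 1000 \
    chr1 4999 5000 \
    chr2 249 250 > positions.bed

# Quick sanity check: count the requested positions.
wc -l < positions.bed
```

The resulting file would then be the value given to -positions.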

An example of chaining these plugins together to get a VCF with haplotypes is below. Replace the placeholders surrounded by < > with your own values. These plugins have other optional parameters, but this is a basic command:

 docker run --name pipeline_container --rm -v <baseDir>:/phg/ -t <dockerImageName>  \
       /tassel-5-standalone/run_pipeline.pl -Xmx200G -debug -configParameters <configFile> \
       -HaplotypeGraphBuilderPlugin -configFile <configFile> -methods <haplotypeMethod1> \
             -includeVariantContexts true -includeSequences false  -taxa <taxa1, taxa2> -endPlugin \
       -ImportDiploidPathPlugin -pathMethodName <pathMethod1> -endPlugin \
       -PathsToVCFHaplotypesPlugin -outputFile <vcfOutputFile> -referenceFasta <referenceFasta.fa> \
             -positions <positions> -endPlugin
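Since the ImportDiploidPathPlugin step is optional, the shorter chain might look like the sketch below. It only assembles and prints the pipeline arguments for review (a dry run); nothing is executed. All values are hypothetical placeholders, and the comma-separated form of the taxa list is an assumption. The flags themselves are the ones from the command above with the ImportDiploidPathPlugin step left out.

```shell
# Dry-run sketch: build the plugin chain without the path-import step and print it.
# Every value below is a hypothetical placeholder; substitute your own.
config_file="config.txt"
haplotype_method="haplotypeMethod1"
taxa="taxa1,taxa2"            # assumption: comma-separated taxa list
vcf_out="haplotypes.vcf"
ref_fasta="referenceFasta.fa"
positions="positions.bed"

# Same flags as the full example, minus the path-import plugin.
cmd="/tassel-5-standalone/run_pipeline.pl -Xmx200G -debug -configParameters $config_file \
-HaplotypeGraphBuilderPlugin -configFile $config_file -methods $haplotype_method \
-includeVariantContexts true -includeSequences false -taxa $taxa -endPlugin \
-PathsToVCFHaplotypesPlugin -outputFile $vcf_out -referenceFasta $ref_fasta \
-positions $positions -endPlugin"

# Print for review; to run it, append this to the docker run prefix shown above.
echo "$cmd"
```

With this chain the paths come from the haplotypes in the graph rather than from a stored path method, so the output can differ from the full command.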

Thanks, I have the PathsToVCFHaplotypesPlugin working. Now I have to decide whether Breedbase is the best way to display that data. It might be better to represent the data in JBrowse or an R Shiny app.
