Question: Getting Haplotypes from a VCF file
Asked 4 days ago by gkuffel2230 (United States):

Hi everyone,

I am analyzing a data set for a user interested in looking at variants in the MHC gene of coyotes. I have used Freebayes to generate a VCF file. The user would now like haplotypes. Does anyone know how to decipher that information from a file such as this VCF?

https://docs.google.com/spreadsheets/d/1rH-6Q8559C70QeINoOOwCj8SCRA2U-AYr9NSb-Qefs4/edit?usp=sharing

freebayes haplotypes vcf • 77 views
written 4 days ago by gkuffel2230

Ummm, why did you give us a spreadsheet of a VCF file?

written 3 days ago by Ram13k

Ummm, because any VCF file can be viewed as a csv or xls file. I provided this format to allow for easy viewing. If it would help anyone to have the actual VCF file I can easily provide that upon request. I wanted to know if I can get haplotypes from the data presented in the spreadsheet view of the VCF file.

I believe I found my own answer. I need to re-run FreeBayes and set the -E / --max-complex-gap parameter. Otherwise FreeBayes only calls haplotypes across variants that lie within 3 bp of each other (the default). I am trying to call haplotypes over a 250 bp amplicon.
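
For reference, a re-run along those lines might be assembled as in the sketch below. It only builds the command line and does not run anything. The file names and the gap value of 250 are placeholders, not values from this thread, and the function name is made up for illustration. The --fasta-reference, --max-complex-gap and --vcf options come from the freebayes usage text, so it is worth checking `freebayes --help` for the installed version.

```python
import shlex


def build_freebayes_command(reference, bam_files, max_complex_gap, out_vcf):
    """Assemble a freebayes command line as an argument list.

    Every argument is a placeholder supplied by the caller; nothing here
    is taken from the data set discussed in the thread.
    """
    cmd = [
        "freebayes",
        "--fasta-reference", reference,             # reference FASTA (-f)
        "--max-complex-gap", str(max_complex_gap),  # haplotype window (-E)
        "--vcf", out_vcf,                           # output VCF file (-v)
    ]
    cmd.extend(bam_files)  # one or more BAM files go last
    return cmd


if __name__ == "__main__":
    cmd = build_freebayes_command(
        "reference.fa", ["sample1.bam", "sample2.bam"], 250, "haplotypes.vcf"
    )
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) to launch the run. Whether a 250 bp window actually gives one haplotype allele per amplicon presumably also depends on read length and on how the variants are spaced.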

written 3 days ago by gkuffel2230

My point was that people here know what a VCF file looks like. You could have provided a code block or a gist if you wished to show an excerpt. Why use a spreadsheet?

Also, "any VCF can be viewed as a spreadsheet": Nope, don't do that. Excel is not meant to handle that data volume or that kind of data.

modified 3 days ago • written 3 days ago by Ram13k

So I should paste it like this next time?

#CHROM  POS     ID  REF     ALT     QUAL    FILTER  INFO    FORMAT  10  110_S24_L001    111_S25_L001    115_S26_L001    118_S27_L001    11_S5_L001  120_S28_L001    121_S29_L001    123_S30_L001    124_S31_L001    12$
gi|1209890|gb|U47338.1|CFMHCDRB02   7   .   C   G   0   .   AB=0.223249;ABP=867.188;AC=63;AF=0.203226;AN=310;AO=386;CIGAR=1X;DP=15213;DPB=15213;DPRA=3.99182;EPP=815.344;EPPR=26084.5;GTI=16;LEN=1;MEA$
gi|1209890|gb|U47338.1|CFMHCDRB02   28  .   TTGGA   GTGAG,GTGAA     9.25207e+06     .   AB=0.415709,0.037858;ABP=75093.2,2.2572e+06;AC=51,4;AF=0.132812,0.0104167;AN=384;AO=530334,47952;CIGAR=1X2M2X,1X2M1X1M;DP=$
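
In a record like the second one above, the multi-base REF and ALT alleles (TTGGA versus GTGAG and GTGAA) are already short local haplotypes, and each sample's GT field says which of them that sample carries. Below is a minimal sketch of that lookup, assuming the standard VCF convention that GT index 0 is REF and 1, 2, ... are the ALT alleles in order. The function name is hypothetical and the genotype strings in the example are made up, because the sample columns in the excerpt are truncated.

```python
import re


def alleles_for_genotype(ref, alt_field, gt):
    """Translate a VCF GT string into the allele sequences it refers to.

    Returns (sequences, phased). A missing call (".") becomes None.
    """
    alleles = [ref] + alt_field.split(",")  # index 0 = REF, 1.. = ALT
    phased = "|" in gt                      # "|" phased, "/" unphased
    sequences = []
    for index in re.split(r"[/|]", gt):
        sequences.append(None if index == "." else alleles[int(index)])
    return sequences, phased


if __name__ == "__main__":
    # REF and ALT are from the record at position 28; the GT values are invented.
    for gt in ("0/1", "1/2", "./."):
        print(gt, alleles_for_genotype("TTGGA", "GTGAG,GTGAA", gt))
```

An unphased genotype ("/") only says which alleles are present at this one record. It does not say how alleles from different records sit together on the same chromosome, which is why the size of the haplotype window matters.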
modified 3 days ago by Ram13k • written 3 days ago by gkuffel2230

Do you have any information on my actual question about haplotypes?

written 3 days ago by gkuffel2230

Unfortunately, no. Maybe if you could explain the purpose behind this exercise, we could get some clarity.

written 3 days ago by Ram13k

Almost. I've corrected it so it's better. You can use the formatting bar (especially the code option) to present your post better.

written 3 days ago by Ram13k