Question: freebayes haplotype calling (phased variants)
0
gravatar for mark.rose
2.2 years ago by
mark.rose10
United States
mark.rose10 wrote:

Hi All

I often use freebayes for variant calling and am pleased with the results. Now I want to use it to call haplotypes. The nature of my project is that I wish to phase variants in 400-450bp amplicons. These amplicons are generated from a mixed target source such that different targets within it will generate amplicons of slightly different sequence though different targets may share one or more specific variants. My plan was to subject these amplicons to 2X300 bp Illumina sequencing, merge read 1 and read 2, then align to a reference. The freebayes documentation indicates that the “-E” flag is required to call phased variants, and so I’ve included “-E 400” in my command line. I then tested this against simulated reads I made that contain a 2bp and 3bp deletion 11bp apart. While both deletions are called by freebayes I see nothing that indicates they are phased. I had read online that I might expect these two variants to be depicted by freebayes as a single "complex" variant bound by the 2 deletions. I do not. Moreover, in the header I see the following: "##phasing=none". Am I mistaken about how freebayes reports phased variants or is there something more to the proper command line that I am missing?

Thanks

Mark

Example:

freebayes version: v1.0.2-14-g9e98366

Code:

$ freebayes -F 0.1 -E 400 -f target.ref.fa -v outE400F0.10.new  out.ref.sort.bam

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  unknown
target  186     .       TAGAGGCTGCGTACGGGTAACACGTCGTAAGGAGCGATGACGCGGGCACTGCTCCGGCGTCGTCGA  TAGAGGCTGCGTACGGGTAACACGTCGTAAGGAGCGATGACGCGGGCACTGCTCCGGCGTCGTA      2403.38 .       AB=0;ABP=0;AC=2;AF=1;AN=2;AO=280;CIGAR=63M2D1M;DP=993;DPB=962.909;DPRA=0;EPP=4.53033;EPPR=0;GTI=0;LEN=2;MEANALT=452;MQM=17;MQMR=0;NS=1;NUMALT=1;ODDS=388.578;PAIRED=0;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=4320;QR=0;RO=0;RPL=280;RPP=611.023;RPPR=0;RPR=0;RUN=1;SAF=147;SAP=4.53033;SAR=133;SRF=0;SRP=0;SRR=0;TYPE=del     GT:DP:DPR:RO:QR:AO:QA:GL        1/1:993:993,280:0:0:280:4320:-327.573,-84.2884,0

target  261     .       TGAGGTGGAGGCGA  TGAGGTGGAGA     7515.8  .       AB=0;ABP=0;AC=2;AF=1;AN=2;AO=787;CIGAR=1M3D10M;DP=981;DPB=775.071;DPRA=0;EPP=5.02174;EPPR=0;GTI=0;LEN=3;MEANALT=37;MQM=17;MQMR=0;NS=1;NUMALT=1;ODDS=1091.43;PAIRED=0;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=13337;QR=0;RO=0;RPL=787;RPP=1711.96;RPPR=0;RPR=0;RUN=1;SAF=380;SAP=5.02174;SAR=407;SRF=0;SRP=0;SRR=0;TYPE=del     GT:DP:DPR:RO:QR:AO:QA:GL      1/1:981:981,787:0:0:787:13337:-991.985,-236.911,0
ADD COMMENTlink modified 3 months ago by gkuffel2250 • written 2.2 years ago by mark.rose10
0
gravatar for gkuffel22
3 months ago by
gkuffel2250
United States
gkuffel2250 wrote:

Did you ever figure this out? I also am trying to get haploytpes from 450 bp amplicons.

ADD COMMENTlink written 3 months ago by gkuffel2250

As I recall I was able to get some haplotype calling but it was erratic and failed to accurately represent simulated data. Your objective seems much like mine was. What I wound up doing was to write a pipeline which would analyze read pairs from amplicons individually and then identify sets of variants that occur in phase on these individual amplicons above a set minimum frequency threshold, thus eliminating PCR/sequencing errors present in individual reads. This was more reliable.

ADD REPLYlink written 3 months ago by mark.rose10

Thank you for the reply. I have 250 bp paired end reads and my user (I work for a core) is interested in haplotypes that occurr in the MHC gene in coyotes. Would you be willing to share your script or provide any additional insight?

ADD REPLYlink written 3 months ago by gkuffel2250

As I work for a private company, I cannot share my code with you. Sorry. If you can code or have access to someone who can, I could describe it to you a bit more and you could attempt to create your own version.

ADD REPLYlink written 3 months ago by mark.rose10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1150 users visited in the last hour