I am analyzing ddRADseq data from a diploid organism using Freebayes. In my case, fragments of 450-550 bp were sequenced with 300 bp paired-end reads. My goal is to determine the two haplotypes of each sample.
I tried to follow the method below (from Biostars, provided by Erik), but I didn't quite understand what "--max-complex-gap" does. It seems that "--max-complex-gap" is limited by the read length; does that mean I can set it as high as 300 in my case? I was also wondering what "--haplotype-length" does.
Since ddRADseq does not cover the whole genome, I also wonder how Freebayes calls haplotypes across the whole genome or across different chromosomes.
Hope someone could help me out. I would really appreciate it. Thank you in advance!
freebayes --max-complex-gap 0 --ploidy 2 input.bam | vcffilter -f "QUAL > 20" >high_confidence.vcf
bgzip high_confidence.vcf
tabix -p vcf high_confidence.vcf.gz
freebayes --max-complex-gap 200 --ploidy 2 --haplotype-basis-alleles high_confidence.vcf.gz input.bam >haplotypes.vcf