Best pipeline for calling SNPs in a bi-parental haploid population?
13 days ago

Hi,

I have 150 bp paired-end Illumina reads for 120 haploid isolates derived from a bi-parental cross. The genomes of the two parents are ~50 Mb each, and I have, on average, >100x coverage for each of the 120 progeny isolates. I want to use these data to identify SNPs in the population, which will then be used to build genetic maps and perform QTL analysis.

I want to know if there are any best practices for calling SNPs from a biparental population, or if there are any bioinformatic pipelines that would be ideal for this. I have already tried using bcftools mpileup and bcftools call:

bcftools mpileup -Ou -f genomic.fasta *.bam | bcftools call -mv -Ob --ploidy 1 --threads 4 -o calls1.bcf

and while I was able to generate and map markers, I'm not sure how much confidence I should have in the data. Even after filtering on metrics such as depth, quality, DP4, MQ, and SCBZ, I still have many calls that imply double-recombination events in single progeny, or double-recombination events in multiple progeny that only favor the reference allele. I can reduce these by applying increasingly stringent filters, but I fear that I am eliminating "good" markers in the process. Also, many of the metrics in my VCF files that I could filter on appear to assume a natural population, and are probably geared more towards GWAS.
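To make the double-recombination pattern concrete, here is a minimal sketch of how such calls could be flagged. It rests on several assumptions: genotypes have already been extracted per isolate, in physical marker order along one chromosome, and recoded as 'A' (reference-parent allele), 'B' (other parent) or '-' (missing). The function names `find_singletons` and `singleton_counts` are hypothetical, and this is an illustration of the pattern rather than the exact procedure used on this data set.

```python
from collections import Counter


def find_singletons(genotypes):
    """Return indices of calls that differ from both flanking calls.

    `genotypes` holds one isolate's calls in marker order on one chromosome,
    coded 'A', 'B', or '-' (missing). Missing calls are skipped, so the
    flanking calls are the nearest non-missing neighbours.
    """
    called = [(i, g) for i, g in enumerate(genotypes) if g != "-"]
    hits = []
    for (_, left), (i, mid), (_, right) in zip(called, called[1:], called[2:]):
        # A single marker that switches parent and switches straight back
        # implies two crossovers around one marker.
        if left == right and mid != left:
            hits.append(i)
    return hits


def singleton_counts(progeny):
    """Count, per (marker index, allele), how many isolates show a singleton."""
    counts = Counter()
    for genotypes in progeny.values():
        for i in find_singletons(genotypes):
            counts[(i, genotypes[i])] += 1
    return counts


# Toy data (invented for illustration, not real calls).
progeny = {
    "iso1": "BBBABBBAAAA",
    "iso2": "BBBABBBBBBB",
    "iso3": "AAAAA-BAAAA",
}
counts = singleton_counts(progeny)
for (marker, allele), n in sorted(counts.items()):
    print(f"marker {marker}: {n} isolate(s) with an isolated '{allele}' call")
```

A marker where many isolates show an isolated call towards the same allele (here marker 3, always 'A') looks more like a systematic calling or mapping artifact than real recombination, whereas singletons scattered across markers and isolates look more like per-sample genotyping error.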

Any recommendations or advice?

SNP biparental illumina polymorphism bi-parental • 171 views