Question: Identification of homozygous SNP
2.5 years ago
olga_viktorovskaya

Dear Biostars Community,

I am trying to identify SNPs/indels in my WGS samples and have run into an issue (I am a newbie in bioinformatics). I sequenced haploid yeast genomes, trimmed the reads, aligned them to the reference genome, etc., and then used the samtools mpileup command to call SNPs/indels:

samtools mpileup -uf REFERENCE/S288C_genome.fsa OUTPUT/${file}_sorted.bam | bcftools view -bvcg - > OUTPUT/${file}_var.raw.bcf

and then wrote the resulting list of SNPs/indels to a file.

The problem is that this script gives me a list of 1500 SNPs/indels that includes variation in the reads caused by sequencing/amplification errors. How can I narrow the list down to ONLY those variants that reflect a real change in the haploid genome?

Happy to hear your suggestions!

snp genome • 882 views
Not an answer to your question, but using GATK HaplotypeCaller you can set the expected ploidy for variant calling, which might be more accurate.

WouterDeCoster
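
For illustration, the ploidy suggestion above might look roughly like the sketch below. It assumes GATK4 is installed as `gatk`, that the reference has .fai and .dict index files, and that the BAM carries read groups. The sample name `sample1`, the output path, and the helper `build_hc_cmd` are hypothetical and not taken from the original post. The helper only prints the command so it can be inspected before being run.

```shell
# Sketch only: print a GATK4 HaplotypeCaller command for a haploid sample.
# build_hc_cmd, sample1 and the output path are illustrative placeholders.
build_hc_cmd() {
    # $1 = reference FASTA, $2 = sorted BAM, $3 = output VCF
    # --sample-ploidy 1 tells the caller to expect a single copy of each chromosome
    printf 'gatk HaplotypeCaller -R %s -I %s -O %s --sample-ploidy 1\n' "$1" "$2" "$3"
}

cmd=$(build_hc_cmd REFERENCE/S288C_genome.fsa OUTPUT/sample1_sorted.bam OUTPUT/sample1_hc.vcf.gz)
echo "$cmd"   # inspect the command first, then run it by hand
```

With ploidy set to 1 the caller reports a single allele per site instead of diploid genotypes, so low-frequency read-level variation is less likely to be reported as a heterozygous call. The resulting calls would still need quality filtering.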
