Dear Biostars Community,
I am trying to identify SNPs/indels in my WGS samples and have run into an issue (I am a newcomer to bioinformatics). I sequenced haploid yeast genomes, trimmed the reads, aligned them to the reference genome, etc., and then used the samtools mpileup command to call SNPs/indels:
samtools mpileup -uf REFERENCE/S288C_genome.fsa OUTPUT/${file}_sorted.bam | bcftools view -bvcg - > OUTPUT/${file}_var.raw.bcf
I then write the resulting list of SNPs/indels to a file.
The problem is that this script gives me a list of 1500 SNPs/indels, which includes variation in the reads caused by sequencing/amplification errors. How can I narrow down the list to ONLY those variants that reflect a real change in the haploid genome?
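To make the question concrete, here is a minimal sketch of the kind of filter I imagine might help, though I do not know whether it is the right approach. It assumes the calls have been converted to text VCF and that each record carries the DP4 INFO tag written by samtools mpileup (counts of high-quality ref-forward, ref-reverse, alt-forward and alt-reverse bases). The function names and both thresholds (QUAL of at least 30, and at least 90% of reads supporting the alternate allele, since a haploid genome should not show mixed sites) are placeholders of mine, not recommended values.

```python
def parse_info(info):
    """Turn 'DP=30;DP4=0,1,14,15;INDEL' into a dict; bare flags map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def keep_variant(vcf_line, min_qual=30.0, min_alt_fraction=0.9):
    """Return True if a VCF data line passes both (assumed) thresholds."""
    cols = vcf_line.rstrip("\n").split("\t")
    if cols[5] == ".":          # QUAL missing
        return False
    qual = float(cols[5])
    info = parse_info(cols[7])
    if "DP4" not in info:       # cannot judge read support without DP4
        return False
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in info["DP4"].split(","))
    total = ref_fwd + ref_rev + alt_fwd + alt_rev
    if total == 0:
        return False
    # In a haploid sample, a real variant should be carried by nearly all reads.
    alt_fraction = (alt_fwd + alt_rev) / total
    return qual >= min_qual and alt_fraction >= min_alt_fraction


def filter_vcf(lines, **thresholds):
    """Yield header lines unchanged and only the data lines that pass."""
    for line in lines:
        if line.startswith("#") or keep_variant(line, **thresholds):
            yield line
```

The idea would be to pass the lines of the text VCF through filter_vcf and write out whatever it yields, then tune the two thresholds against variants I can verify.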
Happy to hear your suggestions!