Question: Identification of homozygous SNP
2.5 years ago, olga_viktorovskaya (0) wrote:

Dear Biostars Community,

I am trying to identify SNPs/indels in my WGS samples and have run into a problem (I am new to bioinformatics). I sequenced haploid yeast genomes, trimmed the reads, aligned them to the reference genome, and so on, and then used the samtools mpileup command to call SNPs/indels:

samtools mpileup -uf REFERENCE/S288C_genome.fsa OUTPUT/${file}_sorted.bam | bcftools view -bvcg - > OUTPUT/${file}_var.raw.bcf

and then wrote the resulting list of SNPs/indels to a file.

The problem is that this script gives me a list of 1500 SNPs/indels, which includes variation in the reads caused by sequencing/amplification errors. How can I narrow the list down to ONLY those SNPs that reflect a real change in the haploid genome?

Happy to hear your suggestions!
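
One common way to narrow a call list like the one described above is to keep only records that pass a quality cutoff and are called homozygous for the alternate allele, since a true change in a haploid genome should be supported by essentially all reads at that position. The sketch below is only an illustration, not a validated pipeline: the function name filter_hom_alt and the default cutoff of 30 are assumptions, and it expects uncompressed single-sample VCF text on standard input.

```shell
# Hypothetical helper: keep VCF header lines plus records that have
# QUAL >= min_qual and a homozygous-alternate genotype (1/1 or 1|1).
# Assumes uncompressed VCF text on stdin with one sample in column 10.
# The default cutoff of 30 is an arbitrary starting point, not a recommendation.
filter_hom_alt() {
    min_qual="${1:-30}"
    awk -F '\t' -v q="$min_qual" '
        /^#/ { print; next }        # pass header lines through unchanged
        {
            split($10, s, ":")      # GT is the first FORMAT subfield when present
            if ($6 != "." && $6 + 0 >= q && (s[1] == "1/1" || s[1] == "1|1"))
                print
        }'
}
```

For example, the BCF from the question could be converted to VCF text with bcftools view and piped through filter_hom_alt 30. The cutoff would need tuning against the coverage and error profile of the actual data, and depth-based filters may be worth adding on top.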

snp genome • 882 views

Not an answer to your question, but using GATK HaplotypeCaller you can set the expected ploidy for variant calling, which might be more accurate.
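
A minimal sketch of what that could look like with GATK4, where ploidy is set through the --sample-ploidy option of HaplotypeCaller. The function name haploid_call_cmd, the sample name sample1, and the .vcf.gz output name are assumptions for illustration; the REFERENCE/ and OUTPUT/ paths follow the question. The function only prints the command, so it can be inspected before anything is run.

```shell
# Hypothetical helper: print a GATK4 HaplotypeCaller command for one haploid sample.
# Paths follow the layout used in the question; the output file name is an assumption.
haploid_call_cmd() {
    sample="$1"
    printf 'gatk HaplotypeCaller -R %s -I %s -O %s --sample-ploidy 1\n' \
        "REFERENCE/S288C_genome.fsa" \
        "OUTPUT/${sample}_sorted.bam" \
        "OUTPUT/${sample}_var.vcf.gz"
}

# Dry run: show the command for a sample named sample1 (an assumed name).
haploid_call_cmd sample1
```

Note that GATK also expects an indexed reference with a sequence dictionary and a BAM file with read groups, so some preparation may be needed before the printed command will run.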

written 2.5 years ago by WouterDeCoster (38k)