Question: Alignment and variant calling: explanation of SNP vs. SNV
2.6 years ago, ernestrv0101 wrote:

As a beginner, I have a basic question about a BAM alignment. After mapping my FASTQ reads (from a single individual) to a reference with bwa, I can see variations, which I guess include sequencing errors, misalignments, library preparation errors and real SNPs. In a haploid organism, I suppose there is only one possible correct base at each position, so there is only one correct consensus. After doing variant calling with samtools:

samtools mpileup -uf params bam | bcftools call -mv -Oz -o vcf

and with lofreq:

lofreq call -f ref -o outvcf mybam

I obtain two VCF files, one with SNPs and one with SNVs (the latter bigger than the SNP file, as expected). How do these programs mark a variation as a SNP or an SNV if I am working with only one sample? The definition of SNP, from Wikipedia, is:

A variation in a single nucleotide that occurs at a specific position in the genome, where each variation is present to some appreciable degree within a population (e.g. > 1%)


modified 2.6 years ago by Biostar • written 2.6 years ago by ernestrv0101
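
To make the quoted 1% threshold concrete, here is a minimal Python sketch of the population-frequency definition. The function name, the labels it returns, and the `min_alleles` cutoff of 200 are hypothetical choices made for illustration; they do not come from samtools, bcftools or lofreq. The point it tries to show is that a single sample gives no usable estimate of population frequency.

```python
def classify_by_frequency(alt_count, total_alleles, min_freq=0.01, min_alleles=200):
    """Label a variant using the population-frequency definition of a SNP.

    alt_count     -- number of sampled alleles carrying the alternate base
    total_alleles -- number of alleles sampled in the population
    min_freq      -- frequency threshold (the "> 1%" from the quoted definition)
    min_alleles   -- assumed minimum sample size for a meaningful estimate
                     (arbitrary value, for illustration only)
    """
    if total_alleles < min_alleles:
        # Too few alleles sampled: the population frequency cannot be estimated.
        return "SNV (population frequency unknown)"
    frequency = alt_count / total_alleles
    return "SNP" if frequency > min_freq else "rare SNV"


# One haploid individual contributes a single allele per position.
single_sample = classify_by_frequency(1, 1)
# 30 alternate alleles among 2000 sampled alleles is a frequency of 1.5%.
common = classify_by_frequency(30, 2000)
# 1 alternate allele among 2000 sampled alleles is a frequency of 0.05%.
rare = classify_by_frequency(1, 2000)
```

Under these assumptions, a variant seen in one individual can only be described as an SNV; calling it a SNP would need frequency data from many individuals.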

Why do you think you have one file with "SNP" and one with "SNV"?

"SNP" stands for "Single Nucleotide Polymorphism", and "SNV" for "Single Nucleotide Variation".

The term SNP is more often used in talks. The problem is that "polymorphism" implies that the change in sequence is fairly common and has little or no impact on gene function. But people started to use this term for almost every change in sequence, even for those that do have an effect.

So, to avoid confusion about whether there is an impact or not, SNV is a much better term.

fin swimmer

written 2.6 years ago by finswimmer
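
As a rough illustration of the point above, a VCF record by itself only says that a sample differs from the reference at a position; it carries no label of "SNP" versus "SNV". The Python sketch below checks whether a record is a single-nucleotide change by looking at the REF and ALT columns (columns 4 and 5 of a VCF data line). The function name and the example records are made up for illustration.

```python
def is_single_nucleotide_variant(vcf_line):
    """Return True if a VCF data line describes a single-base substitution."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref = fields[3]              # REF column
    alts = fields[4].split(",")  # ALT column, possibly several alleles
    bases = set("ACGTN")
    # Both REF and every ALT allele must be exactly one nucleotide long;
    # symbolic alleles such as "." or "<*>" are rejected by the base check.
    return (
        len(ref) == 1
        and ref.upper() in bases
        and all(len(alt) == 1 and alt.upper() in bases for alt in alts)
    )


# Hypothetical records: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
substitution = "chr1\t100\t.\tA\tG\t50\tPASS\t."
deletion = "chr1\t200\t.\tAT\tA\t50\tPASS\t."

substitution_is_snv = is_single_nucleotide_variant(substitution)
deletion_is_snv = is_single_nucleotide_variant(deletion)
```

Whether such a record is also a "SNP" in the population sense cannot be read from these columns alone.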

Which organism are you working with? According to this page, it seems they use dbSNP to call SNPs for human by default:

If you are dealing with human samples (or large genomes in general) we recommend the use of -s (source quality) in combination with -S dbsnp.vcf.gz

modified 2.6 years ago • written 2.6 years ago by Titus

@finswimmer OK, so leaving aside the definition of SNP or SNV, they just extract the variations against a reference (identifying and discarding the possible errors, misalignments, etc.). I was confused by the definition of SNP. Thanks!

It is an insect but thanks anyway @Titus!

modified 2.6 years ago • written 2.6 years ago by ernestrv0101
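
The idea of "extracting the variations against a reference" can be sketched with a toy haploid caller in Python. This is not the algorithm used by bcftools or lofreq, which rely on statistical models and base/mapping qualities; it is a simple majority-vote sketch. The function name, the `min_depth` and `min_fraction` thresholds, and the gapless read format are all assumptions made for illustration.

```python
from collections import Counter


def call_variants(reference, reads, min_depth=3, min_fraction=0.8):
    """Toy haploid caller: report positions where most reads disagree with the reference.

    reference -- reference sequence as a string
    reads     -- list of (start, sequence) tuples, 0-based start, gapless alignments
    Returns a list of (1-based position, reference base, called base).
    """
    # Build a per-position count of observed bases (a very simple pileup).
    pileup = [Counter() for _ in reference]
    for start, sequence in reads:
        for offset, base in enumerate(sequence):
            position = start + offset
            if 0 <= position < len(reference):
                pileup[position][base] += 1

    calls = []
    for position, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too little coverage to decide
        base, count = counts.most_common(1)[0]
        # Isolated mismatches (likely errors) stay below min_fraction and are ignored.
        if base != reference[position] and count / depth >= min_fraction:
            calls.append((position + 1, reference[position], base))
    return calls


reference = "ACGTACGT"
reads = [
    (0, "ACGAACGT"),
    (0, "ACGAACGT"),
    (0, "ACGAACGT"),
    (0, "ACGAAGGT"),  # carries one extra mismatch at position 6, seen in this read only
]
calls = call_variants(reference, reads)
```

In this toy data the T to A change at position 4 is supported by every read and is reported, while the mismatch seen in a single read at position 6 is treated as noise.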

Cockroach? - Periplaneta spp.?

Just be wary of dbSNP - it is a grand mix of 'common' and 'rare' variants, many of which have clinical relevance and are even listed in ClinVar as pathogenic alleles.

modified 2.6 years ago • written 2.6 years ago by Kevin Blighe