Question: Alignment & Variant Calling explanation SNP and SNV.
21 months ago, ernestrv0101 wrote:

As a beginner, I have a basic question about a BAM alignment. After mapping my FASTQ reads (from a single individual) to a reference with bwa, I can see variations, which I guess include sequencing errors, misalignments, library preparation errors and real SNPs. In a haploid organism, I suppose there is only one possible correct base at each position, so there is only one correct consensus. After doing variant calling with samtools:

samtools mpileup -uf params bam | bcftools call -mv -Oz -o vcf

and with lofreq:

lofreq call -f ref -o outvcf mybam

I obtain two VCF files, one with SNPs and one with SNVs (the SNV file being bigger than the SNP file, as expected). How do these programs mark a variation as a SNP or an SNV if I am working with only one sample? The definition of SNP, from Wikipedia, is:

A variation in a single nucleotide that occurs at a specific position in the genome, where each variation is present to some appreciable degree within a population (e.g. > 1%)
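
To make the frequency criterion in that definition concrete, here is a minimal Python sketch. The function names, parameters and counts are hypothetical and are not part of samtools, bcftools or lofreq. It only illustrates that the 1% threshold is a property of a population sample, so it cannot be evaluated from reads of a single individual.

```python
def alt_allele_frequency(alt_count, total_alleles):
    """Fraction of sampled alleles in a population that carry the alternate base."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return alt_count / total_alleles


def is_common_polymorphism(alt_count, total_alleles, threshold=0.01):
    """Apply the '> 1% of the population' criterion from the definition above.

    Hypothetical helper: the threshold is a convention, not a fixed rule.
    """
    return alt_allele_frequency(alt_count, total_alleles) > threshold


# Made-up counts: 30 alternate alleles among 2000 sampled alleles (1.5%).
print(is_common_polymorphism(30, 2000))  # True
# 5 alternate alleles among 2000 (0.25%) is below the threshold.
print(is_common_polymorphism(5, 2000))   # False
```

With one haploid sample there is a single allele per position, so the frequency could only ever be 0 or 1, which says nothing about the population.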


modified 20 months ago by Biostar • written 21 months ago by ernestrv0101

Why do you think you have one file with "SNP" and one with "SNV"?

"SNP" stands for "Single Nucleotide Polymorphism", and "SNV" for "Single Nucleotide Variation".

The term SNP is used more often in talks. The problem is that "polymorphism" implies that the change in sequence is fairly common and has little or no impact on gene function. But people started to use this term for almost every change in sequence, even those that do have an impact.

So, to avoid confusion about whether there is an impact or not, SNV is a much better word.
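
As a rough illustration of why a caller run on one sample can only report variants relative to the reference, the Python sketch below classifies VCF data lines purely from their REF and ALT columns. The function name `classify_record` and the example lines are made up for illustration; only the column order (CHROM, POS, ID, REF, ALT, ...) comes from the VCF format. Nothing in these columns says how frequent the allele is in a population or whether it affects gene function.

```python
def classify_record(vcf_line):
    """Label a VCF data line as 'SNV', 'indel' or 'other' from REF/ALT alone.

    Hypothetical helper for illustration; not part of bcftools or lofreq.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4].split(",")
    bases = set("ACGTN")
    # Only plain base strings are classified; symbolic alleles such as
    # <DEL>, '*' or '.' fall through to 'other'.
    if not all(a and set(a.upper()) <= bases for a in alts + [ref]):
        return "other"
    if len(ref) == 1 and all(len(a) == 1 for a in alts):
        return "SNV"
    if any(len(a) != len(ref) for a in alts):
        return "indel"
    return "other"  # e.g. multi-nucleotide substitutions


# Made-up example records (tab-separated, first eight VCF columns).
records = [
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tAT\tA\t60\tPASS\t.",
    "chr1\t300\t.\tC\tT,G\t40\tPASS\t.",
]
for rec in records:
    print(rec.split("\t")[1], classify_record(rec))
```

Deciding whether any of these SNVs is also a "polymorphism" would need extra information, such as allele frequencies from many individuals.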

fin swimmer

written 21 months ago by finswimmer

Which organism are you working with? If you look at this page, it seems they use dbSNP to call SNPs in human by default:

If you are dealing with human samples (or large genomes in general) we recommend the use of -s (source quality) in combination with -S dbsnp.vcf.gz

modified 21 months ago • written 21 months ago by Titus

@finswimmer OK, so leaving aside the definition of SNP or SNV, they just extract the variations against a reference (identifying and discarding the possible errors, misalignments, etc.). I was confused by the definition of SNP. Thanks!
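
The idea of "variations against a reference, with likely errors discarded" can be sketched in a few lines of Python for a haploid genome. This is a deliberately naive majority-vote rule, not the statistical models that bcftools or lofreq actually use. The function name `call_haploid_site`, the `min_depth` and `min_fraction` parameters and their default values are assumptions chosen for illustration.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def call_haploid_site(ref_base, read_bases, min_depth=10, min_fraction=0.8):
    """Return the alternate base if the reads support a variant, else None.

    Naive sketch: real callers also weigh base and mapping qualities.
    """
    counts = Counter(b.upper() for b in read_bases if b.upper() in VALID_BASES)
    depth = sum(counts.values())
    if depth < min_depth:
        return None  # too little evidence to call anything
    base, support = counts.most_common(1)[0]
    if base != ref_base.upper() and support / depth >= min_fraction:
        return base
    return None


# Made-up pileup columns over a reference 'A':
# eleven reads show G and one shows A, so G is reported as the variant.
print(call_haploid_site("A", "G" * 11 + "A"))  # G
# Eleven reads agree with the reference and one stray G is treated as an error.
print(call_haploid_site("A", "A" * 11 + "G"))  # None
```

Note that the rule compares the reads only with the reference base, which is why the output is a list of differences from the reference regardless of how common each difference is in the species.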

It is an insect but thanks anyway @Titus!

modified 21 months ago • written 21 months ago by ernestrv0101

Cockroach? - Periplaneta spp.?

Just be wary of dbSNP - it is a grand mix of 'common' and 'rare' variants, many of which have clinical relevance and are even listed in ClinVar as pathogenic alleles.

modified 20 months ago • written 20 months ago by Kevin Blighe