Question: Alignment & Variant Calling explanation SNP and SNV.
7 months ago, ernestrv0101 (USA) wrote:

As a beginner, I have a basic question about a BAM alignment. After mapping my fastq reads (from a single individual) to a reference with bwa, I can see the variations, which I guess include sequencing errors, misalignments, errors in library preparation and real SNPs. In a haploid organism, I suppose there is only one possible correct result for each position, so there is only one correct consensus. After doing variant calling with samtools:

samtools mpileup -uf params bam | bcftools call -mv -Oz -o vcf

and with lofreq:

lofreq call -f ref -o outvcf mybam

I obtain two VCF files, one with SNPs and one with SNVs (the latter bigger than the SNP file, as expected). How do these programs mark a variation as a SNP or a SNV if I am working with only one sample? The definition of SNP, from Wikipedia, is:

A variation in a single nucleotide that occurs at a specific position in the genome, where each variation is present to some appreciable degree within a population (e.g. > 1%)

Thanks!

modified 6 months ago by Biostar • written 7 months ago by ernestrv0101

Why do you think you have one file with "SNP" and one with "SNV"?

"SNP" stands for "Single Nucleotide Polymorphism", and "SNV" for "Single Nucleotide Variation".

The term SNP is used more often in talks. The problem is that "polymorphism" implies that the change in sequence is quite common and has little or no impact on gene function. But people started to use this term for almost every change in sequence, even for those that do have an effect.

So, to avoid confusion about whether there is an impact or not, SNV is a much better term.
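As a rough illustration of the distinction (a sketch only, not how bcftools or lofreq label anything): whether a record is a single-nucleotide change can be read from its REF and ALT alleles alone, while calling it a "polymorphism" in the quoted sense needs a population frequency that one sample cannot provide. The function names and the `population_freq` argument are hypothetical; the 1% cutoff is the example figure from the Wikipedia definition quoted in the question.

```python
from typing import Optional

VALID_BASES = {"A", "C", "G", "T"}


def is_snv(ref: str, alt: str) -> bool:
    """True when REF and ALT are single, different nucleotides."""
    ref, alt = ref.upper(), alt.upper()
    return ref in VALID_BASES and alt in VALID_BASES and ref != alt


def label_variant(ref: str, alt: str,
                  population_freq: Optional[float] = None,
                  threshold: float = 0.01) -> str:
    """Label a variant using the frequency-based definition of SNP.

    population_freq is a hypothetical, externally supplied allele
    frequency; with one sample it is unknown, so the label stays "SNV".
    """
    if not is_snv(ref, alt):
        return "not a single-nucleotide change"
    if population_freq is None:
        return "SNV"
    return "SNP" if population_freq > threshold else "SNV"


if __name__ == "__main__":
    print(label_variant("A", "G"))        # no frequency known
    print(label_variant("A", "G", 0.20))  # common in a population
    print(label_variant("A", "AG"))       # insertion, not a single base
```

With a single individual there is no frequency to pass in, so every single-base difference from the reference is simply an SNV under this scheme.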

fin swimmer

written 7 months ago by finswimmer

Which organism are you working with? If you look at this page, http://csb5.github.io/lofreq/commands/#call, it seems they use dbSNP when calling SNPs for human by default:

If you are dealing with human samples (or large genomes in general) we recommend the use of -s (source quality) in combination with -S dbsnp.vcf.gz

modified 7 months ago • written 7 months ago by Titus

@finswimmer OK, so leaving aside the definition of SNP or SNV, they just extract the variations against a reference (identifying and discarding the possible errors, misalignments, etc.). I was confused by the definition of SNP. Thanks!
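The idea of "extracting variations against a reference while discarding likely errors" can be sketched for one position of a haploid sample as below. This is a deliberately naive toy: the function name and both thresholds are made up for illustration, and real callers such as bcftools and lofreq rely on statistical models that take base and mapping qualities into account rather than a fixed cutoff.

```python
from collections import Counter
from typing import Optional

VALID_BASES = set("ACGT")


def naive_call(ref_base: str, observed_bases: str,
               min_depth: int = 10,
               min_alt_fraction: float = 0.8) -> Optional[str]:
    """Return the non-reference base if it dominates the pileup, else None.

    Assumes a haploid sample, so a real variant should be supported by
    nearly all reads; low-frequency mismatches are treated as errors.
    Both thresholds are hypothetical.
    """
    ref_base = ref_base.upper()
    # Keep only unambiguous bases (drops N and other symbols).
    bases = [b.upper() for b in observed_bases if b.upper() in VALID_BASES]
    depth = len(bases)
    if depth < min_depth:
        return None  # too little evidence to say anything
    alt_counts = Counter(b for b in bases if b != ref_base)
    if not alt_counts:
        return None  # every read agrees with the reference
    alt, alt_count = alt_counts.most_common(1)[0]
    return alt if alt_count / depth >= min_alt_fraction else None


if __name__ == "__main__":
    print(naive_call("A", "G" * 19 + "A"))  # mostly G: reported
    print(naive_call("A", "A" * 19 + "G"))  # one stray G: ignored
```

Nothing here depends on a population: the only inputs are the reference base and the reads from the one sample.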

It is an insect but thanks anyway @Titus!

modified 6 months ago • written 6 months ago by ernestrv0101

Cockroach? - Periplaneta spp.?

Just be wary of dbSNP - it is a grand mix of 'common' and 'rare' variants, many of which have clinical relevance and are even listed in ClinVar as pathogenic alleles.

modified 6 months ago • written 6 months ago by Kevin Blighe