Snp Call Within A Given Region And Vcf Result
2
1
Entering edit mode
12.8 years ago
Zhshqzyc ▴ 520

Hello, I used the command to call snp. I hope to find something.

samtools mpileup -C50 -r chr2:1000000-2000000 -uf ref.fa s1.bam s2.bam s3.bam s4.bam s5.bam | bcftools view -bvcg - > var.raw.bcf   
bcftools view var.raw.bcf | vcfutils.pl varFilter -D100 > var.flt.vcf

And I found the result in vcf file is

##fileformat=VCFv4.1
##samtoolsVersion=0.1.16 (r963:234)
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
##INFO=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square mapping quality of covering reads">
##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred probability of all samples being the same">
##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood estimate of the site allele frequency of the first ALT allele">
##INFO=<ID=G3,Number=3,Type=Float,Description="ML estimate of genotype frequencies">
##INFO=<ID=HWE,Number=1,Type=Float,Description="Chi^2 based HWE test P-value based on G3">
##INFO=<ID=CI95,Number=2,Type=Float,Description="Equal-tail Bayesian credible interval of the site allele frequency at the 95% level">
##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for strand bias, baseQ bias, mapQ bias and tail distance bias">
##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
##INFO=<ID=PC2,Number=2,Type=Integer,Description="Phred probability of the nonRef allele frequency in group1 samples being larger (,smaller) than in group2.">
##INFO=<ID=PCHI2,Number=1,Type=Float,Description="Posterior weighted chi^2 P-value for testing the association between group1 and group2 samples.">
##INFO=<ID=QCHI2,Number=1,Type=Integer,Description="Phred scaled PCHI2.">
##INFO=<ID=PR,Number=1,Type=Integer,Description="# permutations yielding a smaller PCHI2.">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods for RR,RA,AA genotypes (R=ref,A=alt)">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="# high-quality bases">
##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled strand bias P-value">
##FORMAT=<ID=PL,Number=-1,Type=Integer,Description="List of Phred-scaled genotype likelihoods, number of values is (#ALT+1)*(#ALT+2)/2">
#CHROM    POS    ID    REF    ALT    QUAL    FILTER    INFO    FORMAT    s1    s2    s3    s4    s5

Is it correct? Nothing found? Or I used a wrong command? Thanks for advice.

samtools vcf snp sequence • 3.6k views
ADD COMMENT
3
Entering edit mode
12.8 years ago
Swbarnes2 ★ 1.6k

In general, yes, if that's what you see, then there were no SNPs. But if you think that's wrong, you can try a few things to loosen the SNP calling

First, I'd try not filtering at all, so take out the pipe to vcfutils step.

Also try making an mpileup with -Buf. I've seen the BAQ calculations destroy the quality scores of sanger-verified SNPs, such that they could not be called with bcftools. Disengaging the BAQ calculations with -B will make the quality scores match that of the sam file, which might bring back your SNPs.

ADD COMMENT
2
Entering edit mode
12.8 years ago

remove the minimum read depth (=100) for a try

bcftools view var.raw.bcf  > var.flt.vcf
ADD COMMENT

Login before adding your answer.

Traffic: 2553 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6