Question: samtools mpileup not calling small (6 base) INSERT
gravatar for ALchEmiXt
10 months ago by
The Netherlands
ALchEmiXt1.8k wrote:

Hi All,

I have probably a simple problem but which I cannot solve as of yet and puzzles me quite a bit. Using samtools mpileup and bcftools I am unable to call a small 6 base INSERT compared to my reference. Single SNPs are called ok. I have looked at many unanswered questions and potential answers here at BioStar and this question/answer gets closest but none actually has a working answer!

My play aorund data:

  • Illumina PE 250 dataset of bacterial genome (2Mb at coverage 5k+)
  • Assembled and reconstructed a genome from it
  • If I check for SNPs by mapping the raw reads to the fresh assembly none are found (underlining the correct assembly)


  • I created an artificial 2 position SNP and somewhere else I deleted 6 bases in the assembly
  • I mapped the original reads which should detect the SNPs and the INSERT. Hoewever I only see the SNPs in the vcf and not the INSERT

My mapping is done with bowtie2 --local-sensitive settings to the ArtificialREFERENCE-6del sequence

Sam file is converted to bam and sorted using samtools

Then I call variants:

samtools mpileup -uf ArtificialREFERENCE-6del.fasta SORTED_BAMfile.bam -o SAMPLE_file.bcf

bcftools call -vm -o SAMPLE_file.vcf SAMPLE_file.bcf

I tried different bcftools settings like -c and -p 0.5 ... no luck...

Any help would be appreciated! It is probably something very obvious but I fail to see it from the documentation.

NB: I tried samtools/bcftools version 1.3.1 but also an older 1.1

NB2: I tried the additional options and FLAGS as proposed in the above linked post without luck.

NB3: The below is an IGV snippet of the involved BAM file clearly showing there IS a 6 base insert. IGV snippet BAM of 6base INSERT

ADD COMMENTlink modified 7 months ago by sb10 • written 10 months ago by ALchEmiXt1.8k

No one sees this behaviour?

I can confirm that it not only occurs with my data but also with at least another case when mapping/consensus calling viral sequences!

ADD REPLYlink written 10 months ago by ALchEmiXt1.8k
gravatar for sb1
7 months ago by
UK, Manchester, CMFT
sb10 wrote:


Did you find out why mpileup was unable to see the event? I seem to be having the same problem but with human data.

ADD COMMENTlink written 7 months ago by sb10

If you know a specific example in your data, please try to pileup the surrounding portion with samtools mpileup --output-tags AD,DP -uf $REFERENCE -r $REGION_OF_INTEREST $BAMFILE | bcftools call -m with $REGION_OF_INTEREST begin something like chr1: 10000-10050 if your expected indel is at chr1:10025, i.e., 25 bases up- and downstream of the expected position.

This command skips the -v, --variants-only to have all alternative alleles to be listed. If the insertion is listed, please check the genotype. My guess is, that the genotype is "0/0", which - from my understanding - suggests that samtools statistical methods infer the sample to be wild-type, i.e., not contain the insertion. If this is the case you will have to dig deep into the statistical methods of variant calling or ask the authors of samtools ;-)

If the insertion does not appear even you left away the -v, then there is probably some problem with the data or some bug.

ADD REPLYlink modified 7 months ago • written 7 months ago by Manuel Landesfeind1.0k

will check that and post back later.

ADD REPLYlink written 7 months ago by ALchEmiXt1.8k

no not yet. I had to give it a rest for a while, but it still puzzles me.

ADD REPLYlink written 7 months ago by ALchEmiXt1.8k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 918 users visited in the last hour