Question: samtools mpileup not calling small (6 base) INSERT
gravatar for ALchEmiXt
3.8 years ago by
The Netherlands
ALchEmiXt1.9k wrote:

Hi All,

I have probably a simple problem but which I cannot solve as of yet and puzzles me quite a bit. Using samtools mpileup and bcftools I am unable to call a small 6 base INSERT compared to my reference. Single SNPs are called ok. I have looked at many unanswered questions and potential answers here at BioStar and this question/answer gets closest but none actually has a working answer!

My play aorund data:

  • Illumina PE 250 dataset of bacterial genome (2Mb at coverage 5k+)
  • Assembled and reconstructed a genome from it
  • If I check for SNPs by mapping the raw reads to the fresh assembly none are found (underlining the correct assembly)


  • I created an artificial 2 position SNP and somewhere else I deleted 6 bases in the assembly
  • I mapped the original reads which should detect the SNPs and the INSERT. Hoewever I only see the SNPs in the vcf and not the INSERT

My mapping is done with bowtie2 --local-sensitive settings to the ArtificialREFERENCE-6del sequence

Sam file is converted to bam and sorted using samtools

Then I call variants:

samtools mpileup -uf ArtificialREFERENCE-6del.fasta SORTED_BAMfile.bam -o SAMPLE_file.bcf

bcftools call -vm -o SAMPLE_file.vcf SAMPLE_file.bcf

I tried different bcftools settings like -c and -p 0.5 ... no luck...

Any help would be appreciated! It is probably something very obvious but I fail to see it from the documentation.

NB: I tried samtools/bcftools version 1.3.1 but also an older 1.1

NB2: I tried the additional options and FLAGS as proposed in the above linked post without luck.

NB3: The below is an IGV snippet of the involved BAM file clearly showing there IS a 6 base insert. IGV snippet BAM of 6base INSERT

ADD COMMENTlink modified 3.6 years ago by sb10 • written 3.8 years ago by ALchEmiXt1.9k

No one sees this behaviour?

I can confirm that it not only occurs with my data but also with at least another case when mapping/consensus calling viral sequences!

ADD REPLYlink written 3.8 years ago by ALchEmiXt1.9k
gravatar for sb1
3.6 years ago by
UK, Manchester, CMFT
sb10 wrote:


Did you find out why mpileup was unable to see the event? I seem to be having the same problem but with human data.

ADD COMMENTlink written 3.6 years ago by sb10

If you know a specific example in your data, please try to pileup the surrounding portion with samtools mpileup --output-tags AD,DP -uf $REFERENCE -r $REGION_OF_INTEREST $BAMFILE | bcftools call -m with $REGION_OF_INTEREST begin something like chr1: 10000-10050 if your expected indel is at chr1:10025, i.e., 25 bases up- and downstream of the expected position.

This command skips the -v, --variants-only to have all alternative alleles to be listed. If the insertion is listed, please check the genotype. My guess is, that the genotype is "0/0", which - from my understanding - suggests that samtools statistical methods infer the sample to be wild-type, i.e., not contain the insertion. If this is the case you will have to dig deep into the statistical methods of variant calling or ask the authors of samtools ;-)

If the insertion does not appear even you left away the -v, then there is probably some problem with the data or some bug.

ADD REPLYlink modified 3.6 years ago • written 3.6 years ago by Manuel Landesfeind1.2k

will check that and post back later.

ADD REPLYlink written 3.5 years ago by ALchEmiXt1.9k

no not yet. I had to give it a rest for a while, but it still puzzles me.

ADD REPLYlink written 3.5 years ago by ALchEmiXt1.9k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1671 users visited in the last hour